Why make videos accessible?
Making videos accessible usually means adding a text equivalent of the spoken words through captions, transcripts, or descriptions of the audio. Captions are primarily intended for individuals who cannot hear the audio, but they also benefit non-native speakers, users who have the audio muted, and viewers watching a video with poor-quality audio.
For educational purposes, if a lecture is recorded and put on the web, having captions or transcripts allows everyone to read what is being said or discussed in class. This can improve comprehension, material review, and information processing for students with different learning styles.
- By January 1, 2014, new internet websites and web content on those sites must conform with WCAG 2.0 Level A.
- As of January 1, 2021, all internet websites and web content must conform with WCAG 2.0 Level AA, other than success criterion 1.2.4 Captions (Live) and success criterion 1.2.5 Audio Descriptions (Pre-recorded).
WCAG 2.0 Success Criterion 1.2.1 - "An alternative for time-based media is provided that presents equivalent information for prerecorded audio-only content."
WCAG 2.0 Success Criterion 1.2.2 - "Captions are provided for all prerecorded audio content in synchronized media, except when the media is a media alternative for text and is clearly labelled as such."
Please refer to "Video Captioning and Audio Transcripts" of the Website Accessibility section for further information.
Closed vs. Open Captions
Most people are familiar with closed captioning. Closed captions provide a text equivalent of the audio and can be turned on or off. Most TVs include the technology to turn closed captions on or off for programming viewed by persons who are deaf or hard of hearing. Closed captioning is also useful in noisy environments like restaurants and bars.
Open captions look the same and display the same information as closed captions, except that they cannot be turned off. They are a permanent part of the video and are always displayed, much like subtitles in a foreign-language film.
Transcripts allow anyone who cannot access content from web audio or video to read a text version instead. Transcripts do not have to be verbatim accounts of the spoken word in a video, and they can contain additional descriptions, explanations, or comments that may be beneficial. Transcripts allow deaf-blind users to get content through refreshable Braille and other devices. For most web video, both captions and a text transcript should be provided.
You can make your own transcripts using speech recognition software such as Dragon NaturallySpeaking or the speech-to-text feature built into Windows or Mac OS X Lion. The results should be reviewed for errors, especially when the audio is low quality or the speaker uses unusual words.
Audio descriptions are intended for users with visual disabilities. They provide additional information about what is visible on the screen (e.g. non-verbal actions in a program). They are extremely useful on the web when the visual content of a video conveys important information that is not available through the audio alone.
Timed Text or DFXP
DFXP stands for "Distribution Format Exchange Profile" and is a World Wide Web Consortium (W3C) draft standard. It is an XML markup language specifically designed for marking up timed text, or captions. It is the most common format used by Flash video players that support closed captions.
SMIL (Synchronized Multimedia Integration Language)
A standards-based language used by QuickTime and RealPlayer to control the layout and presentation of visual and audible items. SMIL is used to control the display, positioning, and timing of captions and audio/video multimedia. The captions themselves are stored in a Text Track file if you’re using QuickTime or a RealText file if you’re using RealPlayer.
SAMI (Synchronized Accessible Media Interchange)
This is Microsoft’s technique for adding captions for Windows Media Player. A SAMI file contains the text to be displayed within the captions and information that synchronizes individual caption displays to the multimedia presentation.
SubRip (.srt) and SubViewer (.sub)
These are very simple, but slightly different, text formats. Both are officially supported by YouTube, as described on the YouTube help page Getting Started: Adding/Editing Captions. Although it is not specifically documented, YouTube also supports captions uploaded in DFXP and SAMI formats.
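To give a sense of how simple the SubRip format is, here is a minimal Python sketch that builds .srt text from a list of captions. It assumes the commonly used SubRip layout (a counter, a line with start and end times written as HH:MM:SS,mmm, the caption text, and a blank line). The function names and the sample captions are hypothetical and serve only as illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(captions):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(captions, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # A blank line separates one caption block from the next.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical sample captions.
    print(build_srt([(0, 2.5, "Hello."), (2.5, 5, "Welcome.")]))
```

The output can be saved with an .srt extension and opened in any text editor for review before uploading.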
Accessible Web-based Videos
Captioning a video on the web involves embedding a file that contains time-synchronized text along with the video. On the web, the primary multimedia technologies are Microsoft's Windows Media Player, Apple's QuickTime, RealNetworks' RealPlayer, and Macromedia Flash. Each media player handles captions differently.
To achieve WCAG 2.0 Level A compliance, you need to include either a descriptive text transcript or an audio description (WCAG 2.0 1.2.3). Take the completed transcript of your captions and add descriptive text that relates what else is going on in the video (e.g. actions, body language, scene changes, etc.). Then add a “Transcript” link directly below the video on your web page, and have it link to a separate HTML page containing the transcript.
To provide transcripts and captions, you will have to do one of the following:
- Edit the automated captions and transcripts (in the case of videos uploaded to YouTube);
- Create your own;
- Or outsource the work to a captioning service.
- Accessibility Hub - Video Captions and Audio Transcripts
- WGBH Captioning FAQ
- AMI - Described Video (Audio Description) Best Practices
Uploading a Video to YouTube
YouTube can create machine-generated captions automatically. However, the resulting captions may be inaccurate depending on several factors such as audio quality, background noise, or number of speakers.
YouTube also provides a transcript link enabling the viewer to read all the captions in one place. However, a transcript of just the captions is not sufficient, so you need to add descriptive text (e.g. actions, body language, scene changes, etc.).
Providing Captions and Transcripts for YouTube Videos
Editing YouTube’s Automated Captions
When you upload a video to YouTube, you may use automatic captioning to make captions available. However, the quality of the captions may vary from video to video.
Using YouTube’s caption editor, you can correct and clean up automatically generated captions or captions you have uploaded separately. YouTube’s caption editor allows you to:
- Correct misspellings and remove filler words such as “like” and “umm”.
- Correct the pacing by shifting words to the next or preceding caption so that complete phrases never bridge two captions (i.e. nouns and verbs stay connected to their modifiers, and prepositional phrases are not separated).
- Delete blank time segments. It’s best to select the timer on your full-sentence caption and increase the time so that it runs for the duration of the full sentence. The idea is to steal time from pauses to fit in the full text in situations where it’s difficult to get all the words in, but not to go as far as to replace pauses that are part of “the story”.
Adding Descriptive Transcripts to YouTube
- On the YouTube video page, go to the video’s transcript by selecting the “Transcript” icon and copy and paste it to a text file (e.g. Word);
- Remove/delete the timecodes. Alternatively, you may download the .srt file from YouTube’s caption editor, and remove the timecodes using the free software Aegisub for Windows, OS X, or UNIX;
- Search for any extra hard returns and replace all line endings with single spaces;
- Insert any descriptive text. This can be copy/pasted from scripts if available. E.g. “The woman quietly enters the room and turns on the light”;
- Insert the resulting text into a separate web page on your website. Create a “Transcript” link below the embedded video that links to that web page (e.g. Access Forward: Introduction Video for General Requirements Module). Alternatively, if you have room or a short video, you may copy the transcript text and put it below the video (e.g. Health Canada: Reducing Radon in Your Home).
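The timecode removal and line-joining steps above can also be scripted. The following is a minimal Python sketch, assuming the captions were downloaded as a standard .srt file (a counter line, a timing line containing "-->", then one or more text lines per caption). The function name and the sample text are hypothetical, and the descriptive text still has to be added by hand afterwards.

```python
import re

# Matches an SRT timing line such as "00:00:02,500 --> 00:00:05,000".
TIMECODE = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_plain_text(srt_text):
    """Drop counters and timecodes, then join caption lines with spaces."""
    kept = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Caption blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if lines and lines[0].isdigit():       # caption counter
            lines = lines[1:]
        if lines and TIMECODE.match(lines[0]):  # timing line
            lines = lines[1:]
        kept.extend(lines)
    return " ".join(kept)


if __name__ == "__main__":
    sample = (
        "1\n00:00:00,000 --> 00:00:02,500\nHello and\nwelcome.\n\n"
        "2\n00:00:02,500 --> 00:00:05,000\nToday we talk about radon.\n"
    )
    print(srt_to_plain_text(sample))
```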
Creating Your Own Transcripts, Audio Descriptions, and Captions
Making your own transcript
Using Voice Recognition
Frequently, people or departments develop scripts to be used in their videos. These scripts already provide the text for a caption or transcript. If you need to create a transcript from scratch, you may wish to investigate voice recognition software. One of the most widely used products is Dragon NaturallySpeaking, and the Windows and Mac OS X operating systems also have built-in voice recognition. Unfortunately, speech recognition is not 100% accurate, and some editing or training of the product may be required.
Speech recognition works best when the system has been trained on a single speaker's voice. It does not work well with two or more people having a conversation, as the software cannot distinguish between multiple voices. Voice recognition software may also have difficulty accurately transcribing specialized vocabulary such as medical or engineering terms. Another drawback is that these systems can be easily stymied when a speaker has an accent.
When these conditions exist, the resulting transcript may be poor and unusable. Correcting the file may even be more time-consuming than creating a transcription from scratch.
Using MAGpie 2.0
MAGpie 2.0 is a free captioning and audio description authoring tool for making multimedia accessible to persons with sensory disabilities. MAGpie is Mac/PC compatible. Other software would be similar to use.
Below is a link to a tutorial that walks you through the process of creating audio description for a video using MAGpie.
Creating your own Captions
Creating your own captions often requires captioning software. Depending on the software you use and the length of the video you are captioning, this may take time. Like most tasks, captioning becomes much easier after you have done it once or twice.
Common Best Practices
Common web accessibility guidelines indicate that captions should be synchronized (appearing at approximately the same time as the audio), equivalent to the spoken word, and accessible to those who need them.
To provide a closed caption file for use with a video player, you can make your own using software or outsource the work to a company for a fee.
- Camtasia (cost involved)
- Captionate (cost involved)
- CC for Flash (free)
- Hi-Caption (cost involved, PC)
- MAGpie (free PC and Mac)
- MovieCaptioner (cost involved, Mac and Windows)
Captioning with MAGpie 2.0
Please note: for the purposes of this tutorial we will show how to caption videos using MAGpie 2.0 which is free and Mac/PC compatible. Other software would be similar to use.
Once MAGpie is downloaded and installed, open the software and select “File” then “New Project…”, or press Control + N.
Now set the styles you want for your captions and speakers. The default styles work for most captions, although it is preferred that captions be centred (#2 on screenshot). At the bottom of the window, you may enter the width and height at which you want your video to display. Caption width should usually be set to the same width as your video. The caption height should be 80 pixels (#3 on screenshot), which should be the default. Select “OK” when you’re finished.
This opens the “Create New Project Track” window automatically. Here you can create captions or audio descriptions for your media. Since we’re creating closed captions select “Captions” then select “OK”.
MAGpie’s Main Interface
The main interface contains text areas for entering start and end times, the speaker, and the caption text. There are also controls for the playback of the media you’re captioning, which opens in a separate window automatically.
There are two modes available for navigating the main MAGpie captioning window - Editing Mode and Navigation Mode. Editing Mode allows you to enter caption information into the window. Navigation Mode allows you to move from caption to caption or between the different options for each caption. To enter Navigation Mode, press Shift + Enter. You can then use the arrow keys to navigate through the table cells. To enter Editing Mode, press Enter or simply begin typing in one of the table cells.
Media Playback Controls
The media clip can be controlled from the menu, but it is often simpler to use the following keyboard commands:
|Command|Key|
|Play/Pause|F6|
|Insert Start Time from Player|F9|
|Insert End Time from Player|F10|
Make sure your media file is at the beginning by looking at the timecode area to the right of the playback buttons. It should read 0:00:00.00. These numbers represent the time: the first digit represents hours, the next two minutes, the next two seconds, and the final two (after the decimal point) hundredths of a second.
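As an illustration of how this timecode layout maps to elapsed time, here is a small Python sketch that converts between a timecode string and seconds. It assumes the H:MM:SS.hh layout described above. The function names are hypothetical and are not part of MAGpie.

```python
def timecode_to_seconds(timecode):
    """Convert a timecode such as '0:01:05.25' to seconds."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def seconds_to_timecode(total_seconds):
    """Convert seconds to hours:minutes:seconds.hundredths."""
    hundredths = round(total_seconds * 100)
    hours, rem = divmod(hundredths, 360_000)
    minutes, rem = divmod(rem, 6_000)
    seconds, hundredths = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{hundredths:02d}"


if __name__ == "__main__":
    print(timecode_to_seconds("0:01:05.25"))
    print(seconds_to_timecode(3600))
```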
Press F6 to listen to the first sentence or portion of a sentence and then press F6 again to pause. Select the Caption area for Row 1 and type what was spoken. You typically want to limit each row to one or two short sentences. There is a limit as to how much text can be displayed for each caption. If a complete sentence does not fit, you can break it into two captions. Once you have entered the text for the first caption, press the Enter key twice in quick succession. This will insert a new row for a second caption.
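If you already have the text of a long sentence, breaking it into caption-sized pieces can be sketched in Python as follows. The `max_chars` limit is a hypothetical value chosen for illustration; the actual limit depends on your caption width and font, so adjust it to suit your project.

```python
import textwrap


def split_caption(text, max_chars=64):
    """Split text at word boundaries into pieces of at most max_chars."""
    # max_chars is an assumed limit, not a value taken from MAGpie.
    return textwrap.wrap(text, width=max_chars)


if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog"
    for piece in split_caption(sentence, max_chars=20):
        print(piece)
```

Splitting purely on length can separate words that belong together, so review the pieces and move words between captions where a phrase has been broken awkwardly.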
Continue until all the audio in the media is transcribed. This work can be tedious and time-consuming, depending on the length of the media you’re using. Do a quick double check to make sure all the captions are of an appropriate length. MAGpie has a built-in spell check under “Edit” then “Check Spelling”.
MAGpie provides a text importer that allows the user to transcribe text in a word-processing program instead of the MAGpie interface. Common word-processing programs include MS WordPad, MS Word, Corel WordPerfect, etc.
This is a handy option if, for example, a professor wants to upload a video of a lecture. The professor could use speech-to-text software (e.g. Dragon NaturallySpeaking, or the speech-to-text feature built into Windows or Mac OS X Lion) during the lecture to capture what they said in MS Word. By repeating questions from students and other dialogue from interactions, a transcript of the lecture may be produced quite easily.
- Edit the transcriptions for errors and spelling mistakes.
- Insert a carriage return between blocks of text or sentences.
- Make sure to insert one extra carriage return after the last line of text in your transcribed file.
- Save your information as a Text file (with or without line breaks).
- Place the cursor in the first cell of the Caption column in the MAGpie interface and then select “Captions” from the menu bar, then “Insert Captions from file…“.
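The file preparation steps above could be scripted along the following lines. This Python sketch makes an assumption that should be checked against MAGpie's own documentation: that each caption block is separated from the next by a blank line and that the file ends with one extra line break. The function name and the file name are hypothetical.

```python
def format_for_import(captions):
    """Join caption blocks with blank lines and end with an extra line break."""
    # Collapse stray whitespace inside each block and skip empty blocks.
    cleaned = [" ".join(block.split()) for block in captions if block.strip()]
    # Assumed layout: a blank line between blocks, one extra break at the end.
    return "\n\n".join(cleaned) + "\n\n"


if __name__ == "__main__":
    text = format_for_import(["Hello  there.", "", "Welcome to the lecture."])
    # "captions_import.txt" is a hypothetical file name.
    with open("captions_import.txt", "w", encoding="utf-8") as handle:
        handle.write(text)
```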
Formatting and Editing Captions
Once you have imported the transcribed audio portion of the multimedia clip into the MAGpie interface, it may be necessary to format or edit the appearance of the captions in the media player. Select a cell that contains a caption. The cell will be highlighted by a green border. Under the Style command on the menu bar, the text color, font alignment, and font appearance can be changed. It is generally recommended to keep the appearance of the captions consistent for the duration of the multimedia clip.
To synchronize the video to the captioned dialogue:
- Position the multimedia clip you wish to caption at the beginning of the presentation. The timecode should read 0:00:00.00.
- Press F6 key to start the multimedia clip.
- When you hear the initial words of the first caption, press the F9 key. This will “capture” the timecode of the multimedia clip and insert it under the Start Time column heading.
- Continue listening to the media and pressing F9 for each caption.
Please Note: It is very important to save your work often by selecting “File” then “Save As” from the menu.
Exporting Caption Files
Now you’re ready to export your caption information in a format that can be used in QuickTime, RealPlayer, Windows Media Player, or Flash. From the menu, select “Export” then the format of the media you are captioning. Review Captioning file types.
Even though your media may be captioned, you should still provide a transcript. Select “Export” then “Plain text”.
If you are embedding your video into a website, you may do so using an inline frame, or iframe (e.g. YouTube uses iframes). In general, iframes are accessible to screen readers, but giving each iframe a TITLE attribute is recommended.
Example HTML code:
<iframe allowfullscreen="" frameborder="0" height="225" src="[VIDEO URL]" title="Melissa Vassallo on disability - Queen's Accessibility Hub" width="400"></iframe>
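To check a page for embedded videos that lack a title, something like the following Python sketch could be used. It relies only on the standard library's `html.parser` module. The class name, function name, and example URLs are hypothetical, and this is a rough check rather than a full accessibility audit.

```python
from html.parser import HTMLParser


class IframeTitleChecker(HTMLParser):
    """Collect the src of every iframe that has no usable title."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        attributes = dict(attrs)
        title = attributes.get("title") or ""
        if not title.strip():
            self.missing.append(attributes.get("src") or "(no src)")


def iframes_missing_title(html):
    """Return the src values of iframes without a non-empty title."""
    checker = IframeTitleChecker()
    checker.feed(html)
    checker.close()
    return checker.missing


if __name__ == "__main__":
    # Hypothetical page fragment.
    page = (
        '<iframe src="https://example.com/a" title="Intro video"></iframe>'
        '<iframe src="https://example.com/b"></iframe>'
    )
    print(iframes_missing_title(page))
```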
Out-Source Captioning Services (cost involved)
You may decide to leave it up to the professionals to caption videos for you. The cost usually depends on the length of the video and the requested turnaround time. Companies usually charge per minute of video and charge higher fees for quicker turnaround times.