Closed Captions vs Subtitles: Key Differences

These terms are often used interchangeably, but they are not the same thing. The distinction matters for accessibility compliance, international distribution, and how you create each type.

By TranscribeVideo.ai Editorial Team

The core distinction

Closed captions are a text representation of all audio content in a video: spoken dialogue, but also non-speech audio such as [music playing], [door slams], and [applause], plus speaker identification such as [narrator] or JOHN:. They are designed for viewers who cannot hear the audio. The word "closed" means they can be toggled on or off by the viewer (as opposed to "open" captions, which are burned into the video and always visible).

Subtitles are a text translation of the spoken dialogue only. They assume the viewer can hear the audio — they just don't understand the language. Subtitles do not include sound effects, music descriptions, or speaker identification. A French film with English subtitles is a classic example: the viewer hears the French audio; the subtitles provide the English meaning of what is being said.

When does this distinction matter?

Practically speaking, on most video platforms (YouTube, Vimeo, social media) the terms are used interchangeably by the general public. But the distinction matters in three contexts:

  1. Legal compliance. Accessibility laws require captions, not subtitles. The ADA (Americans with Disabilities Act) and Section 508 of the Rehabilitation Act in the US mandate that video content be made accessible to people with hearing disabilities. "Accessible" means captions that include all audio information — not just speech.
  2. International distribution. When localising video for foreign markets, you are producing subtitles — translating speech into another language. This is a different production workflow from captioning.
  3. File format and workflow. Both captions and subtitles typically use .SRT or .VTT files with timestamps. But caption files for professional use sometimes include speaker labels and sound effect notations that subtitle files omit.
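To make the file-format point concrete, here is a minimal Python sketch of reading an .SRT file into timed cues. The sample cues, their timings, and the names Cue and parse_srt are illustrative assumptions rather than the output of any particular tool, and real files can have quirks (byte-order marks, missing index lines) that this sketch does not handle.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    start: str  # "HH:MM:SS,mmm"
    end: str
    text: str


# An SRT timing line looks like: 00:00:01,000 --> 00:00:03,000
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text: str) -> list[Cue]:
    """Split SRT text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # need an index line, a timing line, and some text
        match = TIMING.match(lines[1].strip())
        if not match:
            continue  # skip blocks without a valid timing line
        cues.append(
            Cue(int(lines[0].strip()), match.group(1), match.group(2), "\n".join(lines[2:]))
        )
    return cues


# Hypothetical sample: a caption file carries sound notations and speaker labels.
SAMPLE = """1
00:00:01,000 --> 00:00:03,000
[door creaks open]

2
00:00:03,500 --> 00:00:06,000
SARAH: I never should have come back here.
"""

for cue in parse_srt(SAMPLE):
    print(cue.start, cue.end, cue.text)
```

A subtitle file for the same clip would use the same structure and timings; only the text lines would differ.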

Legal requirements for captions

ADA (Americans with Disabilities Act). Applies to places of public accommodation. Courts have increasingly interpreted websites and streaming video as places of public accommodation, requiring captions for video content. Several high-profile lawsuits against streaming services and educational institutions have resulted in mandatory captioning requirements.

Section 508. Applies to federal agencies and organisations receiving federal funding. All video content — whether on websites, internal systems, or distributed digitally — must be captioned.

CVAA (Twenty-First Century Communications and Video Accessibility Act). Requires that video programming shown on television with captions must also have captions when distributed online.

WCAG 2.1 guidelines. Web Content Accessibility Guidelines specify that pre-recorded audio content in video must have captions (Success Criterion 1.2.2, Level A). This is the baseline standard adopted by most international accessibility frameworks.

The practical implication: if you are a business publishing video on your website, an educational institution posting lecture recordings, or any organisation covered by these laws, you need captions — not just subtitles.

Open captions vs closed captions

A further distinction worth knowing:

  • Closed captions are delivered as a separate text track that viewers toggle on or off. They appear as an overlay that can be styled by the viewer's device settings. This is the standard for broadcast TV and most video platforms.
  • Open captions are baked into the video image itself — they are always visible and cannot be turned off. Instagram and TikTok content often uses open captions because those platforms do not always render separate caption tracks reliably.

How to create captions using a transcript

The most reliable workflow for creating accurate captions:

  1. Get the transcript. Use TranscribeVideo.ai to transcribe your video from a URL. The tool generates a time-coded transcript automatically.
  2. Export as .SRT. The .SRT format includes timestamps that sync text segments to the video timeline. Most video platforms (YouTube, Vimeo, Wistia) and editing tools (Premiere Pro, Final Cut) accept .SRT caption files directly.
  3. Add non-speech audio notations. For true accessibility compliance, review the transcript and add descriptions of significant non-speech sounds: [upbeat music], [phone ringing], [crowd cheering]. This step is important if you need to meet ADA or Section 508 requirements.
  4. Upload to your platform. On YouTube, go to YouTube Studio → Subtitles → Add → Upload file. On Vimeo and Wistia, the process is similar.
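If you ever need to assemble the .SRT yourself (for example, after inserting sound descriptions by hand), the steps above can be sketched in Python as follows. The segment data, the timings, and the function names format_timestamp and build_srt are assumptions made for illustration; this is not how TranscribeVideo.ai or any other tool produces its export.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(segments: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) segments."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical segments: the first was added by hand during review (step 3).
segments = [
    (0.0, 2.0, "[upbeat music]"),
    (2.0, 5.5, "Welcome back to the channel."),
]
print(build_srt(segments))
```

Keeping the sound descriptions as their own timed segments makes them easy to find later if you want a dialogue-only version.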

Creating subtitles for foreign language distribution

If you need subtitles in another language rather than captions in the same language:

  1. Get the English transcript (or the source language transcript) using TranscribeVideo.ai
  2. Translate the transcript using DeepL (best quality for most language pairs) or ChatGPT for less common languages
  3. Format the translated text as an .SRT file with the same timestamps as the original
  4. Upload to your platform as a separate subtitle track for that language

YouTube supports multiple subtitle/caption tracks on a single video, so you can have English captions, Spanish subtitles, and French subtitles all available on the same upload.
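Step 3 of the subtitle workflow (reusing the original timestamps) can be sketched in Python as below. It assumes you already have one translated string per cue, in order; the function name retime_translation and the sample text are hypothetical, and the translation itself happens outside this code.

```python
import re


def retime_translation(source_srt: str, translated_lines: list[str]) -> str:
    """Swap translated text into an SRT file, keeping index and timing lines."""
    blocks = re.split(r"\n\s*\n", source_srt.strip())
    if len(blocks) != len(translated_lines):
        raise ValueError("need exactly one translation per cue")
    output = []
    for block, translation in zip(blocks, translated_lines):
        # The first two lines of each block are the cue index and the timing.
        index_line, timing_line = block.splitlines()[:2]
        output.append(f"{index_line}\n{timing_line}\n{translation}")
    return "\n\n".join(output) + "\n"


english = """1
00:00:03,500 --> 00:00:06,000
I never should have come back here.
"""
spanish = retime_translation(english, ["Nunca debí haber regresado aquí."])
print(spanish)
```

Translated sentences are often longer or shorter than the source, so a quick read-through against the video is still worthwhile after retiming.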

Side-by-side: what each format actually contains

To make the distinction concrete, here is how the same 10-second video clip would appear as captions versus subtitles:

Original audio: A woman walks through a doorway. We hear the door creak open. Off-screen, ominous music begins. She says: “I never should have come back here.”

Closed captions (English, for hearing-impaired viewers):

[door creaks open]
[ominous music plays]
SARAH: I never should have come back here.

Subtitles (Spanish, for Spanish-speaking viewers):

Nunca debí haber regresado aquí.

Both tracks appear at the same timestamp. The captions describe everything that is happening sonically; the subtitles only translate the dialogue. A deaf Spanish-speaking viewer would need both kinds of information, in Spanish.
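The relationship between the two tracks can be sketched in Python: stripping the bracketed sound descriptions and the speaker label from the caption text leaves the dialogue that a subtitle track would translate. This is a rough heuristic under stated assumptions (sounds are in square brackets, speaker labels are ALL CAPS followed by a colon); the function name dialogue_only is hypothetical.

```python
import re

SOUND_NOTATION = re.compile(r"\[[^\]]*\]")         # e.g. [door creaks open]
SPEAKER_LABEL = re.compile(r"^[A-Z][A-Z .'-]*:\s*")  # e.g. SARAH:


def dialogue_only(caption_text: str) -> str:
    """Return only the spoken dialogue from caption text."""
    kept = []
    for line in caption_text.splitlines():
        line = SOUND_NOTATION.sub("", line).strip()
        line = SPEAKER_LABEL.sub("", line).strip()
        if line:
            kept.append(line)
    return "\n".join(kept)


captions = (
    "[door creaks open]\n"
    "[ominous music plays]\n"
    "SARAH: I never should have come back here."
)
print(dialogue_only(captions))
```

The result is still in the source language; it is the input to translation, not a finished subtitle.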

SDH — the third option you should know about

SDH stands for “Subtitles for the Deaf and Hard of Hearing.” SDH bridges the gap between subtitles and closed captions: it is a subtitle track, often in a translated language, that also includes the non-speech audio descriptions that closed captions provide. Netflix, Disney+, Apple TV, and other streaming platforms now distinguish between three options:

  • Subtitles (e.g., Spanish): Dialogue translation only.
  • SDH (e.g., Spanish [SDH] or Spanish for Deaf and Hard of Hearing): Dialogue translation plus sound-effect descriptions and speaker IDs, in the same target language.
  • Closed captions (English): Same-language dialogue plus sound effects and speaker IDs.

If you're distributing video on a streaming platform and want to be both internationally accessible and disability-accessible, you need both subtitles and SDH tracks for each target language. This is why streaming originals often have 10+ caption/subtitle tracks per title.
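The track count adds up quickly, as this small Python sketch shows. The function name plan_tracks, the "[CC]" and "[SDH]" label style, and the list of languages are illustrative assumptions, not any platform's actual naming scheme.

```python
def plan_tracks(source_language: str, target_languages: list[str]) -> list[str]:
    """List the text tracks needed: source captions plus subtitles and SDH per target."""
    tracks = [f"{source_language} [CC]"]
    for language in target_languages:
        tracks.append(language)             # dialogue translation only
        tracks.append(f"{language} [SDH]")  # translation plus sound descriptions
    return tracks


tracks = plan_tracks("English", ["Spanish", "French", "German", "Japanese", "Portuguese"])
print(len(tracks))  # 11: one caption track plus two tracks for each of five languages
```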

Captions vs subtitles vs transcripts

One more term often confused with these two: a transcript is the full text content of a video as a single document, without timestamps tied to playback. Transcripts are useful for:

  • Reading through video content without watching
  • SEO content (search engines index transcript text)
  • Repurposing video into blog posts, articles, or newsletters
  • Searchable archives of long-form content (podcasts, lectures, interviews)

A captions or subtitles file is technically a transcript with timestamps. But a transcript on its own — without timestamps and without the timed display — is a different deliverable for different uses. TranscribeVideo.ai outputs both: the text-only transcript for reading and an SRT file with timestamps for video use.
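Going from a timed file to a plain transcript is mostly a matter of dropping the index and timing lines. A minimal Python sketch, assuming well-formed SRT input (the function name srt_to_transcript and the sample text are hypothetical):

```python
import re


def srt_to_transcript(srt_text: str) -> str:
    """Join the text lines of every SRT cue into one untimed paragraph."""
    parts = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        # lines[0] is the cue index and lines[1] the timing; the rest is text.
        parts.extend(line.strip() for line in lines[2:] if line.strip())
    return " ".join(parts)


srt = """1
00:00:01,000 --> 00:00:03,000
Captions are timed.

2
00:00:03,500 --> 00:00:06,000
Transcripts are not.
"""
print(srt_to_transcript(srt))
```

Going the other way is harder: a plain transcript has no timing information, so it has to be aligned against the audio before it can serve as captions.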

Why Gen Z watches everything with captions on

Caption usage has surged among hearing viewers over the past several years, especially Gen Z. Surveys commonly report that a large majority of Gen Z viewers (figures of 70-80% are often quoted) watch video content with captions enabled, even when there's no language barrier and no hearing impairment. Reasons commonly cited:

  • Sound-off viewing in public. Phone in coffee shop, train, airport, open office.
  • Comprehension. Modern TV mixes prioritise atmospheric sound over dialogue clarity. Captions catch what the audio mix loses.
  • Multitasking. Watching while doing something else; captions allow looking away while still following.
  • Accent unfamiliarity. Captions help with regional accents the viewer is unfamiliar with — even within their native language.
  • Vocabulary. Captions display unfamiliar words spelled out, easier than parsing them by ear.

The practical implication for content creators: captions are no longer just an accessibility feature. They're a core part of how video is consumed. A captioned video reaches more of your audience and gets watched longer than an uncaptioned one — across every demographic.

Caption file formats — SRT, VTT, SCC, TTML

The text content of captions and subtitles is delivered in standardised file formats. The four you'll encounter:

  • SRT (.srt): SubRip Subtitle. The most universally supported format. Plain text. Used by YouTube, Vimeo, video editors, and most streaming platforms.
  • VTT (.vtt): Web Video Text Tracks (WebVTT). The format used by the HTML5 <track> element and modern web embeds. Similar to SRT but with CSS styling support and a WEBVTT header.
  • SCC (.scc): Scenarist Closed Caption. Used in professional broadcast workflows; mirrors the CEA-608 byte stream of analog TV captions.
  • TTML / DFXP (.xml): XML-based format used by Netflix, Disney+, and high-end streaming. Supports rich styling, positioning, and multiple languages in a single file.

For most use cases (uploading to YouTube, importing into a video editor, or publishing through a hosted player such as Vimeo or Wistia) SRT is the right format. VTT is needed when you serve captions yourself through the HTML5 <track> element, for example in a custom video player. SCC and TTML are for professional broadcast and streaming workflows respectively.
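For simple files, converting SRT to VTT mostly means adding the WEBVTT header and switching the millisecond separator in timestamps from a comma to a period. The Python sketch below does only that; the function name srt_to_vtt is hypothetical, and it does not touch styling or positioning.

```python
import re

# Matches an SRT timing line such as 00:00:01,000 --> 00:00:03,000
TIMING_LINE = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert basic SRT text to WebVTT."""
    # WebVTT uses a period before the milliseconds instead of a comma.
    body = TIMING_LINE.sub(r"\1.\2 --> \3.\4", srt_text.strip())
    # The numeric SRT indices are left in place; WebVTT accepts them as cue identifiers.
    return "WEBVTT\n\n" + body + "\n"


print(srt_to_vtt("1\n00:00:01,000 --> 00:00:03,000\n[door creaks open]\n"))
```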

Common mistakes when adding captions to video

  • Auto-captions never reviewed. YouTube's auto-captions are often reasonably accurate on clear audio (90-95% is a commonly quoted range), but they regularly get proper nouns wrong (people's names, places, brands). Always review before publishing.
  • Missing sound effect descriptions. A caption file without [music], [door slams], [laughter] doesn't meet ADA accessibility requirements. It's subtitles, not captions.
  • Caption blocks too long. Each visible caption block should be no more than 2 lines and stay on screen for no more than ~7 seconds. Long blocks cause readers to lose track.
  • Speaker IDs missing in dialogue scenes. When two or more people speak, the viewer needs to know who. Use ALL CAPS speaker labels (JOHN:, SARAH:) or hyphens for back-and-forth.
  • Captions in burned-in form when toggleable would be better. Burned-in captions can't be turned off, sized up, or styled by the viewer. Use them only when the platform doesn't support caption tracks (older social platforms, embedded video in some contexts).
  • No captions on short clips. Even a 15-second teaser needs captions if the parent video does. Accessibility requirements do not shrink with length.
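Some of these mistakes can be caught automatically before upload. The Python sketch below flags cues with more than two lines of text, cues that stay on screen longer than seven seconds, and files with no bracketed sound descriptions at all. The thresholds come from the list above; the function names and warning wording are assumptions, and a check like this supplements human review rather than replacing it.

```python
import re

TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)
MAX_LINES = 2      # lines of text per caption block
MAX_SECONDS = 7.0  # time on screen per caption block


def to_seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def lint_captions(srt_text: str) -> list[str]:
    """Return human-readable warnings for common caption problems."""
    warnings = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue
        match = TIMING.match(lines[1].strip())
        if not match:
            continue
        parts = match.groups()
        duration = to_seconds(*parts[4:]) - to_seconds(*parts[:4])
        text_lines = lines[2:]
        if len(text_lines) > MAX_LINES:
            warnings.append(f"cue {lines[0]}: {len(text_lines)} lines of text")
        if duration > MAX_SECONDS:
            warnings.append(f"cue {lines[0]}: on screen for {duration:.1f}s")
    if "[" not in srt_text:
        warnings.append("no bracketed sound descriptions found")
    return warnings


sample = "1\n00:00:01,000 --> 00:00:10,000\nline one\nline two\nline three\n"
for warning in lint_captions(sample):
    print(warning)
```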

Quick decision: do I need captions, subtitles, or both?

Use case | What you need
YouTube channel, English-only audience | Closed captions in English
Educational institution, students globally | Captions + subtitles in 1-3 target languages
U.S. business website with video | Closed captions required (ADA Title III)
Federal agency / contractor | Closed captions required (Section 508)
Streaming platform original | Captions + subtitles + SDH per language
Foreign film, English-speaking audience | English subtitles
TikTok or Instagram Reel | Open captions (burned in for sound-off viewing)
Live event, mixed audience | Live captions via Otter, Zoom, or Google Meet

TranscribeVideo.ai Editorial Team

TranscribeVideo.ai is built by a team focused on making video content accessible through AI transcription. We test every feature we write about.