What Is a VTT File?
A VTT file is a plain-text subtitle file in the WebVTT format — short for Web Video Text Tracks. WebVTT is the W3C standard for subtitles, captions, chapters, and metadata associated with HTML5 video. If you've added subtitles to a video on a webpage using the HTML <track> element, the file you uploaded was almost certainly VTT.
Definition
VTT (Web Video Text Tracks) is a subtitle format specified by the W3C for use with HTML5 video. It was developed by the WHATWG and W3C in the early 2010s to give the open web a native subtitle format with features the older SRT format lacks: styling, positioning, and per-cue settings. The HTML5 <track> element expects WebVTT content; browsers will not load an SRT file as a track. By convention the file uses the .vtt extension and is served with the text/vtt MIME type.

A VTT file is plain UTF-8 text. It opens in any text editor (Notepad, TextEdit, VS Code) and can be edited by hand. Structurally, a VTT file starts with the literal string WEBVTT on the first line, followed by an optional header block, then a sequence of timed cue blocks. Each cue is a timestamp range plus the text to display.

Compared to SRT, the more universally supported format, VTT adds an explicit file header, supports cue settings for positioning and alignment, allows CSS styling, and uses periods (rather than commas) to separate seconds from milliseconds in timecodes.
What a VTT file looks like
A minimal valid VTT file is just the WEBVTT header followed by cues. Here is a complete example with three subtitle cues:
WEBVTT

1
00:00:00.000 --> 00:00:03.500
Welcome to the tutorial.

2
00:00:03.600 --> 00:00:07.200
Today we're going to talk about WebVTT files.

3
00:00:07.300 --> 00:00:11.000
Let's start with the basic structure.
Notice three things that distinguish VTT from SRT:
- The WEBVTT header on the first line is required, not optional. Without it, browsers will reject the file.
- Periods, not commas, between seconds and milliseconds in the timecodes (00:00:03.500 in VTT vs 00:00:03,500 in SRT).
- Cue identifiers are optional: the leading numbers (1, 2, 3) can be omitted in VTT. SRT requires them.
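To make the structure concrete, here is a minimal Python sketch that reads cues out of a VTT string. It is a simplified reader, not a spec-complete parser: the names `Cue` and `parse_vtt` are invented for this example, blocks without a timing line (header text, STYLE, NOTE) are simply skipped, and times are kept as integer milliseconds.

```python
import re
from typing import List, NamedTuple, Optional

# hh: is optional in WebVTT timestamps; milliseconds follow a period.
TIMESTAMP = r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
TIMING_LINE = re.compile(rf"^{TIMESTAMP}[ \t]+-->[ \t]+{TIMESTAMP}(.*)$")


class Cue(NamedTuple):
    identifier: Optional[str]  # None when the cue has no identifier line
    start_ms: int
    end_ms: int
    settings: str              # raw cue settings, e.g. "line:0 align:start"
    text: str


def _ms(hours, minutes, seconds, millis) -> int:
    return ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_vtt(source: str) -> List[Cue]:
    lines = source.lstrip("\ufeff").splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        raise ValueError("line 1 must start with WEBVTT")
    body = "\n".join(lines[1:]).strip("\n")
    cues = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n{2,}", body):
        block_lines = block.split("\n")
        # The timing line is either the first line or follows an identifier.
        for index in range(min(2, len(block_lines))):
            match = TIMING_LINE.match(block_lines[index])
            if match:
                break
        else:
            continue  # header text, STYLE, NOTE, or anything else without timing
        cues.append(Cue(
            identifier=block_lines[0] if index == 1 else None,
            start_ms=_ms(*match.group(1, 2, 3, 4)),
            end_ms=_ms(*match.group(5, 6, 7, 8)),
            settings=match.group(9).strip(),
            text="\n".join(block_lines[index + 1:]),
        ))
    return cues
```

Because the identifier is detected by position rather than required, the same function handles files with and without the leading numbers.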
VTT with cue settings (positioning)
VTT supports placing cues anywhere on the video, not just centered at the bottom:
WEBVTT

00:00:00.000 --> 00:00:03.500 line:0 align:start
[narrator at top of screen]

00:00:04.000 --> 00:00:07.000 line:90% align:center
Standard subtitle position.
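The line: setting controls vertical position: a line number such as 0 puts the cue on the first line at the top, and a percentage such as 90% places it that far down the video. align: controls how the text is aligned (start, center, end, left, or right). SRT has no equivalent.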
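Cue settings are space-separated name:value pairs on the timing line, which makes them easy to inspect in code. The sketch below splits a settings string into a dictionary and checks the align value. The function name `parse_cue_settings` is made up for this example, and raising on bad input is a linting choice; browsers generally ignore settings they cannot parse.

```python
from typing import Dict

# Alignment keywords accepted by the align: setting.
ALIGN_VALUES = {"start", "center", "end", "left", "right"}


def parse_cue_settings(settings: str) -> Dict[str, str]:
    """Split 'line:0 align:start' into {'line': '0', 'align': 'start'}."""
    parsed: Dict[str, str] = {}
    for token in settings.split():
        name, separator, value = token.partition(":")
        if not (separator and name and value):
            raise ValueError(f"malformed cue setting: {token!r}")
        parsed[name] = value
    align = parsed.get("align")
    if align is not None and align not in ALIGN_VALUES:
        raise ValueError(f"unknown align value: {align!r}")
    return parsed
```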
VTT with styling
VTT supports CSS styling via the STYLE block:
WEBVTT

STYLE
::cue {
  background-color: rgba(0,0,0,0.8);
  color: yellow;
  font-family: sans-serif;
}

00:00:00.000 --> 00:00:03.000
Yellow text on dark background.
This is a major capability advantage over SRT, which has no native styling support.
How to open a VTT file
VTT files are plain text, so opening one for reading or editing requires no special tool.
Open as text
- Windows: Right-click the .vtt file → Open with → Notepad. Or use VS Code, Notepad++.
- macOS: Right-click → Open With → TextEdit. Or VS Code, BBEdit, Sublime Text.
- Linux: cat filename.vtt, or any text editor.
If TextEdit on Mac shows garbled characters, the file is probably not UTF-8. Open it in an editor such as VS Code, which can reopen a file with a different encoding and save it back as UTF-8.
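If you would rather fix the encoding in a script, a small Python sketch like the following can help. It tries UTF-8 first (dropping a byte-order mark if present) and only then falls back to another encoding. The function name `read_subtitle_text` and the cp1252 fallback are assumptions for illustration; use whatever encoding the file was actually saved in.

```python
def read_subtitle_text(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode subtitle bytes, preferring UTF-8 and stripping any BOM."""
    try:
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: legacy files are often Windows-1252; pass another
        # codec name if you know the real source encoding.
        return raw.decode(fallback)
```

To repair a file, read it in binary mode, pass the bytes through this function, and write the result back with `encoding="utf-8"`.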
Use with a video
- HTML5 video on a webpage: Add to your video element with the <track> tag: <track src="captions.vtt" kind="captions" srclang="en" label="English">
- VLC Media Player: Open the video, drag the .vtt file onto it, or use Subtitle → Add Subtitle File. VLC handles VTT natively.
- YouTube: YouTube accepts VTT uploads (and SRT). YouTube Studio → Subtitles → Add → Upload file.
- Vimeo, Wistia: Both accept VTT directly via their captions upload tool.
- Premiere Pro / Final Cut / DaVinci Resolve: Some workflows accept VTT directly; others want SRT. SRT is more universally supported in editors.
If you have a VTT file and your tool only accepts SRT, conversion is quick: remove the WEBVTT header, change the period to a comma in each timecode, number the cues, and drop any cue settings or STYLE blocks, which SRT does not understand. There are also free online converters that do it in one click.
How to create a VTT file
Three common workflows:
1. Auto-generate from a video
The fastest path. Tools like TranscribeVideo.ai (for YouTube/TikTok/Instagram URLs), Whisper (open-source, file-based), Otter, Rev, and 3Play Media all output VTT or produce SRT that converts to VTT trivially. AI accuracy on clear speech is now often in the 90-95% range. For accessibility-grade captions, budget a 15-30 minute human review pass for each hour of video.
2. Hand-write in a text editor
For a short clip, hand-writing a VTT file is fine. The format is simple. Open a text editor, type WEBVTT on line 1, leave a blank line, then add timed cues following the pattern in the example above. Save with the .vtt extension as UTF-8 encoding.
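The same steps can be scripted. The Python sketch below builds the text of a VTT file from a list of cues given in milliseconds; `format_timestamp` and `build_vtt` are names chosen for this example, not part of any library.

```python
from typing import Iterable, Tuple


def format_timestamp(ms: int) -> str:
    """Render milliseconds as hh:mm:ss.ttt, with a period before the millis."""
    hours, remainder = divmod(ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_vtt(cues: Iterable[Tuple[int, int, str]]) -> str:
    """cues is an iterable of (start_ms, end_ms, text) tuples."""
    blocks = ["WEBVTT"]  # header must be the first line
    for start_ms, end_ms, text in cues:
        if end_ms <= start_ms:
            raise ValueError("cue must end after it starts")
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{timing}\n{text}")
    # A blank line separates the header and each cue.
    return "\n\n".join(blocks) + "\n"
```

Write the result with `open("captions.vtt", "w", encoding="utf-8")` so the file is saved as UTF-8.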
3. Use a subtitle editor
For longer content with precise timing requirements, a dedicated tool is faster:
- Aegisub (free, all platforms) — professional-grade subtitle editor with waveform view.
- Subtitle Edit (free, Windows) — popular, supports both SRT and VTT natively.
- Kapwing (web-based, freemium) — good for quick caption editing in-browser.
- HappyScribe Online VTT Editor — free web tool for editing VTT specifically.
Critical details when creating VTT
- Always start with WEBVTT on line 1. Without this, browsers won't load the file.
- Use periods, not commas, in timecodes (00:00:01.500, not 00:00:01,500).
- UTF-8 encoding required. Non-ASCII characters break in other encodings.
- Blank line between cues. Required by the spec.
- Cue ID is optional. Unlike SRT, you can omit the leading number.
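The checklist above can be turned into a rough lint pass. The following Python sketch checks only those points (encoding, header, comma timecodes, blank line before each cue) and is not a full WebVTT validator; the function name `lint_vtt` and the message wording are invented for this example.

```python
import re
from typing import List

SRT_STYLE_TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3}\s+-->")
VTT_TIMING = re.compile(
    r"^(?:\d+:)?\d{2}:\d{2}\.\d{3}[ \t]+-->[ \t]+(?:\d+:)?\d{2}:\d{2}\.\d{3}"
)


def lint_vtt(raw: bytes) -> List[str]:
    try:
        text = raw.decode("utf-8-sig")  # tolerate a leading BOM
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    problems: List[str] = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        problems.append("line 1 must start with WEBVTT")
    for number, line in enumerate(lines, start=1):
        if SRT_STYLE_TIMING.search(line):
            problems.append(f"line {number}: comma in timecode, use a period")
        elif VTT_TIMING.match(line):
            # A timing line must open its block, or follow a single
            # identifier line that itself follows a blank line.
            previous = lines[number - 2] if number >= 2 else ""
            before_that = lines[number - 3] if number >= 3 else None
            if previous != "" and before_that != "":
                problems.append(f"line {number}: missing blank line before this cue")
    return problems
```

An empty list means none of these specific problems were found, not that the file is guaranteed valid.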
VTT vs SRT — which should you use?
Both formats serve the same purpose but fit different contexts. The decision is usually obvious once you know what platform you're targeting.
Use VTT when
- Embedding subtitles in HTML5 video on your own website using the <track> element
- You need positioning control (subtitle anywhere on the screen, not just bottom-center)
- You need CSS styling (custom fonts, colors, backgrounds)
- You're working with a tool or framework that requires VTT specifically (Vimeo player API, JW Player, Brightcove)
Use SRT when
- Uploading to YouTube — both work, but SRT is more universally accepted
- Importing into video editors (Premiere Pro, Final Cut, DaVinci Resolve, iMovie)
- Distributing across many platforms — SRT has wider compatibility
- You don't need styling or positioning
Quick conversion
Converting SRT to VTT is mechanical: add WEBVTT as the first line followed by a blank line, then change all commas in timecodes to periods. That's it. Most online converters do this; you can also do it in any text editor with find-and-replace.
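As a sketch of that mechanical conversion in Python: the regular expression below touches only timing lines, so commas inside the dialogue are left alone. The function name `srt_to_vtt` is chosen for this example.

```python
import re

# Matches an SRT timing line such as "00:00:03,600 --> 00:00:07,200".
SRT_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3})(\s+-->\s+)(\d{2}:\d{2}:\d{2}),(\d{3})",
    re.MULTILINE,
)


def srt_to_vtt(srt_text: str) -> str:
    # Normalise line endings and drop a BOM or stray leading blank lines.
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip("\n")
    # Swap the comma for a period in both timestamps of each timing line.
    body = SRT_TIMING.sub(r"\1.\2\3\4.\5", body)
    # Header on line 1, then a blank line, then the cues.
    return "WEBVTT\n\n" + body + "\n"
```

The SRT sequence numbers are kept; VTT treats them as optional cue identifiers.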
Going the other way (VTT to SRT) is also straightforward: remove the WEBVTT header, change periods to commas in timecodes, and ensure each cue has a sequence number.
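A Python sketch of the reverse direction, under a few simplifying assumptions: blocks without a timing line (the header, STYLE, NOTE) are dropped, cue settings are discarded, cues are renumbered from 1, and inline tags such as <c.classname> in the cue text are left untouched. The function name `vtt_to_srt` is chosen for this example.

```python
import re

VTT_TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2})\.(\d{3})[ \t]+-->[ \t]+((?:\d+:)?\d{2}:\d{2})\.(\d{3})"
)


def _srt_time(clock: str, millis: str) -> str:
    # VTT may omit the hours; SRT always writes hh:mm:ss,ttt.
    if clock.count(":") == 1:
        clock = "00:" + clock
    return f"{clock},{millis}"


def vtt_to_srt(vtt_text: str) -> str:
    text = vtt_text.lstrip("\ufeff").replace("\r\n", "\n").strip("\n")
    cues = []
    for block in re.split(r"\n{2,}", text):
        lines = block.split("\n")
        # The timing line is the first line, or the second after an identifier.
        for index, line in enumerate(lines[:2]):
            match = VTT_TIMING.match(line)
            if match:
                break
        else:
            continue  # WEBVTT header, STYLE, NOTE: nothing SRT can use
        timing = (f"{_srt_time(match.group(1), match.group(2))} --> "
                  f"{_srt_time(match.group(3), match.group(4))}")
        # Rebuilding the timing line drops any cue settings after it.
        cues.append("\n".join([str(len(cues) + 1), timing, *lines[index + 1:]]))
    return "\n\n".join(cues) + "\n"
```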
For a deeper comparison, see SRT vs VTT in detail.
Browser support for VTT
WebVTT is supported in every major modern browser. The HTML5 <track> element loads VTT natively without any JavaScript or library:
| Browser | VTT support | Notes |
|---|---|---|
| Chrome | Yes (since v23) | Full support including styling and cue settings |
| Firefox | Yes (since v31) | Full support |
| Safari | Yes (since v6) | Full support; some quirks with STYLE blocks on older versions |
| Edge | Yes | Full support (Chromium-based since 2020) |
| iOS Safari | Yes | Renders captions in the native iOS overlay style |
| Android Chrome | Yes | Full support |
Minimal HTML5 example
<video controls width="640">
  <source src="video.mp4" type="video/mp4">
  <track src="captions.vtt"
         kind="captions"
         srclang="en"
         label="English"
         default>
</video>
The kind attribute can be: captions (same-language for hearing-impaired), subtitles (translation), descriptions (audio descriptions for visually impaired), chapters (chapter navigation), or metadata (programmatic data). The browser renders captions and subtitles automatically; descriptions, chapters, and metadata require JavaScript to surface.
Beyond captions — what else VTT can do
VTT was designed to handle more than just subtitles. This section covers four kinds of tracks the format supports:
Captions
Same-language text describing all audio (speech + sound effects + speaker IDs). For deaf and hard-of-hearing viewers. Most VTT files in the wild are caption tracks.
Subtitles
Translation of dialogue into another language for hearing viewers who don't speak the original. Same VTT format, different kind attribute (kind="subtitles").
Chapters
VTT can define chapter markers in a long video. Each cue is a chapter with a start time and chapter title:
WEBVTT

00:00:00.000 --> 00:02:30.000
Introduction

00:02:30.000 --> 00:08:45.000
Setting up the project

00:08:45.000 --> 00:15:00.000
Writing the first feature
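HTML5 video players use these cues to render a chapter list or progress-bar markers. The browser parses the track natively and exposes the cues through the TextTrack API, so no external library is needed, though most players need a little JavaScript to draw the list or markers.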
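Chapter tracks are usually contiguous, so each chapter ends where the next begins. Here is a small Python sketch that derives the end times from a list of start times plus the video length; `build_chapters` and its parameters are hypothetical names for this example.

```python
from typing import List, Tuple


def _timestamp(ms: int) -> str:
    hours, remainder = divmod(ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_chapters(chapters: List[Tuple[int, str]], duration_ms: int) -> str:
    """chapters is a list of (start_ms, title); duration_ms is the video length."""
    ordered = sorted(chapters)
    blocks = ["WEBVTT"]
    for index, (start_ms, title) in enumerate(ordered):
        # Each chapter runs until the next one starts; the last runs to the end.
        is_last = index + 1 == len(ordered)
        end_ms = duration_ms if is_last else ordered[index + 1][0]
        blocks.append(f"{_timestamp(start_ms)} --> {_timestamp(end_ms)}\n{title}")
    return "\n\n".join(blocks) + "\n"
```

Called with starts of 0, 150000, and 525000 ms and a duration of 900000 ms, it reproduces the three-chapter example above.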
Metadata
Arbitrary data tied to specific video timestamps. Used for cue points in interactive video — triggering UI changes, displaying related content, syncing transcript scroll, or marking ad breaks. Read by JavaScript via the TextTrack API.
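A metadata cue's text can be anything the page's script knows how to read; single-line JSON is a common convention, not something the format requires. The Python sketch below generates such a cue. The function name `metadata_cue` and the payload fields (`type`, `id`) are made up for illustration.

```python
import json


def metadata_cue(start: str, end: str, payload: dict) -> str:
    """Return one cue block whose text is a compact, single-line JSON object."""
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True)
    # Cue text must not contain the arrow that marks a timing line.
    if "-->" in body:
        raise ValueError("payload must not contain '-->'")
    return f"{start} --> {end}\n{body}"


def build_metadata_track(cues: list) -> str:
    """cues is a list of (start, end, payload) with hh:mm:ss.ttt strings."""
    blocks = ["WEBVTT"] + [metadata_cue(*cue) for cue in cues]
    return "\n\n".join(blocks) + "\n"
```

On the page, a script would read each cue's text from the track and parse it as JSON when the cue becomes active.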
This versatility is why VTT remains the W3C-recommended format for HTML5 video, even though SRT remains more universal across non-web platforms.
Feature Comparison
| Feature | VTT (WebVTT) | SRT (SubRip) | TTML / DFXP |
|---|---|---|---|
| Header required | Yes (WEBVTT) | No | Yes (XML) |
| Timecode separator | Period (.) | Comma (,) | Period (.) |
| Cue ID required | No | Yes | No |
| Styling support | Yes (CSS) | Limited tags | Yes (rich) |
| Positioning control | Yes | No | Yes |
| HTML5 <track> compatible | Yes (native) | No (convert first) | No |
| Chapter tracks | Yes | No | Yes |
| YouTube accepts | Yes | Yes | Yes |
| Best for | HTML5 web video | Universal compatibility | Streaming platforms |
How It Works
1. Get the source content: automatic transcription via TranscribeVideo.ai, Whisper, or Otter, or hand-written from a script.
2. Save as plain text with the .vtt extension. The file must start with WEBVTT on line 1, use UTF-8 encoding, use periods (not commas) in timecodes, and have blank lines between cues.
3. Add cue settings if needed: line: and align: for positioning, STYLE blocks for CSS, NOTE blocks for comments. These are VTT-specific features SRT doesn't support.
4. Validate the file: paste it into a WebVTT validator, or load it in a quick test page and confirm the cues appear. Browsers fail silently if the file is invalid.
5. Embed in HTML5 video using the <track> element, or upload to a platform that accepts VTT (YouTube, Vimeo, Wistia, JW Player, Brightcove).
Why Use This Tool?
- ✓ Native HTML5 support: works in every major browser without JavaScript
- ✓ Plain text: readable, version-controllable, hand-editable
- ✓ Tiny file size: typically under 100 KB even for hour-long video
- ✓ Rich features: positioning, CSS styling, chapters, metadata
- ✓ AI-friendly: most transcription tools can output VTT
- ✓ Web-standard: defined by W3C, future-proof, browser-supported
Use Cases
- Adding captions to HTML5 video on your own website (the canonical use case)
- Embedding video with captions in a custom web player (JW Player, Brightcove, Wistia, Vimeo)
- Defining chapter markers for long-form video without external libraries
- Programmatic cue points in interactive video: sync transcript scrolling, trigger UI changes
- Subtitle tracks in multiple languages on a single HTML5 video element
- Audio descriptions for visually-impaired viewers via kind="descriptions"
Frequently Asked Questions
What does VTT stand for?
VTT stands for Web Video Text Tracks (often written WebVTT). It is a W3C-standardised plain-text format for subtitles, captions, chapters, and metadata associated with HTML5 video.
What's the difference between VTT and SRT?
VTT requires a WEBVTT header on the first line, uses periods instead of commas in timecodes (00:00:01.500 vs 00:00:01,500), and supports CSS styling and positioning. SRT is simpler with no header and is more universally supported across non-web platforms. VTT is the format the HTML5 <track> element loads; SRT is more compatible with video editors and uploads.
Can I open a VTT file with Notepad?
Yes. VTT is plain text. Open it with Notepad, TextEdit, VS Code, or any text editor. To play a VTT file alongside a video, use VLC Media Player or upload to YouTube/Vimeo.
Does YouTube accept VTT files?
Yes. YouTube accepts both VTT and SRT for caption uploads. In YouTube Studio, go to Subtitles → Add → Upload file and select your .vtt file. SRT is slightly more common but VTT works equally well.
Why does my VTT file not work in the browser?
The most common cause is a missing WEBVTT header — the very first line of the file must be the literal string WEBVTT, followed by a blank line, before any cues. Other common causes: commas instead of periods in timecodes, wrong encoding (must be UTF-8), or missing blank lines between cues. Browsers fail silently when VTT is malformed.
Can I convert SRT to VTT?
Yes — easily. Add WEBVTT as the first line of the file followed by a blank line, then change all commas in timecodes to periods. That's the entire conversion. Most online tools (HappyScribe, GoTranscript, Subtitle Tools) do this in one click. You can also do it manually with any text editor's find-and-replace.
Does VTT support styling?
Yes. VTT supports a STYLE block at the top of the file containing CSS rules that apply to ::cue selectors. You can change font, color, background, weight, and other text properties. Inline classes are also supported via <c.classname> syntax inside cue text. SRT has no equivalent.
Can VTT files be used outside HTML5 video?
Yes — VTT works wherever a player supports it. VLC Media Player, MPV, JW Player, Brightcove, Wistia, and many video editors accept VTT. For broader compatibility (especially older video editors and broadcast workflows), SRT remains the safer choice.