Self-hosted videos with HLS: subtitles

Vincent Bernat

In a previous article, I described a solution to self-host videos while offering a delivery adapted to each user’s bandwidth, thanks to HLS and hls.js. Subtitles¹ were not part of the game. While they can be declared inside the HLS manifest or embedded into the video, it is easier to include them directly in the <video> element, using the WebVTT format:

<video poster="poster.jpg"
       controls preload="none">
  <source src="index.m3u8"
          type="application/vnd.apple.mpegurl">
  <source src="progressive.mp4"
          type='video/mp4; codecs="avc1.4d401f, mp4a.40.2"'>
  <track src="de.vtt"
         kind="subtitles" srclang="de" label="Deutsch">
  <track src="en.vtt"
         kind="subtitles" srclang="en" label="English">
</video>

Watch the following demonstration, featuring Agent 327: Operation Barbershop, a video created by Blender Animation Studio and currently released under the Creative Commons Attribution No Derivatives 2.0 license:

You may want to jump to 0:12 for the first subtitle. Most browsers should display a widget to toggle subtitles. This works just fine with Chromium, but Firefox will not show the menu until the video starts playing, unless you enable preloading. Another annoyance: there is no simple way to specify safe margins for subtitles, so they get stuck at the bottom of the frame. These two issues seem minor enough not to warrant pulling in hundreds of kilobytes of JavaScript for a custom player.

Update (2019-12)

If subtitles are served from another domain, the crossorigin="anonymous" attribute needs to be added to the <video> tag and the server hosting the subtitles needs to send the appropriate CORS headers. On some browsers, this also enables CORS for the video and the poster.
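
As a rough sketch of that setup, the markup could look like the following. The host media.example.com is a hypothetical placeholder and does not come from the article; only the crossorigin attribute is the point here.

```html
<!-- Sketch only: media.example.com is an assumed, hypothetical host. -->
<video poster="poster.jpg"
       controls preload="none"
       crossorigin="anonymous">
  <source src="https://media.example.com/index.m3u8"
          type="application/vnd.apple.mpegurl">
  <!-- Cross-origin track: fetched with CORS, without credentials -->
  <track src="https://media.example.com/en.vtt"
         kind="subtitles" srclang="en" label="English">
</video>
```

For this to work, the remote server would have to answer requests for the subtitle files with an Access-Control-Allow-Origin header that matches the page's origin (or is set to *). Since the attribute may also apply to the video and the poster on some browsers, those resources would likely need the same header.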

Update (2020-09)

The issue with Firefox and the subtitle menu is fixed in Firefox 76. On Android, subtitles work from Firefox 80.

  1. Some people may be picky about the difference between closed captions and subtitles. Closed captions are usually targeted at people with hearing impairment and they include non-speech information like sound effects. Subtitles assume the viewer can hear but may not understand the language.