Ogg Theora Cook Book

Introduction

Encoding is the process of creating a Theora file from raw, uncompressed source video material. If the source material exists in a compressed, non-raw form, an intermediate decoding step is needed before the Theora video can be created. This decode-then-encode process is usually called transcoding, although the term encoding is often used as a synonym.

Software programs that perform the encoding (or transcoding) are called encoders. Various Theora encoders exist, for example ffmpeg2theora and VLC (see http://en.wikipedia.org/wiki/Theora#Encoding), to name just two.

Before encoding, the user has to decide on at least two parameters:

  • the image quality of the created Theora file
  • the audio quality

Depending on the encoder used, more options might be available to control the encoding process:

  • clipping a configured amount of the frames' borders during encoding
  • rescaling the video resolution
  • changing the frame-rate of the video
  • handling the video-audio synchronization
  • setting the keyframe-interval

Video Quality, Bit-Rate and File Size

Most Theora encoders allow the user to directly specify the subjective quality of the encoded video, usually on a scale from 0 to 10. The higher the quality, the larger the resulting Theora file. Most encoders can alternatively be configured to encode for a given average target bitrate. While this option is useful for generating Theora video files for streaming, it sometimes yields sub-optimal quality.

Recent versions of some Theora encoders feature a two-pass encoding mode. Two-pass encoding allows the encoder to hit a configured target bitrate with the best video quality achievable at that bitrate, so the result should be comparable to quality-controlled encoding. The cost is roughly twice the encoding time. Because the whole source must be available before the second pass can start, live video cannot be encoded in two-pass mode.
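
The relationship between average bitrate, running time and file size is simple arithmetic, and it can help when choosing a target bitrate. The following Python sketch is only an illustration: the function names are made up for this example, the figures are arbitrary, and container overhead is ignored, so real files will come out slightly larger.

```python
def estimated_size_mb(video_kbps, audio_kbps, duration_s):
    """Approximate file size in megabytes (10**6 bytes), ignoring container overhead."""
    total_bits = (video_kbps + audio_kbps) * 1000 * duration_s
    return total_bits / 8 / 1_000_000


def video_kbps_for_size(target_mb, audio_kbps, duration_s):
    """Average video bitrate (kbit/s) that fits target_mb once the audio is accounted for."""
    total_kbps = target_mb * 1_000_000 * 8 / 1000 / duration_s
    return total_kbps - audio_kbps


# Hypothetical example: a 10-minute clip with 64 kbit/s audio.
print(estimated_size_mb(video_kbps=500, audio_kbps=64, duration_s=600))
print(video_kbps_for_size(target_mb=30, audio_kbps=64, duration_s=600))
```

With these example figures, 500 kbit/s of video plus 64 kbit/s of audio comes to roughly 42 MB, and a 30 MB budget leaves about 336 kbit/s for the video.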

Video Resolution and Frame-Rate

There can be good reasons to reduce the width and height (the video resolution) of your video when encoding to Theora:

  • If the encoder produces files that are too large, even at low quality settings around 0, reducing the video resolution will reduce the file size further
  • If your maximum file size forces a very low quality setting of 0 to 3 and the perceived quality is unacceptable, reducing the video resolution leaves more data per pixel, which can be spent on improving the quality
  • If playback of the encoded video should work even on low-performance computers, a lower video resolution makes decoding less demanding
  • If your source video material has a resolution higher than standard-definition video, it is better to reduce it, because the Theora video codec is not designed for high-definition video and might not perform very well at it
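
The trade-off between resolution and quality at a fixed size can be made concrete by looking at how many encoded bits are available per pixel. The sketch below is a rough illustration, not a quality metric used by any particular encoder; the function name and the example figures are assumptions chosen for this example.

```python
def bits_per_pixel(video_kbps, width, height, fps):
    """Average number of encoded bits available per pixel per frame."""
    return video_kbps * 1000 / (width * height * fps)


# Hypothetical example: the same 500 kbit/s budget at two resolutions.
full = bits_per_pixel(500, 720, 576, 25)
half = bits_per_pixel(500, 360, 288, 25)
print(f"{full:.3f} bits/pixel at 720x576")
print(f"{half:.3f} bits/pixel at 360x288")
```

Halving both width and height leaves a quarter of the pixels, so each remaining pixel gets four times as many bits from the same budget.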

If your source video material has an unusually low resolution, and you can spare the bits, increasing the video resolution during encoding might have a positive effect on overall video quality.

Adjusting the frame rate during encoding is generally a bad idea, as it often leads to jerky motion, which greatly reduces perceived quality. However, if you require a very low target file size, try reducing the frame rate to exactly half the source frame rate. This might reduce the file size sufficiently without degrading quality to an unacceptable level.
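
One way to see why exactly half the source frame rate is a comparatively safe choice is to look at which source frames survive the conversion. The sketch below assumes a simple converter that, for every output frame, shows the latest source frame available at that moment, without any blending; real encoders may behave differently, and the function name is made up for this example.

```python
def source_frame_steps(src_fps, dst_fps, n_out):
    """Distance, in source frames, between consecutive frames kept in the output."""
    # Output frame i is shown at time i / dst_fps; pick the source frame
    # that is current at that time (integer arithmetic avoids rounding issues).
    picks = [(i * src_fps) // dst_fps for i in range(n_out)]
    return [b - a for a, b in zip(picks, picks[1:])]


print(source_frame_steps(24, 12, 8))  # halving: every step has the same length
print(source_frame_steps(24, 15, 8))  # 24 -> 15 fps: steps of mixed length
```

When the rate is halved, every second frame is kept and motion stays evenly spaced. With a ratio such as 24 to 15, the steps alternate between one and two source frames, which is perceived as jerky motion.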

Clipping the Frames in the Video

Sometimes the source video material does not use the full video frame, leaving black borders around the picture. It is a good idea to remove black or otherwise unused parts of the frame, as this usually improves the quality and reduces the file size of the encoded Theora file. If possible, keep the video width and height at multiples of 16. The Theora format can store video with arbitrary frame sizes, but it is not very efficient at doing so.
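
The following Python sketch shows one possible way to turn measured border sizes into a crop rectangle whose width and height are multiples of 16. It is an illustration under stated assumptions: the function name and parameters are hypothetical, the border sizes are assumed to be known already, and the sketch trims a few extra picture pixels rather than keeping part of the border.

```python
def crop_rectangle(frame_w, frame_h, left, top, right, bottom, block=16):
    """Return (x, y, width, height) of a crop without borders, sized in multiples of block."""
    content_w = frame_w - left - right
    content_h = frame_h - top - bottom
    # Round the content size down to the nearest multiple of the block size.
    crop_w = content_w - content_w % block
    crop_h = content_h - content_h % block
    # Center the crop inside the content area.
    x = left + (content_w - crop_w) // 2
    y = top + (content_h - crop_h) // 2
    return x, y, crop_w, crop_h


# Hypothetical example: 720x576 frame with 60-pixel black bars at top and bottom.
print(crop_rectangle(720, 576, left=0, top=60, right=0, bottom=60))
```

In this example the picture area is 720x456. The width is already a multiple of 16, and the height is trimmed to 448 by removing four more rows at the top and at the bottom.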

Video-Audio Synchronization

In an ideal world, the encoder would just copy the video-audio synchronization of the source video material to the created Theora file. In practice, however, this is sometimes not possible. Theora video files must keep a constant frame rate throughout the whole file, and the playback speed of the audio tracks is constant as well. Some source video material, however, does not have a perfectly constant frame rate. Sometimes frames are simply missing from the source video, due to recording errors or as a result of using video cutting software.

In these cases, the encoding process must actively adjust audio-video synchronization. This is done either by duplicating and/or dropping frames in the video, or by changing the speed of the audio tracks.
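
The frame duplication and dropping described above can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any particular encoder: it assumes that the source frame timestamps are known in milliseconds, and the function name is made up for this example.

```python
def resample_to_constant_rate(timestamps_ms, fps, duration_ms):
    """For each output frame slot, return the index of the source frame to show."""
    chosen = []
    src = 0
    n_slots = duration_ms * fps // 1000
    for slot in range(n_slots):
        slot_time = slot * 1000 // fps
        # Advance to the last source frame at or before this slot's time.
        while src + 1 < len(timestamps_ms) and timestamps_ms[src + 1] <= slot_time:
            src += 1
        chosen.append(src)
    return chosen


# Hypothetical source at 10 fps with the frames at 300 ms and 400 ms missing.
print(resample_to_constant_rate([0, 100, 200, 500, 600], fps=10, duration_ms=700))
# Hypothetical source with one frame too many: the frame at 50 ms gets dropped.
print(resample_to_constant_rate([0, 50, 100], fps=10, duration_ms=200))
```

In the first example the frame at 200 ms is repeated to fill the gap, so the video keeps pace with the audio. In the second example the surplus frame never appears in the output.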

Keyframe-Interval

Many Theora encoders allow changing a parameter named keyframe interval. Keyframes are the frames in a video that a player can seek to directly during playback. To seek to any other point in the video, all frames from the preceding keyframe onward have to be decoded first. A larger keyframe interval reduces the file size without sacrificing quality, but it makes seeking coarser: in a video with 24 frames/second, a keyframe interval of 240 means that direct seeking is only possible with a granularity of 10 seconds. Cutting and concatenation of the encoded video are likewise limited to keyframe granularity.

As a rule of thumb, never set the keyframe interval to more than 10 times the target video frame rate.
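
The arithmetic behind the seeking granularity and the rule of thumb can be written down as a short Python sketch. The function names are made up for this example.

```python
def seek_granularity_seconds(keyframe_interval, fps):
    """Longest distance in seconds between two points a player can seek to directly."""
    return keyframe_interval / fps


def max_recommended_keyframe_interval(fps):
    """Upper bound from the rule of thumb: 10 times the target frame rate."""
    return int(10 * fps)


print(seek_granularity_seconds(240, 24))      # the example from the text: 10 seconds
print(max_recommended_keyframe_interval(25))  # upper bound for 25 frames/second
```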