Wednesday, March 16, 2005
Debunking I-Frame Injection
Scene change detection is a popular concept in video compression, and in discussions of MPEG-1 or MPEG-2 we often hear that an encoder is 'capable' of I-frame 'injection' on scene breaks. Additionally, Final Cut Pro lets the videographer manually place compression markers on the timeline, which have the effect of inserting I-frames into the GOP structure of the MPEG file.

Based on my analysis, injecting I-frames at scene breaks is unnecessary, and in this post I'm going to show how the underlying compression structure of MPEG-1 and MPEG-2 makes this so.

First of all, let's consider the role of I, P and B frames in compression. I-frames stand alone; they rely on no other frames for proper decoding. P-frames are predicted in a single direction: they rely on the previous I or P frame for motion compensation. B-frames are placed between pairs of I or P frames (in display order). A B-frame can be reconstructed using information from both the I or P frame earlier in time and the I or P frame later in time.
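Because a B-frame may depend on a later I or P frame, the frames cannot be sent in display order: the later reference has to reach the decoder first. Here is a minimal sketch of that reordering. The function name `display_to_coded` and the one-letter-per-frame string are illustrative assumptions, not part of any MPEG tool.

```python
def display_to_coded(display):
    """Reorder a display-order string of frame types into coded order.

    Every I or P frame is emitted ahead of the B-frames that precede it
    in display order, because those B-frames may be predicted from it.
    """
    coded = []
    pending_b = []  # B-frames still waiting for their later I or P frame
    for index, frame_type in enumerate(display):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:  # I or P: emit it, then the B-frames that were waiting on it
            coded.append((index, frame_type))
            coded.extend(pending_b)
            pending_b = []
    coded.extend(pending_b)  # trailing B-frames with no later reference here
    return coded


if __name__ == "__main__":
    order = display_to_coded("IBBPBBP")
    print(" ".join(f"{frame_type}{index}" for index, frame_type in order))
```

For the display-order pattern IBBPBBP this prints `I0 P3 B1 B2 P6 B4 B5`: each P-frame jumps ahead of the two B-frames that are shown before it.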

Each of these three types of frames has an approximate cost in bits to the encoder. Generally speaking, if you study the statistics of IPB frame sizes, P-frames take about twice as many bits to compress as B-frames. And I-frames take twice as many bits to compress as P-frames. I-frames are the most expensive.
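To make those ratios concrete, here is a rough back-of-the-envelope sketch. It assumes the approximate 4:2:1 relationship described above (I = 4, P = 2, B = 1, measured in units of one B-frame) and a simplified GOP that starts on its I-frame in display order. The names `RELATIVE_COST`, `build_gop` and `average_cost` are hypothetical, and real frame sizes vary with content.

```python
# Approximate relative sizes, in units of one B-frame (an assumption
# taken from the rule of thumb above, not measured data).
RELATIVE_COST = {"I": 4, "P": 2, "B": 1}


def build_gop(length, anchor_spacing):
    """Build a display-order pattern: I first, P every `anchor_spacing` frames, B elsewhere."""
    frames = []
    for i in range(length):
        if i == 0:
            frames.append("I")
        elif i % anchor_spacing == 0:
            frames.append("P")
        else:
            frames.append("B")
    return "".join(frames)


def average_cost(pattern):
    """Average relative cost per frame for a GOP pattern."""
    return sum(RELATIVE_COST[f] for f in pattern) / len(pattern)


if __name__ == "__main__":
    for length in (6, 15):
        gop = build_gop(length, 3)
        print(f"{gop}: {average_cost(gop):.2f} B-frame units per frame")
```

Under these assumptions a 6-frame IBBPBB group averages about 1.67 units per frame, while a 15-frame group averages about 1.47, because the single expensive I-frame is spread over more frames.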

The prevailing notion is that by injecting an I-frame at the beginning of a scene, the scene will be coded most efficiently. I'm sure that was true back in the day prior to video codecs which used out-of-order bi-directional motion compensation.

Because of bi-directional motion compensation, an I-frame does not need to fall on the first frame of a new scene. Instead, empirically, we find that if a B-frame happens to fall on the first frame of a new scene, it will happily refer to the next upcoming I or P frame as its 'reference' frame. This obviates the need for the I-frame to be the first frame. In my analysis of compressed video, I find that in GOP structures with at least one B-frame (e.g. IBPBP), the bidirectional prediction, if properly performed by the encoder, will take the scene change without a hiccup.
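The choice a B-frame faces at a scene cut can be illustrated with a toy sketch. It assumes tiny one-dimensional "frames" of pixel values and uses a plain sum of absolute differences (SAD) with no motion search; a real encoder makes this decision per macroblock with motion vectors. The names `sad` and `choose_b_prediction` are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b))


def choose_b_prediction(prev_anchor, b_frame, next_anchor):
    """Pick the cheapest prediction source for a B-frame.

    Candidates: the earlier I/P frame, the later I/P frame, or the
    average of the two. Returns (source_name, residual_sad).
    """
    interpolated = [(p + n) // 2 for p, n in zip(prev_anchor, next_anchor)]
    candidates = {
        "from_previous_anchor": sad(b_frame, prev_anchor),
        "from_next_anchor": sad(b_frame, next_anchor),
        "interpolated": sad(b_frame, interpolated),
    }
    source = min(candidates, key=candidates.get)
    return source, candidates[source]


if __name__ == "__main__":
    old_scene_p = [10, 12, 11, 13]        # last I/P frame of the old scene
    first_new_b = [200, 198, 205, 201]    # B-frame sitting on the scene cut
    new_scene_p = [201, 199, 204, 200]    # next I/P frame, already in the new scene
    print(choose_b_prediction(old_scene_p, first_new_b, new_scene_p))
```

With these made-up values the B-frame on the cut is predicted from the later frame with a residual of only 4, which is why it need not cost more than its neighbours from the previous scene.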

In reality, unless we have another reason to insert an I-frame (e.g. a random-access point or chapter marker), we should really use the longest GOP pattern possible.

A P-frame's coding tools are a superset of an I-frame's: any macroblock in a P-frame can still be intra-coded. From the standpoint of compression efficiency, the P-frame is therefore preferable to the I-frame because of the additional savings afforded by motion compensation.

Blogged by MegaPEG under Debunking I-Frame Injection

The Myth of I-Frame Injection
To illustrate my point, I've taken a couple of grabs from the analysis feature of MegaPEG.X Pro Demo, as applied to a DVD title (popular TV show).

 

If you look at the figure to the left, you can see a light green circle around a scene change. Here we see the familiar IBBP GOP pattern taking a scene change one frame before the I-frame. Based on the popular notion, we would expect to see some significant variation in the sizes of the BBP groupings to either side of the I-frame. But that is clearly not the case. Instead, we see that the B-frame that sits directly on the scene change boundary is no larger than the earlier B-frames, which were simply part of the flow of the previous scene.

 

Here's another example from a rock video. This particular video is simply filled with great examples. Ironically, the hardware MPEG-2 encoder that compressed this particular video does, in fact, use I-frame injection. In this case we see that the scene change occurs directly after a P-frame. Notice that the P-frame derives its compression efficiency from the earlier-in-time (leftward) P-frame, and the BB pattern that follows is compressed against the later-in-time (rightward) P-frame. It is clear from the pattern of frame sizes that this poses no difficulty for the encoder whatsoever.

The mathematics of bi-directional motion compensation eliminates the need to place I-frames at scene changes. In reality, we want to use the longest GOP structure possible (within the constraints of the profile), thereby reducing the cost of I-frames, which are the least efficient to encode (they require the most bits).

Blogged by MegaPEG under The Myth of I-Frame Injection

 

 
 
 

 


Copyright © 1994-2008
All Rights Reserved
Digigami