4. Experiments
We encoded two video clips, Akiyo and Smile, with an MPEG-4 video encoder. Both clips contain slow motion, mostly facial expressions. We expect such sequences, in which most of the information is in the first frame, to be better suited to our technique than high-motion sequences, in which most of the information is in the motion and the prediction error.
The first two seconds of the Akiyo sequence, shown in Figure 5, were coded at 180×120 resolution and 3 fps. Analysis of the MPEG-4 bitstream shows that 6 Kbytes are used for encoding the first frame, 0.3 Kbytes for the motion vectors, and 2.5 Kbytes for the prediction error. With our method, the bits used for encoding the first frame can be removed, and the video sequence can be represented with 0.3 + 2.5 = 2.8 Kbytes, which is roughly 32% of the 8.8 Kbytes of coded video bits. This information fits in a single Version 40 QR code, which can encode up to 2,953 bytes of binary data [8].
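The bit budget above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original experiments: the function names are hypothetical, and it assumes 1 Kbyte = 1024 bytes, since the text does not state which convention is used.

```python
# Capacity of a Version 40 QR code in binary mode, as cited in the text [8].
QR_V40_BINARY_CAPACITY_BYTES = 2953

# Assumption: 1 Kbyte = 1024 bytes (the text does not say which convention it uses).
BYTES_PER_KBYTE = 1024


def side_info_budget(key_frame_kb, motion_kb, residual_kb):
    """Return (side_info_kb, fraction_of_total) when the key frame is not coded."""
    side_info_kb = motion_kb + residual_kb
    total_kb = key_frame_kb + side_info_kb
    return side_info_kb, side_info_kb / total_kb


def fits_in_single_qr(side_info_kb, capacity_bytes=QR_V40_BINARY_CAPACITY_BYTES):
    """Check whether the side information fits in one QR code of the given capacity."""
    return side_info_kb * BYTES_PER_KBYTE <= capacity_bytes


# Akiyo figures reported above: key frame, motion vectors, prediction error (Kbytes).
akiyo_side_kb, akiyo_fraction = side_info_budget(6.0, 0.3, 2.5)
print(f"side info: {akiyo_side_kb:.1f} Kbytes ({akiyo_fraction:.0%} of coded bits)")
print("fits in one Version 40 QR code:", fits_in_single_qr(akiyo_side_kb))
```

Under the decimal convention (1 Kbyte = 1000 bytes) the payload is smaller still, so the conclusion does not depend on this assumption.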

Figure 5. (a) Key frame printed on paper, (b) original Akiyo sequence, and (c) reconstructed sequence using printed key frame and motion information from the QR code.
The second sequence is a 2-second Smile video, shown in Figure 6, of a baby smiling. It was encoded at 4 fps at 320×240 resolution. The encoded MPEG-4 stream contained 6.6 Kbytes of texture bits for the first frame, 1.8 Kbytes of motion bits, and 0.9 Kbytes of prediction error bits. If the key frame is encoded in the bitstream, about 9.4 Kbytes are required in total. When the first frame is not coded, 1.8 + 0.9 = 2.7 Kbytes of side information is sufficient to represent the remaining frames. Again, 2.7 Kbytes can be represented with a single large QR code or with several small QR codes.
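To illustrate the choice between one large QR code and several small ones, the sketch below splits a payload of the Smile side-information size into per-code chunks. It is an illustration under stated assumptions, not the scheme used in the experiments: the 2-byte chunk header, the 858-byte small-code capacity, the 1024-byte Kbyte, and all function names are hypothetical. Only the 2,953-byte Version 40 capacity comes from the text.

```python
import math

BYTES_PER_KBYTE = 1024               # assumption: binary Kbytes
QR_V40_BINARY_CAPACITY_BYTES = 2953  # Version 40 capacity cited in the text [8]
SMALL_CODE_CAPACITY_BYTES = 858      # illustrative smaller-code capacity (assumption)
HEADER_BYTES = 2                     # hypothetical header: chunk index, chunk count


def split_payload(payload, capacity_bytes, header_bytes=HEADER_BYTES):
    """Split payload into chunks that each fit in one code of the given capacity.

    Every chunk is prefixed with (index, count) so the chunks can be
    reassembled regardless of the order in which the codes are scanned.
    """
    room = capacity_bytes - header_bytes
    count = max(1, math.ceil(len(payload) / room))
    if count > 255:
        raise ValueError("payload needs more chunks than a 1-byte index allows")
    return [
        bytes([i, count]) + payload[i * room:(i + 1) * room]
        for i in range(count)
    ]


def join_payload(chunks):
    """Reassemble the payload from chunks scanned in any order."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(chunk[HEADER_BYTES:] for chunk in ordered)


# Smile side information reported above: 1.8 Kbytes motion + 0.9 Kbytes prediction error.
smile_bytes = math.ceil((1.8 + 0.9) * BYTES_PER_KBYTE)
payload = bytes(smile_bytes)  # placeholder zeros standing in for the real bitstream

single = split_payload(payload, QR_V40_BINARY_CAPACITY_BYTES)
several = split_payload(payload, SMALL_CODE_CAPACITY_BYTES)
print(len(single), "large code or", len(several), "small codes")
```

Smaller codes are easier to capture reliably with a low-resolution camera, at the cost of a little header overhead per code and more area on the printed page.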
The video sequences were printed on paper using our method and captured with a 2-megapixel digital camera, as shown in Figure 5.a, Figure 6.a, and Figure 6.b. The MPEG-4 decoder was modified to use the reference frame segmented from the captured image as the I-frame. The original video sequences for Akiyo and Smile are shown in Figure 5.b and Figure 6.c, respectively. Figure 5.c shows the frames of the Akiyo sequence reconstructed using our method. Figure 6.d and Figure 6.e show the Smile sequence reconstructed from two captured images taken under different lighting conditions. As can be seen, the Akiyo sequence has fine motion around the mouth and eyes, and motion artifacts are noticeable there. The reconstruction artifacts in the Smile clip, by contrast, are less apparent. These videos are also available for online viewing at [9].
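The role of the captured key frame in decoding can be illustrated with a heavily simplified motion-compensation sketch. This is not the modified MPEG-4 decoder: it assumes integer-pixel, per-block motion vectors and a spatial-domain residual, and omits the transform coding, sub-pixel interpolation, and bitstream parsing of a real decoder. All names are hypothetical.

```python
def reconstruct_frame(reference, motion_vectors, residual, block=2):
    """Predict a frame from a reference frame, then add the prediction error.

    reference, residual: 2-D lists of ints (rows of grayscale pixel values).
    motion_vectors[by][bx]: (dy, dx) displacement for the block at (by, bx).
    """
    height, width = len(reference), len(reference[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = motion_vectors[y // block][x // block]
            # Clamp source coordinates so vectors pointing outside the frame
            # reuse the nearest edge pixel.
            sy = min(max(y + dy, 0), height - 1)
            sx = min(max(x + dx, 0), width - 1)
            value = reference[sy][sx] + residual[y][x]
            out[y][x] = min(max(value, 0), 255)  # keep within the 8-bit range
    return out


# Stand-in for the key frame segmented from the captured image.
captured_key_frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
zero_residual = [[0] * 4 for _ in range(4)]
shift_left = [[(0, 1), (0, 1)], [(0, 1), (0, 1)]]  # every block reads one pixel to the right

next_frame = reconstruct_frame(captured_key_frame, shift_left, zero_residual)
print(next_frame[0])
```

Because the motion vectors and residual were computed against the original key frame, any color or geometric distortion in the captured frame propagates into every predicted frame, which is consistent with artifacts being most visible in regions of fine motion.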

Figure 6. (a) (b) Key frames printed on paper, (c) original Smile sequence, and (d) (e) reconstructed sequences using printed key frames and motion information from the QR codes.






