Computing a Multimedia Representation for Documents Given Time and Display Constraints

Berna Erol, Kathrin Berkner, and Jonathan J. Hull

Ricoh Innovations Inc., California Research Center
2282 Sand Hill Rd., Menlo Park, CA 94025-7054
berna_erol@rii.ricoh.com  hull@rii.ricoh.com  berkner@rii.ricoh.com

Siddharth Joshi

Department of Electrical Engineering, Stanford University
Packard 243, 350 Serra Mall, Stanford, CA
sidj@stanford.edu

Abstract

Multipage, high-resolution documents are difficult to view on devices with small displays. As a solution, we introduce the Multimedia Thumbnail, a representation that can be seen as a multimedia clip providing an automated guided tour through a document. A Multimedia Thumbnail is generated automatically from a document image. First, visual and audible information analysis is performed on the document to determine its salient elements. Next, time and information attributes are computed for each document element, taking into account the display and application constraints. An optimization routine then selects, subject to a time constraint, the elements to include in the Multimedia Thumbnail. Finally, the selected elements are synthesized into animated images and audio to create the final multimedia representation.
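
To make the selection step concrete, the following is a minimal sketch of how elements might be chosen under a time budget. It assumes, purely for illustration, that the step can be posed as a 0/1 knapsack problem: each element has an integer display time in seconds and a scalar information value, and the goal is to maximize total information without exceeding the budget. The abstract does not state the actual formulation, so the names `Element` and `select_elements`, the attribute names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical document element with precomputed attributes."""
    name: str
    time: int     # seconds needed to present the element (assumed integer)
    info: float   # information value of the element (assumed scalar)


def select_elements(elements, time_budget):
    """Pick a subset maximizing total info with total time <= time_budget.

    Standard 0/1 knapsack dynamic program; an assumed stand-in for the
    optimization routine, not the authors' stated method.
    """
    # best[t] = (total info, indices chosen) achievable within t seconds
    best = [(0.0, ())] * (time_budget + 1)
    for i, e in enumerate(elements):
        # Iterate budgets downward so each element is used at most once.
        for t in range(time_budget, e.time - 1, -1):
            prev_info, prev_chosen = best[t - e.time]
            if prev_info + e.info > best[t][0]:
                best[t] = (prev_info + e.info, prev_chosen + (i,))
    total_info, chosen = best[time_budget]
    return [elements[i] for i in chosen], total_info


# Illustrative values only; they are not taken from the paper.
candidates = [
    Element("title", time=2, info=5.0),
    Element("figure", time=4, info=6.0),
    Element("abstract", time=3, info=4.0),
    Element("body", time=6, info=7.0),
]
selected, total = select_elements(candidates, time_budget=7)
print([e.name for e in selected], total)
```

With a 7-second budget, this sketch selects the title and the figure, since no other combination that fits the budget carries more total information under the assumed values.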