Our method automatically creates personalized e-presentation documents without using any specialized notes-capture devices. A similar presentation access method is suggested in [4]. However, the authors do not present experimental results on automatically matching handouts to presentation recordings. This leaves the utility of the DCT-based matching method they suggest an open question, particularly for matching handouts that are printed in black and white. In contrast, we present experimental results showing that our method can match slides printed in black and white, grayscale, and color, which demonstrates that it is suitable for a practical implementation. Furthermore, we propose solutions for handout detection and slide segmentation, which are not addressed by the prior art.

In our earlier work [8], we presented a system that printed bar codes on handouts before a presentation took place. A slide image-matching algorithm mapped the bar codes onto the times when the slides were displayed. That method required a special printer that assigned a unique bar code to each slide and saved the original source file. The slide images captured during a recording were matched to the original presentation slides. In this paper, we eliminate the need for bar codes and a special printer. As a trade-off, we no longer have access to the original document (i.e., the PowerPoint file) for slide matching. We overcome this problem by segmenting the scanned slides and matching them directly to the recorded slides. This significantly improves the usefulness of the approach, since our new method can be employed with any e-presentation system that saves slide images.

3. SMART HANDOUT CREATION

figure 2

Our conference room is equipped with a PTZ camera, an omnidirectional audiovisual capture device with four-channel audio capture, a whiteboard capture system, and a presentation recorder. The Presentation Recorder (PR) captures the video output of a presenter's laptop as it is routed to a projector. The presenter's screen is captured once per second, and each captured image is time-stamped and saved if it is significantly different from the previously captured image. These images are synchronized to the captured video via the time stamps. Each PR image is processed with OCR and indexed with the extracted text. Moreover, for each slide, metadata such as slide duration and audio activity (which is based on changes in sound source direction) are computed.
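The capture rule described above (save a time-stamped screen image only when it differs significantly from the previously captured one) can be illustrated with a minimal sketch. The difference measure (mean absolute pixel difference), the threshold value, and the function names `mean_abs_diff` and `select_saved_frames` are assumptions made for illustration; the text does not specify how "significantly different" is measured.

```python
from typing import List, Sequence, Tuple

# A grayscale frame as rows of 0-255 intensities (assumed representation).
Frame = Sequence[Sequence[int]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two same-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def select_saved_frames(
    frames: List[Tuple[float, Frame]], threshold: float = 10.0
) -> List[Tuple[float, Frame]]:
    """Keep (timestamp, frame) pairs that differ from the previous capture.

    The first frame is always kept (an assumption). The threshold is a
    hypothetical value, not one taken from the paper.
    """
    saved = []
    previous = None
    for timestamp, frame in frames:
        if previous is None or mean_abs_diff(previous, frame) > threshold:
            saved.append((timestamp, frame))
        previous = frame  # compare against the previously captured image
    return saved


if __name__ == "__main__":
    slide_a = [[0, 0], [0, 0]]
    slide_b = [[255, 255], [0, 0]]
    captures = [(0.0, slide_a), (1.0, slide_a), (2.0, slide_b), (3.0, slide_b)]
    print([t for t, _ in select_saved_frames(captures)])
```

In this sketch the saved time stamps mark slide transitions, so the interval between consecutive saved images gives one possible estimate of slide duration.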

After a presentation, if a user would like to have an electronic version of her notes, she enters her e-mail address and scans the handouts on a regular scanner. A PDF file is generated from the scanned pages and passed to the Smart Handout Server, as shown in Figure 2.

The server converts the individual pages of the PDF document into JPEG images. Then, segmentation is applied to each page to detect possible slide regions. Commercially available presentation authoring software, such as PowerPoint™ and FrameMaker™, supports only a limited number of layouts for printing handouts. Motivated by this, we developed a template matching technique that detects whether a scanned document is a presentation handout or a regular document. If the scanned document is not a presentation handout, a PDF containing only the scanned document is e-mailed to the user. Otherwise, the presentation matching step retrieves the relevant presentation recording and matches the slide images in the scanned handout to the slides captured by the presentation recorder. Then, the scanned document is populated with e-media links and the resulting document is e-mailed to the user. The details of these processing steps are given in the following sections.
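To make the idea of layout-based handout detection concrete, the following is a minimal sketch of one way segmented regions could be compared against a small set of print layouts. It is not the template matching technique used in this paper. The template coordinates, the intersection-over-union score, the `min_score` threshold, and the names `LAYOUTS`, `layout_score`, and `detect_handout` are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (left, top, right, bottom), normalized to the page size in [0, 1].
Box = Tuple[float, float, float, float]

# Hypothetical layout templates; real coordinates would have to be measured
# from the handout layouts of the authoring software.
LAYOUTS: Dict[str, List[Box]] = {
    "two_per_page": [(0.2, 0.1, 0.8, 0.45), (0.2, 0.55, 0.8, 0.9)],
    "three_per_page": [
        (0.1, 0.08, 0.5, 0.30),
        (0.1, 0.39, 0.5, 0.61),
        (0.1, 0.70, 0.5, 0.92),
    ],
}


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def layout_score(regions: List[Box], template: List[Box]) -> float:
    """Average, over template slots, of the best IoU with any detected region."""
    if not regions:
        return 0.0
    return sum(max(iou(slot, r) for r in regions) for slot in template) / len(template)


def detect_handout(regions: List[Box], min_score: float = 0.7) -> Optional[str]:
    """Return the best-matching layout name, or None for a regular document."""
    best_name, best_score = None, 0.0
    for name, template in LAYOUTS.items():
        score = layout_score(regions, template)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None


if __name__ == "__main__":
    scanned = [(0.21, 0.11, 0.79, 0.45), (0.2, 0.56, 0.8, 0.9)]
    print(detect_handout(scanned))
    print(detect_handout([(0.05, 0.05, 0.95, 0.95)]))
```

Under these assumptions, a page whose regions fit no template is treated as a regular document and would be returned to the user unchanged.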

3.1 Segmentation

figure 3

First, a smoothing filter is applied to reduce the half-toning effects that may occur after printing and scanning, followed by binarization with global thresholding. Examples of two scanned documents, one presentation handout and one regular document, are given in Figure 3. Connected component analysis is applied to the document images to find the outermost components, as shown in Figure 3.b. Slide handouts may contain regions that do not belong to the slides, such as a user's handwritten notes, a presentation title, and page numbers. Erosion with a resolution-dependent structuring element disconnects slide regions from any overlapping handwriting segments. Then the features of each connected region, i.e., height, width, width-to-height ratio, and compactness, are analyzed to eliminate non-slide regions. Figure 3.c shows examples of segmented slide region candidates in the two input documents.
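The core of this step (global thresholding, connected component analysis, and feature-based filtering) can be sketched as follows. Smoothing and erosion are omitted for brevity. Every numeric threshold is a hypothetical value, and compactness is assumed here to mean the fraction of the bounding box covered by the component, since the text does not define it. The function names are illustrative.

```python
from collections import deque
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale rows of 0-255 intensities


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Global thresholding: 1 for dark (ink) pixels, 0 for background."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def connected_components(binary: Image) -> List[List[Tuple[int, int]]]:
    """Return the pixel list of each 4-connected foreground component."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = []
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def region_features(pixels: List[Tuple[int, int]]) -> Dict[str, float]:
    """Height, width, width-to-height ratio, and (assumed) compactness."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "top": min(ys), "left": min(xs), "height": height, "width": width,
        "aspect": width / height,
        "compactness": len(pixels) / (width * height),  # assumed definition
    }


def find_slide_candidates(gray: Image, min_size: int = 4,
                          aspect_range: Tuple[float, float] = (1.0, 2.0),
                          min_compactness: float = 0.8) -> List[Dict[str, float]]:
    """Keep regions whose features are plausible for a slide (hypothetical limits)."""
    candidates = []
    for pixels in connected_components(binarize(gray)):
        f = region_features(pixels)
        if (f["height"] >= min_size and f["width"] >= min_size
                and aspect_range[0] <= f["aspect"] <= aspect_range[1]
                and f["compactness"] >= min_compactness):
            candidates.append(f)
    return candidates
```

In practice the size limits would scale with the scan resolution, in the same way as the structuring element used for erosion.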