The Reader's Helper: A Personalized Document Reading Environment
Abstract
Over the last two centuries, reading styles have shifted away from the reading of documents from beginning to end and toward the skimming of documents in search of relevant information. This trend continues today, as readers, often confronted with an insurmountable amount of text, seek more efficient methods of extracting relevant information from documents. This paper introduces a new document reading environment called the Reader's Helper™, which supports the reading of electronic and paper documents. The Reader's Helper analyzes documents and produces a relevance score for each of the reader's topics of interest, thereby helping the reader decide whether the document is actually worth skimming or reading. Moreover, during the analysis process, topic-of-interest phrases are automatically annotated to help the reader quickly locate relevant information. A new information visualization tool, called the Thumbar™, is used in conjunction with relevancy scoring and automatic annotation to portray a continuous, dynamic thumbnail representation of the document. This further supports rapid navigation of the text.
Keywords
document annotation, information visualization, content recognition, intelligent agents, digital libraries, probabilistic reasoning, user interface design, reading online
INTRODUCTION
Around 1750 AD there was a dramatic change in the way people read documents [8]. Before this time, readers consumed documents intensively, reading each document from start to finish, sometimes several times or even out loud to a group. By the early 1800s, however, readers tended to read extensively, reading documents only once or skimming them in search of relevant information to determine whether a document was worth reading in its entirety. Today, with the advent of the World Wide Web (WWW) and the growing collection of electronic documents, this style is likely to continue: there are simply too many potentially useful documents and not enough time to read them all [14]. Office workers, in particular, are forced to optimize their daily reading by sifting through vast amounts of information, striking a balance between in-depth understanding and expediency. Reading intensively versus extensively can be thought of as vertical versus horizontal reading [6]. That is, in the past, readers read the document from beginning to end (vertical); now, they scan and browse the text (horizontal).
Few applications available today fully support the reading process. There are, however, several applications which condense or locate documents for the user. Applications such as [13, 15] provide a synopsis of the text which can sometimes be used to determine the document's relevance. Other systems search for and retrieve documents relevant to an evolving user profile [3, 16]. Learning the user profile over time enables the system to improve the quality of the documents it retrieves for the user. Another system supports users as they search digital libraries by showing query keywords in the context of the sentences in which they appear in a document [17]. Thus, users can quickly access the database based on the presence or absence of a particular context they are seeking. Another application inserts supplemental information in the form of an annotation into a news story if the story contains key phrases from the subject database [9]. This offers the reader additional information not necessarily provided by the author of the original text. The work by [7] supports the skimming of documents by representing the topics of a text as content capsules. Using a special visualization tool portraying the document as a thumbnail image, topics are presented to the reader at the location in the text where they occur. This allows the reader to quickly view the highlights of the document in the context of the surrounding text structure.
Despite the growing number of applications used to locate and evaluate documents, there are few, if any, applications that focus on the actual text reading process. I believe that readers require a personalized environment that supports the skimming of documents and the extraction of information. I have created a new document reading environment, called the Reader's Helper, to act as both the reader's document browser and personal agent, advising the reader of relevant documents and of the relevant text within each document. The Reader's Helper is not a search engine; it does not search for or deliver documents to the user. Instead, it helps readers help themselves to read more productively by evaluating the documents the reader views and by providing visual tools that show the locations of the relevant portions of the text. In the following sections the Reader's Helper system is described, both in terms of the user interface and the underlying content recognition subsystem. Future issues and potential research directions are also discussed.






