A Document Browsing Tool Based on Book Indexes
Is part ofProceedings of Computational Linguistics in the North East ; Montréal (Québec), 2004-08-30, pp. 45-52.
This research project is a contribution to the global field of information retrieval, specifically, to develop tools to enable information access in digital documents. We recognize the need to provide the user with flexible access to the contents of large, potentially complex digital documents, with means other than a search function or a handful of metadata elements. The goal is to produce a text browsing tool offering a maximum of information based on a fairly superficial linguistic analysis. We are concerned with a type of extensive single-document indexing, and not indexing by a set of keywords (see Klement, 2002, for a clear distinction between the two). The desired browsing tool would not only give at a glance the main topics discussed in the document, but would also present relationships between these topics. It would also give direct access to the text (via hypertext links to specific passages). The present paper, after reviewing previous research on this and similar topics, discusses the methodology and the main characteristics of a prototype we have devised. Experimental results are presented, as well as an analysis of remaining hurdles and potential applications.