"Semantic-enhanced search: Finding meaning in large-scale scanned text collections"

This talk presents Capisco, a system for semantic-enhanced search in a digital library of full texts. Document search in digital libraries typically uses purely lexical analysis, which cannot address the inherent ambiguity of natural language. A semantic search approach offers the potential to overcome the shortcomings of lexical search, but even if an appropriate network of ontologies could be decided upon, it would require a full semantic markup of each document. Capisco instead analyzes documents by the semantics and context of their content. The disambiguation of search queries is done interactively, to fully utilize the domain knowledge of the scholar. Our method achieves a form of semantic-enhanced search that simultaneously exploits the proven scale benefits provided by lexical indexing.

For established systems, completely replacing the document retrieval mechanism, or even making significant changes to it, would require major technological effort and would most likely be disruptive. We explored ways to use the results of semantic analysis and disambiguation while retaining an existing keyword-based search and lexicographic index. We engineered this so that the output of semantic analysis (performed off-line) is suitable for direct import into existing digital library metadata and index structures, and can thus be incorporated without the need for architecture modifications.

Annika Hinze is a senior lecturer in the Department of Computer Science at the University of Waikato, New Zealand. She is the head of the Databases and Information Systems (ISDB) group at Waikato. Her research interests include complex event processing, location-based systems and semantic annotation in digital libraries. She is principal investigator on Capisco, exploring semantic-enhanced search in the HathiTrust Digital Library. Annika graduated with a Master's in Mathematics from TU Berlin and undertook her PhD in Computer Science at Freie Universitaet Berlin.

14.10.2016 | 14:00 c.t.

Takustraße 9, Room 046