Recently, there has been extensive research that generated a wealth
of new XML full-text query languages, ranging from simple Boolean search to
combining sophisticated proximity and order predicates on keywords. While
computing least common ancestors of query terms was proposed for efficient
evaluation of conjunctive keyword queries by exploiting the document structure,
no such solution was developed to evaluate complex full-text queries. We
present efficient evaluation algorithms based on a formalization of XML queries
in terms of keyword patterns and an algebra which manipulates pattern matches.
Our algebra captures most existing languages and their varying semantics and
our algorithms combine relational query evaluation techniques with the
exploitation of document structure to process queries with complex full-text
predicates. We show how scoring can be incorporated into our framework without
compromising the algorithms complexity. Our experiments show that considering
element nesting dramatically improves the performance of queries with complex
full-text predicates.
Pre-2018 CSE ID: CS2005-0845