eScholarship
Open Access Publications from the University of California

Using Bitmap Indexing Technology for Combined Numerical and Text Queries

Abstract

In this paper, we describe a strategy for using compressed bitmap indices to speed up queries on both numerical data and text documents. With an efficient compression algorithm, these bitmap indices remain compact even when they cover millions of distinct terms. Moreover, bitmap indices can answer Boolean queries over text documents involving multiple query terms very efficiently. Existing inverted indices for text searches are usually inefficient for corpora with a very large number of terms, as well as for queries that return a large number of hits. We demonstrate that our compressed bitmap index technology overcomes both of these shortcomings. In a performance comparison against a commonly used database system, our indices answer queries 30 times faster on average. To provide full SQL support, we integrated our indexing software, called FastBit, with MonetDB. The integrated system, MonetDB/FastBit, not only provides efficient searches on a single table, as FastBit does, but also answers join queries efficiently. Furthermore, MonetDB/FastBit provides a very efficient mechanism for retrieving result records.
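
As a rough illustration of how a bitmap index can answer a Boolean query over several terms, the following minimal sketch keeps one bitmap per term and combines them with bitwise operations. It is not FastBit's API and it applies no compression: plain Python integers stand in for the bitmaps, and the names `build_bitmap_index` and `matching_ids` and the three sample documents are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def build_bitmap_index(docs):
    """Map each term to an integer whose bit i is set if docs[i] contains the term."""
    index = defaultdict(int)
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index[term] |= 1 << doc_id
    return index


def matching_ids(bitmap):
    """Return the ids of the documents whose bits are set in the bitmap."""
    ids = []
    doc_id = 0
    while bitmap:
        if bitmap & 1:
            ids.append(doc_id)
        bitmap >>= 1
        doc_id += 1
    return ids


# Hypothetical three-document corpus (illustrative data only).
docs = [
    "compressed bitmap index",
    "inverted index for text",
    "bitmap index for text queries",
]
index = build_bitmap_index(docs)

# Boolean AND and OR become single bitwise operations on the term bitmaps.
both = index["bitmap"] & index["text"]
either = index["bitmap"] | index["text"]

print(matching_ids(both))    # documents containing both terms
print(matching_ids(either))  # documents containing at least one term
```

In this sketch a query with many terms costs one bitwise operation per term, regardless of how many documents match. A production index would store the bitmaps in compressed form rather than as raw integers.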
