Skip to main content
eScholarship
Open Access Publications from the University of California

Optimizing Candidate Check Costs for Bitmap Indices

Abstract

In this paper, we propose a new strategy for optimizing the placement of bin boundaries to minimize the cost of query evaluation using bitmap indices with binning. For attributes with a large number of distinct values, often the most efficient index scheme is a bitmap index with binning. However, this type of index may not be able to fully resolve some user queries. To fully resolve these queries, one has to access parts of the original data to check whether certain candidate records actually satisfy the specified conditions. We call this procedure the candidate check, which usually dominates the total query processing time. Given a set of user queries, we seek to minimize the total time required to answer the queries by optimally placing the bin boundaries. We show that our dynamic programming based algorithm can efficiently determine the bin boundaries. We verify our analysis with some real user queries from the Sloan Digital Sky Survey. For queries that require significant amount of time to perform candidate check, using our optimal bin boundaries reduces the candidate check time by a factor of 2 and the total query processing time by 40 percent.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View