eScholarship
Open Access Publications from the University of California

AGM, a dataflow database machine

Abstract

In recent years, a number of database machines consisting of large numbers of parallel processing elements have been proposed. Unfortunately, one of the main limitations to parallelism in database processing is the I/O bandwidth of the underlying storage devices. One way to solve this problem is to use multiple parallel disk units. The main problem with this approach, however, is the lack of a computational model capable of utilizing the potential of any significant number of such devices.

This paper presents a database model based on the principles of data-driven computation. In this model, the database is represented as a network in which each node is conceptually an independent processing element, capable of communicating with other nodes by exchanging messages along the network arcs. To answer a query, one or more such messages, called tokens, are created and injected into the network. The tokens then propagate asynchronously through the network in search of results satisfying the given query.
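The token mechanism can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Node`, `Token`, and `run_query`, the "value greater than threshold" query, and the sample network are all hypothetical. A single queue stands in for asynchronous message delivery, and a shared `visited` set is a centralized shortcut that truly independent processing elements would not have.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """One database node; conceptually an independent processing element."""
    name: str
    value: int
    arcs: list = field(default_factory=list)  # names of neighbouring nodes


@dataclass(frozen=True)
class Token:
    """A message addressed to one node, carrying the query it serves."""
    target: str
    threshold: int  # hypothetical query: "value greater than threshold"


def run_query(network, start, threshold):
    """Inject one token at `start` and let tokens spread along the arcs."""
    pending = deque([Token(start, threshold)])  # stand-in for messages in flight
    visited = set()  # simplification: shared state to stop repeated visits
    results = []
    while pending:
        token = pending.popleft()
        if token.target in visited:
            continue
        visited.add(token.target)
        node = network[token.target]
        # The node checks the query locally against its own data.
        if node.value > token.threshold:
            results.append(node.name)
        # The node forwards a copy of the token along each outgoing arc.
        for neighbour in node.arcs:
            pending.append(Token(neighbour, token.threshold))
    return sorted(results)


# Hypothetical four-node network used only for illustration.
network = {
    "a": Node("a", 5, ["b", "c"]),
    "b": Node("b", 12, ["d"]),
    "c": Node("c", 20, ["d"]),
    "d": Node("d", 7, []),
}

print(run_query(network, "a", 6))
```

Each node touches only its own data and its own arcs, which is the property that would let separate processing elements and disk units work on different parts of the network at the same time.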

To investigate the performance of the proposed system, we have implemented the model on a simulated computer architecture. The results of the simulation experiments indicate that the model is capable of exploiting the potential I/O bandwidth of a large number of disk units as well as the computational power of the associated processing elements.
