Query-based Graph Data Reduction
Graphs are widely used nowadays to store complex data of large applications such as social networks, recommendation engines, computer networks, bio-informatics, just to name a few. With the rapid growth of the Internet, designing scalable systems to process huge amount of graph data efficiently has become a challenge. In order to store and process such data efficiently, as well as to save the time for transferring them, graph compression techniques are often used. Most of the existing graph data compression approaches are syntactic, focusing on graph structures and data reduction based on serialization or redundancy removal. While they can be applied uniformly by all applications, further reduction of graph data can be achieved by looking into the semantics of an application.
In this thesis we propose a query-based approach to graph data reduction which preserves only the information relevant to the class of queries needed for an application. We study several classical graph problems and their applications, and design graph reduction algorithms to generate a reduced graph from which we can still compute the same solutions as those from the original graph. We also introduce the concept of “lossy” graph reduction for applications that may tolerate small but bounded errors in exchange for further data reduction. In addition, we design a synthesis algorithm that can combine existing graph reduction algorithms to generate a reduced graph for complex graph problems which include more than one constraint. With such we do not need to design a separate graph reduction algorithm for every multiple-constraint problem which can be expensive as the number of constraint combinations increases exponentially.