Concolic Testing of Programs with Concurrent Dynamic Data Structures
Skip to main content
eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations bannerUC Riverside

Concolic Testing of Programs with Concurrent Dynamic Data Structures

Creative Commons 'BY' version 4.0 license
Abstract

Concolic execution combines concrete execution with symbolic execution to automatically generate test inputs that exercise different program paths and deliver high code coverage. However, when this technique is extended to multithreaded programs with concurrent dynamic data structures, the lack of support for exploring data shapes (the skeleton that consists of symbolic pointers in node-based linked data structures) and efficiently exploring thread interleavings makes it hard to expose concurrency bugs that manifest only when certain dynamic data structure shapes, program paths, and thread interleavings are exercised. This thesis presents techniques to effectively explore data shapes for concurrent dynamic data structures and optimize the exploration efficiency.

The approach presented first generates a data shape for a chosen path. By capturing path constraints, we form a shape that satisfies path constraints and exercises the chosen path. In addition, by capturing pointer-pointee relationships, we find how to adjust the shape and find new shapes that also exercise the same path. Finally, using the shapes for individual paths, we find a consistent shape that causes multiple threads to simultaneously follow their chosen paths for exposing concurrency bugs. We generate shapes for each thread separately and provide an integration algorithm to merge the shapes into a consistent shape. This approach does not require the user to write code that constructs data structures to exercise desired paths by individual threads; rather, it automatically collects constraints to generate consistent shapes during the concolic execution. We also present a summarization-based technique to improve the efficiency of concolic testing further. Via unit testing of key functions that implement a concurrent data structure, function summaries are derived that capture data structure shapes that cause various function paths to be exercised. During the concolic testing of interprocedural paths, these summaries are exploited to eliminate the repeated overhead of handling symbolic pointers and creating dynamic objects by reusing function summaries. The summary also contains symbolic memory accesses and synchronization events that guide application-level concolic testing to identify and confirm potential data races.

To demonstrate our approach's effectiveness and efficiency, we developed two prototypes for evaluation: DSGEN, built on top of a popular GPU concolic executor GKLEE; and SSRD, on top of Cloud9 for multithreaded programs. DSGEN improves the number of races detected from 10 to 25 by automatically generating 1,897 shapes in experiments with concurrent operations on several widely used dynamic data structures - B-Tree. HAMT, RRB-Tree and Skip List. SSRD, with both data shape generation and function summaries, outperforms AFL++ based on fuzzing and Cloud9 in both effectiveness and efficiency in experiments that tested pthread-based implementations of several concurrent data structures, including Unrolled Linked List, AVL-Tree, Skip List, and Priority Queue. SSRD improves the number of races detected to 74 as opposed to only 34 by Cloud9 and 11 by AFL++ using the same amount of computation resources.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View