Automated Black Box Generation of Structured Inputs for Use in Software Testing
A common problem in automated software testing is the need to generate many inputs with complex structure in a black-box fashion. For example, a library for manipulating red-black trees may require that inputs are themselves valid red-black trees, meaning anything invalid is not suitable for testing. As another example, in order to test code generation in a compiler, it is necessary to use input programs which are both syntactically valid and well-typed. Despite the importance of this problem, we observe that existing solutions are few in number and have severe drawbacks, including unreasonably slow performance and a lack of generality to testing different systems.
This thesis presents a solution to this problem of black-box structured input generation. I observe that test inputs can be described as solutions to systems of logical constraints, and that more expressive constraints can lead to more complex tests. In order to test effectively and generate many tests, we need high-performance constraint solvers capable of finding many solutions to these constraints. I observe that constraint logic programming (CLP) offers an expressive constraint language paired with a high-performance constraint solver, and thus serves as a potential solution to this problem. Via a series of case studies, I have found that CLP (1) is applicable to testing a wide variety of systems; (2) can scale to more complex constraints than ever previously described; and (3) is often orders of magnitude faster than competing solutions. These case studies have also exposed dozens of bugs in high-profile software, including the Rust compiler and the Z3 SMT solver.