Reusable Method Summaries for Improving Performance of Dynamic Dependence Analysis
- Author(s): Palepu, Vijay Krishna
- Advisor(s): Jones, James A
- et al.
Software engineers construct modern-day software applications by building on existing standard and third-party software libraries and components that they necessarily do not author themselves. Consequently, to dynamically analyze a contemporary software application, all transitively dependent external libraries must also be monitored and analyzed, at each layer of the software application’s call stack. However, dynamically analyzing large and often numerous external libraries may prove prohibitively expensive. To address the expenses associated with analyzing dynamically observable behavior of software components, this work presents an approach to summarize and reuse the results of dynamic analysis. This work, specifically focuses on dynamic analysis that computes data- and control-dependencies between executing instructions, and the memory locations they access, by monitoring software-runs at a fine-grained instruction-level. As such, the approach in this work models and summarizes software component behavior as data- and control-dependencies between inputs and outputs of invocable methods within software components. Such behavior models for invocable methods are called method dependence summaries. Method dependence summaries may be created with static analysis, as has been done and studied by prior work. However, a novel contribution of this work is the creation and investigation of method summaries by runtime monitoring of executing instructions in software runs. Such dynamically inferred method summaries are called “dynamic dependence summaries.” Additionally, the approach to reuse a dependence summary — created statically or dynamically — for the purposes of dynamic dependence analysis is, in itself, an important contribution of this work. Software components, equipped with reusable dependence summaries, afford the omission of their exhaustive runtime monitoring and analysis during their future runs. Nonetheless, the reuse of dependence summaries would necessitate the abstraction of any concrete runtime information enclosed within the summary; thus potentially causing a loss in the information modeled by the dependence summary. As a result, benefits to the efficiency of dynamic analyses that use summarization may be afforded with losses of accuracy. This work investigates the resulting trade-offs between accuracy losses and performance gains with 83 system-wide executions across eight real-world software systems. The empirical results show that dynamic dependence summaries exhibit median precision scores of 1.0 when modeling individual method invocations, across multiple executions of the eight software systems — indicating acceptable levels of accuracy. Simultaneously, statically created dependence summaries exhibit median precision scores ranging from 0.5 to 1.0, when modeling method invocations. The results also show notable degrees of cost savings while using summarized analysis, with median trace size reductions of 61% across the 83 executions, when compared to traditional exhaustive dynamic dependence analysis.