Enhancing Binary Code Security Through Pseudocode Diffing and Pointer Analysis
Abstract
Binary analysis remains indispensable in cybersecurity. It enables the early detection of vulnerabilities and malicious code, and so contributes to the integrity and resilience of software systems. Its effectiveness, however, is often challenged by the complexity of modern software, which stems from factors such as the extensive use of optimization techniques and the lack of debug information. These factors make it harder to accurately understand the software’s behavior and to assess its potential security issues.
This dissertation investigates two techniques in binary analysis: pseudocode diffing and binary pointer analysis. The first project proposes a pseudocode diffing technique that accurately and efficiently characterizes the code changes between two binary programs at the fine-grained level of individual pseudocode tokens. The technique applies to a range of security scenarios, including code plagiarism detection, lineage analysis, and vulnerability and patch analysis.
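To make "token-level" concrete, the following is a minimal sketch and not the technique proposed in the dissertation. It assumes the pseudocode has already been produced by a decompiler, uses a hypothetical regex tokenizer, and swaps in Python's standard `difflib.SequenceMatcher` as a generic sequence-alignment stand-in. The function names and the sample snippets are illustrative assumptions.

```python
import difflib
import re

# Hypothetical tokenizer (an assumption, not from the dissertation):
# identifiers, integer literals, two-character comparison operators,
# and any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[<>=!]=|\S")


def tokenize(pseudocode):
    """Split a pseudocode string into a flat list of tokens."""
    return TOKEN_RE.findall(pseudocode)


def token_diff(old, new):
    """Return the non-matching regions between two pseudocode snippets.

    Each entry is (tag, old_tokens, new_tokens), where tag is one of
    'replace', 'delete', or 'insert' as reported by difflib.
    """
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Illustrative input: a bounds check that changes between two versions.
old_code = "if (len < 64) memcpy(buf, src, len);"
new_code = "if (len <= 63) memcpy(buf, src, len);"
changes = token_diff(old_code, new_code)
print(changes)
```

A line-level diff would flag the whole statement as changed, whereas the token-level view isolates the comparison operator and the constant, which is the kind of granularity that matters when inspecting a patch.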
The second project focuses on pointer analysis, a critical component of binary code reverse engineering. We propose a program analysis approach that jointly recovers points-to relations and data structures in binary code, so that each recovery improves the other. The technique can be used to reconstruct a complete call graph of a binary, including its indirect call edges, which is the foundation of many downstream analyses such as binary code diffing and bug detection.
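The link between points-to relations and indirect call edges can be illustrated with a small sketch. It is not the joint analysis proposed in the dissertation: it swaps in a textbook flow-insensitive, inclusion-based (Andersen-style) fixpoint, omits data structure recovery entirely, and runs on a hypothetical four-statement intermediate form. All variable names, function names, and statement kinds here are assumptions made for illustration.

```python
from collections import defaultdict


def solve_points_to(statements):
    """Iterate inclusion constraints to a fixpoint.

    Each statement is (kind, dst, src) with kind in:
      'addr'  : dst = &src
      'copy'  : dst = src
      'load'  : dst = *src
      'store' : *dst = src
    """
    pts = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, dst, src in statements:
            if kind == "addr":
                new, targets = {src}, [dst]
            elif kind == "copy":
                new, targets = set(pts[src]), [dst]
            elif kind == "load":
                pointees = list(pts[src])
                new = set().union(*(pts[o] for o in pointees))
                targets = [dst]
            elif kind == "store":
                new, targets = set(pts[src]), list(pts[dst])
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for target in targets:
                if not new <= pts[target]:
                    pts[target] |= new
                    changed = True
    return pts


def indirect_call_edges(call_sites, pts, functions):
    """Map each (caller, pointer) call site to the functions it may reach."""
    return {
        (caller, callee)
        for caller, pointer in call_sites
        for callee in pts[pointer]
        if callee in functions
    }


# Hypothetical program: two handlers are stored into one table slot,
# then the slot is loaded and called indirectly from main.
functions = {"sub_401000", "sub_401080"}
statements = [
    ("load", "callee", "table"),       # callee = *table
    ("addr", "table", "slot"),         # table = &slot
    ("addr", "fp_a", "sub_401000"),    # fp_a = &sub_401000
    ("addr", "fp_b", "sub_401080"),    # fp_b = &sub_401080
    ("store", "table", "fp_a"),        # *table = fp_a
    ("store", "table", "fp_b"),        # *table = fp_b
]
pts = solve_points_to(statements)
edges = indirect_call_edges([("main", "callee")], pts, functions)
print(sorted(edges))
```

Because the sketch is flow-insensitive, statement order does not affect the result, and both handlers are reported as possible targets of the indirect call. Knowing the layout of the table, which this sketch does not model, is what would let an analysis keep separate slots apart instead of merging them.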