Bugs and security issues are primary concerns for software developers, and existing research has long focused on addressing them. However, as software engineering evolves, software systems grow increasingly complex and more susceptible to bugs. The rise of third-party services, cross-vendor libraries, and collaborative development makes it difficult for any single developer to understand the entire codebase. Under the pressure of agile development timelines, developers often work with incomplete knowledge, which creates blind spots in software development. Because of these blind spots, developers may be unaware of constraints or security implications imposed by other components or authors, which causes serious issues in access control, memory management, I/O operations, and business logic.
This dissertation investigates two aspects of these challenges. The first aspect is cross-authorship blind spots. This part of the study identifies a specific bug-prone pattern, namely cross-authorship unused definitions. To detect such definitions, we introduce syntactic and semantic patterns that identify them while filtering out false positives. Additionally, to accommodate the time pressure developers face, we use a code familiarity model to prioritize bug validation. Our implementation, named ValueCheck, has been evaluated on large-scale systems including Linux, MySQL, OpenSSL, and NFS-ganesha, where it detected 210 previously unknown bugs, 154 of which have been confirmed. Compared with state-of-the-art tools such as Infer and Coverity, ValueCheck is more effective and has a lower false positive rate.
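As an illustration of the pattern, the following minimal sketch flags a definition whose value is overwritten before it is ever read, when the overwriting statement was written by a different author. It is not ValueCheck's implementation: the `Stmt` record, the straight-line statement list, and the reading of "cross-authorship" as "definition and overwrite have different authors" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stmt:
    """Hypothetical record for one statement in straight-line code."""
    line: int
    author: str                      # e.g. taken from version-control blame
    defines: Optional[str] = None    # variable assigned by this statement
    uses: Tuple[str, ...] = ()       # variables read by this statement


def cross_author_unused_defs(stmts: List[Stmt]) -> List[Tuple[int, int]]:
    """Return (definition_line, overwrite_line) pairs where a value is
    overwritten before being read and the two authors differ."""
    pending = {}   # variable -> defining Stmt whose value is still unread
    findings = []
    for s in stmts:
        # Reads happen before the write within a single statement.
        for var in s.uses:
            pending.pop(var, None)
        if s.defines is not None:
            prev = pending.get(s.defines)
            if prev is not None and prev.author != s.author:
                findings.append((prev.line, s.line))
            pending[s.defines] = s
    return findings


# Assumed scenario: one author stores a permission-check result in `ret`,
# and another author later reuses `ret` without reading it first.
example = [
    Stmt(1, "alice", defines="ret", uses=("fd",)),   # ret = check_perm(fd)
    Stmt(2, "bob", defines="ret", uses=("buf",)),    # ret = read_data(buf)
    Stmt(3, "bob", uses=("ret",)),                   # return ret
]
findings = cross_author_unused_defs(example)
print(findings)
```

In this toy scenario the permission-check result on line 1 is silently discarded by the assignment on line 2. A real analysis would also need control-flow and alias information, which this sketch omits.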
The second aspect is cross-component blind spots, studied in web applications with a client-server architecture, where client-side code is exposed to users. Relying solely on client-side security checks for authorization, identity verification, and user input validation is insufficient, because users can manipulate that code. To find such weaknesses, we propose a novel technique that enhances existing methods by altering client-side code to assess server-side security. This approach improves testing efficiency and detects complex vulnerabilities related to business logic, token-based defenses, and data preprocessing. Our testing tool, FenceHooper, identified 48 vulnerabilities in the top 300 websites from the Tranco dataset, including critical access control flaws affecting over 20 million user accounts.
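The idea of altering client-side code to probe the server can be sketched with a toy model. This is not FenceHooper's implementation: the handler functions, the `enforce_client_check` switch, and the request fields are all hypothetical, and the `role` field stands in for whatever the server would derive from the authenticated session.

```python
def vulnerable_server(request):
    """Hypothetical handler that trusts the client and checks nothing."""
    return {"status": 200, "body": "deleted user " + request["target"]}


def hardened_server(request):
    """Hypothetical handler that repeats the authorization check itself."""
    if request.get("role") != "admin":
        return {"status": 403, "body": "forbidden"}
    return {"status": 200, "body": "deleted user " + request["target"]}


def client_delete_user(server, role, target, enforce_client_check=True):
    """Toy client. The check below models client-side code that a user
    can remove, since the code runs on a machine the user controls."""
    if enforce_client_check and role != "admin":
        return None  # the unmodified client never sends the request
    return server({"role": role, "target": target})


def server_relies_on_client_check(server):
    """Report True when the client-side check is the only defense:
    the unmodified client blocks the action, but the same request
    succeeds once the check is stripped from the client."""
    blocked = client_delete_user(server, "guest", "42") is None
    response = client_delete_user(
        server, "guest", "42", enforce_client_check=False
    )
    return blocked and response["status"] == 200


print(server_relies_on_client_check(vulnerable_server))
print(server_relies_on_client_check(hardened_server))
```

The design point is that the test reuses the client's own request-building logic and removes only the check, so the request that reaches the server is otherwise well-formed.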