Scholarly Works (43 results)

Article
Peer Reviewed

Cryptanalysis of an algebraic privacy homomorphism

Wagner, David

UC Berkeley Previously Published Works (2003)

We use linear algebra to show that an algebraic privacy homomorphism proposed by Domingo-Ferrer is insecure for some parameter settings.

Thesis
Peer Reviewed

Large-Scale Analysis of Modern Code Review Practices and Software Security in Open Source Software

Thompson, Christopher
Advisor(s): Wagner, David

UC Berkeley Electronic Theses and Dissertations (2017)

Modern code review is a lightweight and informal process for integrating changes into a software project, popularized by GitHub and pull requests. However, having a rich empirical understanding of modern code review and its effects on software quality and security can help development teams make intelligent, informed decisions, analyzing the costs and the benefits of implementing code review for their projects, and provide insight on how to support and improve its use.

This dissertation presents the results of our analyses on the relationships between modern code review practice and software quality and security, across a large population of open source software projects. First, we describe our neural network-based quantification model which allows us to efficiently estimate the number of security bugs reported to a software project. Our model builds on prior quantification-optimized models with a novel regularization technique we call random proportion batching. We use our quantification model to perform association analysis of very large samples of code review data, confirming and generalizing prior work on the relationship between code review and software security and quality. We then leverage timeseries changepoint detection techniques to mine for repositories that have implemented code review in the middle of their development. We use this dataset to explore the causal treatment effect of implementing code review on software quality and security. We find that implementing code review may significantly reduce security issues for projects that are already prone to them, but may significantly increase overall issues filed against projects. Finally, we expand our changepoint detection to find and analyze the effect of using automated code review services, finding that their use may significantly decrease issues reported to a project. These findings give evidence for modern code review being an effective tool for improving software quality and security. They also suggest that the development of better tools supporting code review, particularly for software security, could magnify this benefit while decreasing the cost of integrating code review into a team's development process.

Thesis
Peer Reviewed

Secure Virtualization with Formal Methods

Sturton, Cynthia
Advisor(s): Wagner, David

UC Berkeley Electronic Theses and Dissertations (2013)

Virtualization software is increasingly a part of the infrastructure behind our online activities. Companies and institutions that produce online content are taking advantage of the "infrastructure as a service" cloud computing model to obtain cheap and reliable computing power. Cloud providers are able to provide this service by letting multiple client operating systems share a single physical machine, and they use virtualization technology to do that. The virtualization layer also provides isolation between guests, protecting each from unwanted access by the co-tenants. Beyond cloud computing, virtualization software has a variety of security-critical applications, including intrusion detection systems, malware analysis, and providing a secure execution environment in end-users' personal machines.

In this work, we investigate the verification of isolation properties for virtualization software. Large data structures, such as page tables and caches, are often used to keep track of emulated state and are central to providing correct isolation. We identify these large data structures as one of the biggest challenges in applying traditional formal methods to the verification of isolation properties in virtualization software.

We present a new semi-automatic procedure, S2W, to tackle this challenge. Our approach uses a combination of abstraction and bounded model checking and allows for the verification of safety properties of large or unbounded arrays. The key new ideas are a set of heuristics for creating an abstract model and computing a bound on the reachability diameter of its state space. We evaluate this methodology using six case studies, including verification of the address translation logic in the Bochs x86 emulator, and verification of security properties of several hypervisor models. In all of our case studies, we show that our heuristics are effective: we are able to prove the safety property of interest in a reasonable amount of time (the longest verification takes 70 minutes to complete), and our abstraction-based model checking returns no spurious counter-examples.

One weakness of using model checking is that the verification result is only as good as the model; if the model does not accurately represent the system under consideration, properties proven true of the model may or may not be true of the system. We present a theoretical framework for describing how to validate a model against the corresponding source code, and an implementation of the framework using symbolic execution and satisfiability modulo theories (SMT) solving. We evaluate our procedure on a number of case studies, including the Bochs address translation logic, a component of the Berkeley Packet Filter, the TCAS suite, the FTP server from GNU Inetutils, and a component of the XMHF hypervisor. Our results show that even for small, well understood code bases, a hand-written model is likely to have errors. For example, in the model for the Bochs address translation logic - a small model of only 300 lines of code that was vigorously used and tested as part of our work on S2W - our model validation engine found seven errors, none of which affected the results of the earlier effort.

Article
Peer Reviewed

Risks of e-voting

UC Davis Previously Published Works (2007)

Thesis
Peer Reviewed

Generative Models as a Robust Alternative for Image Classification: Progress and Challenge

Ju, An
Advisor(s): Wagner, David

UC Berkeley Electronic Theses and Dissertations (2021)

The tremendous success of neural networks is clouded by the existence of adversarial examples: maliciously engineered inputs can cause neural networks to perform abnormally, causing security and trustworthiness concerns. This thesis will present some progress on an alternative approach for robustly classifying images. Generative classifiers use generative models for image classification, showing better robustness than discriminative classifiers. However, generative classifiers face some unique challenges when images are complex. This thesis will present an analysis of these challenges and remedies.

Generative classifiers suffer from out-of-domain reconstructions: overpowered generators can generate images out of the training distribution. This thesis demonstrates a method to address out-of-domain reconstructions in generative classifiers. Combined with other extensions, our method has successfully extended generative classifiers from robustly recognizing simple digits to classifying structured colored images. Besides, this thesis conducts a systematic analysis of out-of-domain reconstructions on CIFAR10 and ImageNet and presents a method to address this problem on these realistic images.

Another challenge of generative classifiers is measuring image similarity. This thesis will demonstrate that similarity measurements are critical for complex images such as CIFAR10 and ImageNet. It also presents a metric that has significantly improved generative classifiers on complex images.

The challenges of generative classifiers come from modeling image distributions. Discriminative models do not have such challenges because they model label distribution instead of image distribution. Therefore, the last part of this thesis is dedicated to a method that connects the two worlds. With the help of randomized smoothing, the new generative method leverages discriminative models and models image distributions under noise. Experiments showed that such a method improves the robustness of unprotected models, suggesting a promising direction for connecting the world of generative models and discriminative models.

Thesis
Peer Reviewed

Evaluation and Design of Robust Neural Network Defenses

Carlini, Nicholas
Advisor(s): Wagner, David

UC Berkeley Electronic Theses and Dissertations (2018)

Neural networks provide state-of-the-art results for most machine learning tasks. Unfortunately, neural networks are vulnerable to test-time evasion attacks (adversarial examples): inputs specifically designed by an adversary to cause a neural network to misclassify them. This makes applying neural networks in security-critical areas concerning.

In this dissertation, we introduce a general framework for evaluating the robustness of neural networks through optimization-based methods. We apply our framework to two different domains, image recognition and automatic speech recognition, and find it provides state-of-the-art results for both. To further demonstrate the power of our methods, we apply our attacks to break 14 defenses that have been proposed to alleviate adversarial examples.

We then turn to the problem of designing a secure classifier. Given this apparently-fundamental vulnerability of neural networks to adversarial examples, instead of taking an existing classifier and attempting to make it robust, we construct a new classifier which is provably robust by design under a restricted threat model. We consider the domain of malware classification, and construct a neural network classifier that cannot be fooled by an insertion adversary, who can only insert new functionality and not change existing functionality.

We hope this dissertation will provide a useful starting point for both evaluating and constructing neural networks robust in the presence of an adversary.

Thesis
Peer Reviewed

The Dark Net: De-Anonymization, Classification and Analysis

Portnoff, Rebecca Sorla
Advisor(s): Wagner, David

UC Berkeley Electronic Theses and Dissertations (2018)

The Internet facilitates interactions among human beings all over the world, with greater scope and ease than we could have ever imagined. However, it does this for both well-intentioned and malicious actors alike. This dissertation focuses on these malicious persons and the spaces online that they inhabit and use for profit and pleasure. Specifically, we focus on three main domains of criminal activity on the clear web and the Dark Net: classified ads advertising trafficked humans for sexual services, cyber black-market forums, and Tor onion sites hosting forums dedicated to child sexual abuse material (CSAM).

In the first domain, we develop tools and techniques that can be used separately and in conjunction to group Backpage sex ads by their true author (and not the claimed author in the ad). Sites for online classified ads selling sex are widely used by human traffickers to support their pernicious business. The sheer quantity of ads makes manual exploration and analysis unscalable. In addition, discerning whether an ad is advertising a trafficked victim or an independent sex worker is a very difficult task. Very little concrete ground truth (i.e., ads definitively known to be posted by a trafficker) exists in this space. In the first chapter of this dissertation, we develop a machine learning classifier that uses stylometry to distinguish between ads posted by the same vs. different authors with 90% TPR and 1% FPR. We also design a linking technique that takes advantage of leakages from the Bitcoin mempool, blockchain and sex ad site, to link a subset of sex ads to Bitcoin public wallets and transactions. Finally, we demonstrate via a 4-week proof of concept using Backpage as the sex ad site, how an analyst can use these automated approaches to potentially find human traffickers.

In the second domain, we develop machine learning tools to classify and extract information from cyber black-market forums. Underground forums are widely used by criminals to buy and sell a host of stolen items, datasets, resources, and criminal services. These forums contain important resources for understanding cybercrime. However, the number of forums, their size, and the domain expertise required to understand the markets make manual exploration of these forums unscalable. In the second chapter of this dissertation, we propose an automated, top-down approach for analyzing underground forums. Our approach uses natural language processing and machine learning to automatically generate high-level information about underground forums, first identifying posts related to transactions, and then extracting products and prices. We also demonstrate, via a pair of case studies, how an analyst can use these automated approaches to investigate other categories of products and transactions. We use eight distinct forums to assess our tools: Antichat, Blackhat World, Carders, Darkode, Hack Forums, Hell, L33tCrew and Nulled. Our automated approach is fast and accurate, achieving over 80% accuracy in detecting post category, product, and prices.

In the third domain, we develop a set of features for a principal component analysis (PCA) based anomaly detection system to extract producers (those actively abusing children) from the full set of users on Tor CSAM forums. These forums are visited by tens of thousands of pedophiles daily. The sheer quantity of users and posts makes manual exploration and analysis unscalable. In the final chapter of this dissertation, we demonstrate how to extract producers from unlabeled, public forum data. We use four distinct forums to assess our tools; these forums remain unnamed to protect law enforcement investigative efforts.

We have released our code written for the first two domains, as well as the proof of concept data from the first domain, and a sub-set of the labeled data from the second domain, allowing replication of our results.

Thesis
Peer Reviewed

Enabling More Meaningful Post-Election Investigations

Cordero, Arel Lee
Advisor(s): Wagner, David A.

UC Berkeley Electronic Theses and Dissertations (2010)

Post-election audits and investigations can produce more transparent, trustworthy, and secure elections. However, such investigations are limited in some cases by inadequate tools and methods, an absence of meaningful evidence, and high costs. In this dissertation, I address these concerns in the following three lines of research. First, I describe my research on verifiable and transparent random sample selection for post-election audits. I investigate how counties have typically approached random sample selection, and I analyze the implications and limitations of those approaches. I propose a sampling method that has since found use in counties across the country. Second, I describe a novel approach for logging events in direct recording electronic (DRE) voting systems. My approach gives investigators more meaningful evidence about the behavior of DREs on election day. In particular, I propose to record interactions between the voter and the voting machine such that they can be replayed by investigators while preserving the anonymity of the voter. Last, I describe a novel process for efficiently verifying elections that use optical scan voting systems. My process uses image superposition to let an investigator visualize the content of many ballot images simultaneously while allowing individual treatment of anomalous ballots. I evaluate this process and demonstrate an order of magnitude improvement in the time it takes to inspect ballot images individually. This approach will let investigators more cost-effectively verify that all ballots have been accurately counted as intended by the voters.

Thesis
Peer Reviewed

Helping Developers Construct Secure Mobile Applications

Chin, Erika Michelle
Advisor(s): Wagner, David A.

UC Berkeley Electronic Theses and Dissertations (2013)

Mobile phones are no longer static devices that simply make phone calls and send SMS messages. Modern smartphones are now closer to general purpose computers. They allow users to customize their phones by installing third-party applications that let them browse the web, check social networking sites, and do online banking. Platform manufacturers, such as Android, introduce new APIs to facilitate the creation of rich applications that interact with other applications, system resources, and external resources (such as web applications). Given the level of trust users put in their phones and the number of sensitive tasks they perform, it is important to understand and improve the security of mobile applications.

Android provides tools to enable rich interaction, but if developers do not know how to use them correctly, they will not use them securely. In this dissertation, we examine how mobile applications interact with each other and their environment. We uncover threats to application security due to developer confusion and general misuse of the features provided by the mobile platform. Specifically, we perform an in-depth analysis of how Android applications interact with each other through inter-process communication mechanisms, how they interact with system resources through Android permissions, and how they interact with web content through WebViews. We build static analysis tools to identify vulnerable applications and measure the prevalence of the vulnerabilities. Through automated and manual analysis, we identify patterns that illustrate how developers misuse these features and make their application vulnerable to attack. We further provide platform-level, API-level, and design-level solutions to help developers and platform designers build secure applications and systems.

Thesis
Peer Reviewed

Towards Comprehensible and Effective Permission Systems

Felt, Adrienne Porter
Advisor(s): Wagner, David

UC Berkeley Electronic Theses and Dissertations (2012)

How can we, as platform designers, protect computer users from the threats associated with malicious, privacy-invasive, and vulnerable applications? Modern platforms have turned away from the traditional user-based permission model and begun adopting application permission systems in an attempt to shield users from these threats. This dissertation evaluates modern permission systems with the goal of improving the security of future platforms.

In platforms with application permission systems, applications are unprivileged by default and must request permissions in order to access sensitive API calls. Developers specify the permissions that their applications need, and users approve the granting of permissions. Permissions are intended to provide defense in depth by restricting the scope of vulnerabilities and user consent by allowing users to control whether third parties have access to their resources.

In this dissertation we investigate whether permission systems are effective at providing defense in depth and user consent. First, we perform two studies to evaluate whether permissions provide defense in depth: we analyze applications to determine whether developers request minimal sets of permissions, and we quantify the impact of permissions on real-world vulnerabilities. Next, we evaluate whether permissions obtain the user's informed consent by surveying and interviewing users. We use the Android application and Google Chrome extension platforms for our studies; at present, they are popular platforms with extensive permission systems.

Our goal is to inform the design of future platforms with our findings. We argue that permissions are a valuable addition to a platform, and our study results support continued work on permission systems. However, current permission warnings fail to inform the majority of users about the risks of applications. We propose a set of guidelines to aid in the design of more user-friendly permissions, based on our user research and relevant literature.
