The democratization of computing and sensing through smartphones and embedded devices has led to widespread instrumentation of our personal and social spaces. The sensor data thus collected embed minute details of our daily lives. On the one hand, this has enabled a multitude of exciting applications where decisions at various time-scales are driven by inferences computationally derived from the shared sensory information and used for purposes such as targeted advertisements,
behavior-tailored interventions, and automated control. On the other hand, the ability to derive rich inferences about user behaviors and contexts, and the use of such inferences in critical decision making, also raises serious personal privacy concerns. Prior approaches to handling these privacy concerns have often been ad hoc and focused on disassociating the user identity from the shared data, thus preventing an adversary from tracing a sensitive inference back to the user. However, in many application domains (e.g., mHealth, insurance), user identity is an inalienable part of the shared data. In such settings, instead of identity privacy, the focus is on the more general inference privacy problem, pertaining to the privacy of sensitive inferences that can be derived from the shared sensor data. The objective of this research has been to develop a principled understanding of the inference privacy problem and to design formalisms, algorithms, and system mechanisms that effectively address it.
The contributions of this dissertation are threefold. First, using information-theoretic notions, we formulate the inference privacy problem in terms of a whitelist of utility-providing allowed inferences and a blacklist of sensitive inferences. We define utility and privacy parameters, derive bounds on the feasible region spanned by these parameters, and provide constructive schemes for achieving the boundary points of the feasible region.
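As a concrete, though purely illustrative, sketch of how such a formulation can be written down, consider a sanitization channel that maps raw sensor data X to shared data Z, with W a whitelisted inference and B a blacklisted one. The variable names and the use of mutual information as both the utility and leakage measure are assumptions here, not the dissertation's exact notation.

```latex
% Illustrative formulation only; the variables and the choice of mutual
% information for both utility and leakage are assumptions, not the
% dissertation's exact notation.
% X: raw sensor data, Z: shared (sanitized) data,
% W: whitelisted (allowed) inference, B: blacklisted (sensitive) inference.
\begin{align*}
  \max_{p(z \mid x)}\;\; & I(W; Z) && \text{(utility: preserve allowed inferences)}\\
  \text{subject to}\;\;  & I(B; Z) \le \epsilon && \text{(privacy: bound leakage about } B\text{)}
\end{align*}
```

Under this reading, sweeping the leakage budget epsilon traces out a utility-privacy trade-off, and the feasible region referred to above is the set of (utility, leakage) pairs attainable by some channel p(z | x).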
Second, using insights from the theoretical exploration, we design and implement ipShield, a privacy-enforcing system built by modifying the Android OS. ipShield is a step toward reducing the user burden of configuring fine-grained privacy policies: it changes the basic privacy abstraction from access control on sensors to privacy preferences over higher-level inferences. These user preferences are then used by a rule recommender to auto-generate privacy rules on sensors. Finally, we present iDeceit, a framework that implements model-based plausible falsification of sensor data to protect the privacy of sensitive inferences while maximizing the utility of the shared data. A graphical model is used to capture the temporal and spatial patterns that exist in user behavior. The model is then used, together with privacy and utility metrics and a novel plausibility metric, to generate a falsified data stream that conforms to typical user behavior while ensuring perfect privacy; a minimal illustrative sketch of this idea follows below. Extensive evaluation results are presented for both ipShield and iDeceit, validating their efficiency and feasibility on mobile platforms.
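To make the falsification idea more tangible, the following is a minimal sketch assuming a first-order Markov model over discrete user contexts (e.g., home, commute, work). The function names, the Markov simplification of the graphical model, and the resampling strategy are all illustrative assumptions, not iDeceit's actual design.

```python
# Illustrative sketch of model-based plausible falsification, in the
# spirit of iDeceit. Assumes a first-order Markov model over discrete
# user contexts; names and structure are hypothetical, not the actual
# iDeceit implementation.
import random
from collections import defaultdict

def fit_markov(sequences):
    """Estimate transition probabilities from historical context traces."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    model = {}
    for prev, nxt in counts.items():
        total = sum(nxt.values())
        model[prev] = {state: c / total for state, c in nxt.items()}
    return model

def falsify(trace, model, blacklist):
    """Replace blacklisted contexts with plausible substitutes.

    Each sensitive state is resampled from the model's transition
    distribution out of the previously released state, restricted to
    non-sensitive states, so the released trace stays consistent with
    typical behavior (plausibility) and never contains a blacklisted
    state (the perfect-privacy goal for the blacklist).
    """
    released = []
    for state in trace:
        if state in blacklist:
            prev = released[-1] if released else None
            dist = model.get(prev, {})
            options = {s: p for s, p in dist.items() if s not in blacklist}
            if options:
                states, probs = zip(*options.items())
                state = random.choices(states, weights=probs)[0]
            else:  # fall back to any known non-sensitive state
                safe = [s for s in model if s not in blacklist]
                state = random.choice(safe)
        released.append(state)
    return released

history = [["home", "commute", "work", "gym", "home"],
           ["home", "commute", "work", "work", "home"]]
model = fit_markov(history)
print(falsify(["home", "commute", "gym", "home"], model, blacklist={"gym"}))
```

The point this sketch illustrates is that substitute states are drawn from the learned transition model rather than chosen arbitrarily, which is what keeps the released trace plausible to an adversary who knows the user's typical behavior.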