Skip to main content
eScholarship
Open Access Publications from the University of California

Anonymization via Clustering of Locations in Road Networks

Published Web Location

https://doi.org/10.25436/E2CC7P
Abstract

Data related to households or addresses needs be published in an aggregated form to obfuscate sensitive information about individuals. Usually, the data is aggregated to the level of existing administrative zones, but these often do not correspond to formal models of privacy or a desired level of anonymity. Therefore, automatic privacy-preserving spatial clustering methods are needed. To address this need, we present algorithms to partition a given set of locations into k-anonymous clusters, meaning that each cluster contains at least k locations. We assume that the locations are given as a set TV of terminals in a weighted graph G = (V, E) representing a road network. Our approach is to compute a forest in G, i.e., a set of trees, each of which corresponds to a cluster. We ensure the k-anonymity of the clusters by constraining the trees to span at leastterminals each (plus an arbitrary number of non-terminal nodes called Steiner nodes). By minimizing the total edge weight of the forest, we ensure that the clusters reflect the proximity among the locations. Although the problem is NP-hard, we were able to solve instances of several hundreds of terminals using integer linear programming. Moreover, we present an efficient approximation algorithm and show that it can be used to process large and fine-grained data sets.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View