In many modern machine learning applications, users hold data on a local device and communicate with an untrusted central analyst. Local differential privacy (local DP) is the state-of-the art way to ensure that all data sent by a user will be protected regardless of who eventually sees the data. In this dissertation, I will focus on collecting certain graph statistics, such as the degree vector and number of triangles in the graph, under local DP. I will obtain formal performance guarantees in the presence of different real-world constraints.
First, I will detail a one-round triangle counting algorithm which is simple, but not guaranteed to work well on sparse graphs. To improve these matters, I will propose a two-round algorithm with improved performance for sparse graphs. Unfortunately, it may have prohibitively large per-user download cost. Second, I will show how one can use edge sampling to reduce download cost, and exhibit another two-round algorithm which trades off between the download cost and error.
Third, given that it requires significant synchronization overhead to implement multiple rounds, I will focus on improved one-round algorithms for triangle counting. I propose such an algorithm which satisfies a slightly weaker notion of DP, the shuffled model. This algorithm uses a novel wedge-shuffling approach to count wedges, and gives rise to an accurate algorithm for counting 4-cycles as well. Fourth, I will shift gears to computing the degree vector in the presence of malicious users who do not follow the local DP protocol. I will detail robust algorithms, leveraging the fact that every edge in a graph is shared between two users, which use edge consistency checks to catch malicious users.
Fifth, I will highlight my work on private hierarchical clustering, a common clustering algorithm used for graphs and other data. I propose two algorithms and a lower bound in the general setting of the problem, and a near-optimal algorithm in the stochastic block model of graphs. The algorithm design and lower bound are of interest to designing future local DP protocols on graphs.