Community detection using total variation and surface tension
- Author(s): Boyd, Zachary
- Advisor(s): Bertozzi, Andrea
- et al.
In recent years, a massive expansion in the amount of available network data in fields such as social networks, food networks in ecology, similarity networks in machine learning, transportation networks, brain networks, and many others has motivated the development of "network science'' to describe all this data. One of the fundamental branches in network science is "community detection,'' or the decomposition of large networks into coherent subnetworks, which is useful for visualization, data exploration, hypothesis formation, approximation of network dynamics, link prediction, and a host of other tasks.
Two of the most well-known frameworks for community detection are modularity optimization and stochastic block modeling. They can often uncover meaningful community structure in networks from diverse applications. However, both of these approaches are computationally demanding. In this dissertation, I will show how these two statistically-motivated frameworks for community detection can be reinterpreted more geometrically using the language of graph total variation (TV) and (discretized) surface tension, respectively. This change in perspective allows one to leverage algorithms and analytical tools developed for other problems in the fields of compressed sensing, materials science, and nonlinear partial differential equations. One also can adapt arguments from other domains to obtain theoretical guarantees on the performance of these algorithms. I illustrate these approaches on a number of synthetic and real-world datasets, yielding results competitive with other state-of-the-art techniques on problems from machine learning, image processing, social networks, and biological networks.