Diffusion and clustering on large graphs
- Author(s): Tsiatas, Alexander
- et al.
This dissertation studies two important algorithmic problems on networks : graph diffusion and clustering. These problems are closely related: bottlenecks that diffusive processes find difficult to cross make good cluster boundaries and vice versa. First, we give an efficient global partitioning algorithm specifically tailored for a graph-theoretic setting. It uses sampling with PageRank vectors as probability distributions, optimizing new PageRank-distance metrics that depend only on the jumping constant alpha. Once k centers are found, we give a graph drawing algorithm that uses a force-based layout, with spring constants determined by PageRank vectors, that highlights clustered structure. We study graph clustering in two additional contexts. The first is with subsets of communication networks that have imposed boundaries when the topologies reach private subnetworks. We empirically show that spectral clustering using Dirichlet eigenvectors instead of the usual eigenvectors can be more suitable. Second, we study the extended planted partition model, a random-graph model that starts with a predetermined partitioning of the vertices with arbitrary expected degrees. We give a spectral clustering algorithm to recover the planted partition under certain conditions. Our algorithm uses a new matrix, the degree- corrected random-walk Laplacian, and unlike some prior work, does not assume knowledge of any graph generation parameters. Next, we propose and study Dirichlet PageRank, or PageRank vectors with arbitrary boundary conditions, to address vertex ranking problems resulting from the propagation of trust and distrust in networks. Using Dirichlet PageRank, we can compute vertex rankings in the presence of known spammers and negatively-weighted edges. We also study a diffusive network epidemic model : the contact process. We show that network epidemics can by stopped by finding a local cluster around the infection's starting points and inoculating the vertices in that cluster with antidote proportional to a personalized PageRank vector. This is more efficient than prior work which required distributing antidote widely to the entire graph. Finally, we propose a new voter model for the evolution of electoral opinions in social networks. Unlike the traditional voter model, our model operates on hypergraphs, allowing for more complex interactions, and it yields a wider range of outcomes