Big Data & Spectral Graph Theory
Network data arises naturally in many domains - from protein-protein interaction networks in biology to social networks. Modern storage and collection technology make it possible to collect and analyze networks of unprecedented size. At the same time, the proliferation of machine learning techniques increase the amount of analysis tasks practitioners want to perform. With bigger data and more demands on it, there is a need for faster algorithms tailored to modern datasets.
In this dissertation, we present several new results on algorithms for sparse graphs based on spectral graph theory. From biology to social networks, the networks we encounter in practice are often contain very sparse. Spectral graph theory provides a toolkit which allows us to come up with local procedures for sparse graphs which have desirable global properties, enabling new advances in the field.
In the first part of this dissertation, new upper bounds in the field of sparse graph property testing are presented. We consider the problem of testing for $H$-minor-freeness. A near-optimal one-sided tester is presented, as well as a two-sided tester. We also present a polynomial-time algorithm for the related problem of graph partition oracles.
In the second part, we examine embeddings of sparse graphs. A popular machine learning tool is to compute geometric embeddings of the vertices of a graph. There has been little principled investigation of the power of these methods. We present several impossibility results which question the efficacy of these embedding methods and follow these up with an empirical investigation.