The analysis of the large-scale structure (LSS) of the Universe can yield insights into some of the most important questions in contemporary cosmology and, in recent years, has become a data-driven endeavor. With ever-growing data sets, optimal analysis techniques have become essential, not only to extract statistics from data but also to use computing resources effectively when producing accurate theoretical predictions for those statistics. Future LSS experiments will help answer fundamental questions about our Universe, including the physical nature of dark energy, the mass scale of neutrinos, and the physics of inflation. To do so, improvements must be made to both the theoretical models and the computational tools used to perform such analyses.
This thesis examines multiple aspects of LSS data analysis, presenting novel modeling techniques as well as a software toolkit suitable for analyzing data from the next generation of LSS surveys. First, we present nbodykit, an open-source, massively parallel Python toolkit for analyzing LSS data. nbodykit is both interactive and scalable, providing parallel implementations of many commonly used algorithms in LSS. Its modular design allows researchers to integrate nbodykit with their own software and build complex applications that solve specific problems in LSS. Next, we derive an optimal means of using fast Fourier transforms to estimate the multipoles of the line-of-sight-dependent power spectrum, eliminating the redundancy present in previous estimators in the literature. We also discuss potential advantages of our estimator for future data sets. We then present a novel theoretical model for the redshift-space galaxy power spectrum and demonstrate its accuracy in describing the clustering of galaxies down to scales of k = 0.4 h/Mpc. Finally, we analyze the large-scale clustering of quasars from the extended Baryon Oscillation Spectroscopic Survey to constrain the deviation from Gaussian random field initial conditions in the early Universe, known as primordial non-Gaussianity.