Developing experimental and computational tools for sequence-census assays
Genomic data has revolutionized biology. The development of high-throughput DNA sequencers has been a major driver of this change, because sequencing platforms can be adapted to a wide range of assays that measure aspects of biology increasingly unrelated to the problem for which the machines were built: determining the content of a genome. Such experiments, termed sequence-census assays, leverage the power of these machines to yield high-throughput digital information about a molecular sample. Although these assays vary widely, their structure follows common themes: they encode molecular information into a sequenceable format by performing a limited set of molecular biology operations on a pool of nucleic acids, read out this information by sequencing, and finally decode the sequenced message computationally. Importantly, the experimental encoding and computational decoding of this information are coupled, and so both the 'wet' and 'dry' parts of an assay must be designed and treated together. In this thesis I explore these aspects of sequence-census assays and their application to biologically unrelated but computationally similar problems, ranging from CRISPR biology to single-cell sequencing.