A major challenge in genomics is the knowledge gap between sequence and its encoded function. Gain-of-function methods based on gene overexpression are attractive avenues for phenotype-based functional screens, but are not easily applied at high throughput across many experimental conditions. Here, we present Dual Barcoded Shotgun Expression Library Sequencing (Dub-seq), a method that greatly increases the throughput of genome-wide overexpression assays. In Dub-seq, a shotgun expression library is cloned between dual random DNA barcodes, and the precise breakpoints of the DNA fragments are associated with the barcode sequences before any assays are performed. To assess the fitness of individual strains carrying these plasmids, we use DNA barcode sequencing (BarSeq), which is amenable to large-scale sample multiplexing. As a demonstration of this approach, we constructed a Dub-seq library with total Escherichia coli genomic DNA, performed 155 genome-wide fitness assays in 52 experimental conditions, and identified 813 genes with high-confidence overexpression phenotypes among the 4,151 genes assayed. We show that Dub-seq data are reproducible, accurately recapitulate known biology, and identify hundreds of novel gain-of-function phenotypes for E. coli genes, a subset of which we verified with assays of individual strains. Dub-seq provides information complementary to loss-of-function approaches such as transposon site sequencing or CRISPRi and will facilitate rapid and systematic functional characterization of microbial genomes.
Importance
Measuring the phenotypic consequences of overexpressing genes is a classic genetic approach for understanding protein function; for identifying drug targets and antibiotic and metal resistance mechanisms; and for optimizing strains for metabolic engineering. In microorganisms, these gain-of-function assays are typically done using laborious protocols with individually archived strains, or at low throughput following qualitative selection for a phenotype of interest, such as antibiotic resistance. However, many microbial genes are poorly characterized, and the importance of a given gene may only be apparent under certain conditions. Therefore, more scalable approaches for gain-of-function assays are needed. Here, we present Dual Barcoded Shotgun Expression Library Sequencing (Dub-seq), a strategy that couples systematic gene overexpression with DNA barcode sequencing for large-scale interrogation of gene fitness under many experimental conditions at low cost. Dub-seq can be applied to many microorganisms and is a valuable new tool for large-scale gene function characterization.
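To make the idea of coupling a barcode-to-fragment map with barcode counts more concrete, the following is a minimal sketch, not the scoring procedure used in this work. It assumes a simplified fitness measure (the log2 change in a barcode's count between a start and an end sample, corrected for sequencing depth) and a simplified gene score (the median fitness of all fragments that fully contain the gene). All function names, the pseudocount, and the toy coordinates are hypothetical.

```python
import math
from statistics import median


def barcode_fitness(counts_start, counts_end, pseudocount=1.0):
    """Assumed, simplified fitness: log2 change in barcode count,
    corrected for the difference in total sequencing depth."""
    depth_shift = math.log2(sum(counts_end.values()) / sum(counts_start.values()))
    fitness = {}
    for barcode, n_start in counts_start.items():
        n_end = counts_end.get(barcode, 0)
        ratio = (n_end + pseudocount) / (n_start + pseudocount)
        fitness[barcode] = math.log2(ratio) - depth_shift
    return fitness


def gene_scores(fitness, fragments, genes):
    """Assumed gene score: median fitness of the barcoded fragments
    whose mapped breakpoints fully contain the gene."""
    scores = {}
    for gene, (g_start, g_end) in genes.items():
        covering = [
            fitness[bc]
            for bc, (f_start, f_end) in fragments.items()
            if bc in fitness and f_start <= g_start and g_end <= f_end
        ]
        if covering:
            scores[gene] = median(covering)
    return scores


# Toy data (hypothetical): barcode -> fragment breakpoints on the genome.
fragments = {"bc1": (0, 3000), "bc2": (500, 3500), "bc3": (4000, 7000)}
genes = {"geneA": (1000, 2000), "geneB": (4500, 6000)}
counts_start = {"bc1": 100, "bc2": 100, "bc3": 100}
counts_end = {"bc1": 400, "bc2": 400, "bc3": 100}

fit = barcode_fitness(counts_start, counts_end, pseudocount=0.0)
scores = gene_scores(fit, fragments, genes)
```

In this toy example, the two fragments covering geneA increase in relative abundance while the fragment covering geneB does not, so geneA receives the higher score. Because the barcode-to-fragment map is built once, only the barcode counts need to be measured for each new condition.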