Background
Low-frequency variants play an important role in breast cancer (BC) susceptibility. Gene-based methods can increase power by combining multiple variants in the same gene and help identify target genes.
Methods
We evaluated the potential of gene-based aggregation in the Breast Cancer Association Consortium cohorts including 83,471 cases and 59,199 controls. Low-frequency variants were aggregated for individual genes' coding and regulatory regions. Association results in European ancestry samples were compared to single-marker association results in the same cohort. Gene-based associations were also combined in meta-analysis across individuals with European, Asian, African, and Latin American and Hispanic ancestry.
Results
In European ancestry samples, 14 genes were significantly associated (q < 0.05) with BC. Of those, two genes, FMNL3 (P = 6.11 × 10⁻⁶) and AC058822.1 (P = 1.47 × 10⁻⁴), represent new associations. High FMNL3 expression has previously been linked to poor prognosis in several other cancers. Meta-analysis of samples with diverse ancestry identified additional associations, including the established candidate genes ESR1 and CBLB. Furthermore, literature review and database queries provided support for a biologically plausible link with cancer for the genes CBLB, FMNL3, FGFR2, LSP1, MAP3K1, and SRGAP2C.
Conclusions
Using extended gene-based aggregation tests that include coding and regulatory variation, we report the identification of plausible target genes for previously identified single-marker associations with BC, as well as the discovery of novel genes implicated in BC development. Including multi-ancestry cohorts in this study enabled the identification of disease associations that would otherwise have been missed, such as ESR1 (P = 1.31 × 10⁻⁵), demonstrating the importance of diversifying study cohorts.