Background and purposeStroke is the leading cause of death and long-term disability worldwide. Previous genome-wide association studies identified 51 loci associated with stroke (mostly ischemic) and its subtypes among predominantly European populations. Using whole-genome sequencing in ancestrally diverse populations from the Trans-Omics for Precision Medicine (TOPMed) Program, we aimed to identify novel variants, especially low-frequency or ancestry-specific variants, associated with all stroke, ischemic stroke and its subtypes (large artery, cardioembolic, and small vessel), and hemorrhagic stroke and its subtypes (intracerebral and subarachnoid).
MethodsWhole-genome sequencing data were available for 6833 stroke cases and 27 116 controls, including 22 315 European, 7877 Black, 2616 Hispanic/Latino, 850 Asian, 54 Native American, and 237 other ancestry participants. In TOPMed, we performed single variant association analysis examining 40 million common variants and aggregated association analysis focusing on rare variants. We also combined TOPMed European populations with over 28 000 additional European participants from the UK BioBank genome-wide array data through meta-analysis.
ResultsIn the single variant association analysis in TOPMed, we identified one novel locus 13q33 for large artery at whole-genome-wide significance (P<5.00×10-9) and 4 novel loci at genome-wide significance (P<5.00×10-8), all of which need confirmation in independent studies. Lead variants in all 5 loci are low-frequency but are more common in non-European populations. An aggregation of synonymous rare variants within the gene C6orf26 demonstrated suggestive evidence of association for hemorrhagic stroke (P<3.11×10-6). By meta-analyzing European ancestry samples in TOPMed and UK BioBank, we replicated several previously reported stroke loci including PITX2, HDAC9, ZFHX3, and LRCH1.
ConclusionsWe represent the first association analysis for stroke and its subtypes using whole-genome sequencing data from ancestrally diverse populations. While our findings suggest the potential benefits of combining whole-genome sequencing data with populations of diverse genetic backgrounds to identify possible low-frequency or ancestry-specific variants, they also highlight the need to increase genome coverage and sample sizes.