- Kobren, Shilpa Nadimpalli;
- Baldridge, Dustin;
- Velinder, Matt;
- Krier, Joel B;
- LeBlanc, Kimberly;
- Esteves, Cecilia;
- Pusey, Barbara N;
- Züchner, Stephan;
- Blue, Elizabeth;
- Lee, Hane;
- Huang, Alden;
- Bastarache, Lisa;
- Bican, Anna;
- Cogan, Joy;
- Marwaha, Shruti;
- Alkelai, Anna;
- Murdock, David R;
- Liu, Pengfei;
- Wegner, Daniel J;
- Paul, Alexander J;
- Undiagnosed Diseases Network;
- Sunyaev, Shamil R;
- Kohane, Isaac S
Purpose
Genomic sequencing has become an increasingly powerful tool for discovering the genetic aberrations underlying rare Mendelian conditions. Although the computational tools incorporated into diagnostic workflows for this task are continually evolving and improving, we sought to investigate commonalities across sequencing processing workflows to reveal consensus and standard-practice tools and to highlight exploratory analyses where technical and theoretical method improvements would be most impactful.
Methods
We collected details regarding the computational approaches used by a genetic testing laboratory and 11 clinical research sites in the United States participating in the Undiagnosed Diseases Network via meetings with bioinformaticians, online survey forms, and analyses of internal protocols.
Results
We found that tools for processing genomic sequencing data can be grouped into four distinct categories. Whereas well-established practices exist for initial variant calling and quality control steps, there is substantial divergence across sites in later stages for variant prioritization and multimodal data integration, demonstrating a diversity of approaches for solving the most mysterious undiagnosed cases.
Conclusion
The largest differences across diagnostic workflows suggest that advances in structural variant detection, noncoding variant interpretation, and integration of additional biomedical data may be especially promising for solving chronically undiagnosed cases.