An Informatics Roadmap Toward a FAIR Understanding of Mitochondrial Biology and Rare Mitochondrial Disease
Mitochondrial biology is integral to our fundamental understanding of human health and many diseases. Mitochondria exist in every human cell type except red blood cells and have critical functions in metabolism, oxidative phosphorylation, and oxidation-reduction, and they serve as signaling hubs responsible for mediating protective mechanisms. Rare mitochondrial diseases (RMDs) are devastating and complex, affect multiple organ systems, and disproportionately impact young children. Despite copious existing knowledge and increased public interest, information on mitochondria and RMDs is fragmented and difficult to access. Clinical case reports (CCRs) on RMDs contain valuable clinical insights, but they are scarce and lack the metadata necessary to facilitate their discovery among the two million CCRs on PubMed. The unstructured text of CCRs is also ill-suited to computational approaches, limiting our ability to extract the knowledge contained within.
To address these issues, I assembled all available informatics tools and resources with mitochondrial components and used them to contribute to Gene Wiki pages that enable easy access to mitochondrial knowledge for researchers, students, clinicians, and patients. Through these efforts, I made mitochondrial gene, protein, and disease knowledge widely accessible, contributing over 4 MB of content across 541 Gene Wiki pages. Concurrently, I used Gene Wiki as an educational platform to train over 50 bioscience and pre-medical students in mitochondrial biology and disease and to instill effective research and writing methods in biomedicine.
To impose structure on CCRs and render them FAIR (Findable, Accessible, Interoperable, Reusable), I developed a standardized metadata template, applied it to RMD CCRs, and codified patient symptomatology with the International Statistical Classification of Diseases and Related Health Problems (ICD) system. I created the open-source, cloud-based MitoCases RMD Knowledge Platform (http://mitocases.org/) to house data on 384 RMD CCRs, including 4,561 instances of 952 unique ICD codes. Supplementing CCRs with structured metadata amplifies their machine-readable information content and distinctly improves searching for CCRs compared with indexing by title and abstract alone. Finally, I employed these resources to conduct a thorough review of Barth syndrome and to characterize the diversity of its presentations, range of genetic etiologies, and treatment paradigms.