Unlocking Insights in the Life Sciences Domain through Knowledge Graph Construction and Hypothesis Generation Using Machine Learning
- Youn, Jae Sung
- Advisor(s): Tagkopoulos, Ilias
Abstract
Life sciences data are noisy and largely not machine-readable, which poses significant challenges. Transforming this complex, unstructured data into a machine-friendly format would allow it to be used efficiently to generate novel scientific discoveries faster and more cost-effectively. This dissertation explores automated knowledge management and discovery across several domains using machine learning. We first address the challenges of manually creating and maintaining food ontologies. We propose a semi-supervised framework based on word embeddings, demonstrating an 89.7% improvement in precision compared to the expert-curated FoodOn ontology. Second, we introduce a machine learning framework for automated knowledge discovery through the construction of a comprehensive Escherichia coli antibiotic resistance knowledge graph. Iterative link prediction and wet-lab validation identified 15 antibiotic-resistant genes, including 6 not previously associated with antibiotic resistance. Third, the Knowledge Graph Language Model (KGLM), which incorporates a novel entity/relation embedding layer, achieves state-of-the-art performance in link prediction on benchmark datasets. Finally, we present an integrated pipeline for the automated generation of large-scale knowledge graphs in an active learning setting. Applied to 155,260 scientific papers, the pipeline extracts 230,848 food-chemical composition relationships, the largest such collection in the domain. Together, these contributions show how automated, evidence-driven knowledge discovery can provide high-confidence results at a faster pace than traditional methods.