Adaptive Entity Normalization for Biomedical Text Mining
- Mehta, Raghav
- Advisor(s): Knight, Rob; Hsu, Chunnan
Abstract
Entity normalization is an essential but challenging task in constructing knowledge bases by text mining the scientific literature. It is related to entity linking and word sense disambiguation, and models for entity normalization usually depend either on the surface text phrases of the entities or on their coherence in context. In this paper, we show that NormCo, a deep neural network normalization model, can switch between phrase and coherence models. Specifically, we tested this model on normalizing bacteria and disease entities extracted from the scientific literature. These two entity types are important for constructing a knowledge base of associations between diseases and the human microbiome, an emerging development in biotechnology. We show that NormCo switched to either the phrase or the coherence model to achieve the best performance for different entity types. We revised NormCo with a dynamic document-level switch, tested it with novel embedding techniques, and obtained encouraging results. We also organized and consolidated available lexical resources and annotated corpora for bacteria entity tagging and normalization, revealing a high level of discrepancy among these resources. Our results with these resources suggest that the skewed distribution of biomedical entity mentions may require different normalization approaches for frequently mentioned entities than for long-tail ones.