eScholarship
Open Access Publications from the University of California

Entity Extraction and Disambiguation in Short Text Using Wikipedia and Semantic User Profiles

  • Author(s): Zendejas, Ignacio
  • Advisor(s): Cardenas, Alfonso F
  • et al.
Abstract

We focus on entity extraction and disambiguation in short text communications, tasks that have seen some advances in the last decade but remain very challenging. Much of the research that has advanced the field has leveraged crowd-sourced external knowledge bases like Wikipedia to build probabilistic and machine learning models for entity extraction. That work has its basis in Wikify! and has recently been applied to understanding the topics discussed on social media, where a terse, lossy form of communication makes topic detection even more challenging. We expand on this work and show, in experiments we conducted on Twitter data, that leveraging a rich semantic history of the entities that users discuss can improve the accuracy of semantically annotating their future social media posts.
