Inferring Local and Global Properties in Knowledge Graphs
eScholarship
Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations


Creative Commons 'BY' version 4.0 license
Abstract

Understanding the meaning, semantics and nuances of entities and the relationships between entities is crucial for enabling future generations of AI systems to go beyond keyword-based pattern matching. Knowledge bases provide AI systems with a structured representation of entities, their attributes and their relationships. Knowledge graphs (KGs) are a type of knowledge base that uses a graph to store information. They have become ubiquitous as they provide efficient storage and retrieval. A wide range of systems such as search engines, intelligent agents, contextual recommender systems and fake news detection applications use KGs as a knowledge source. To perform effectively, these systems need to extract latent patterns in the KG that are novel, valid and useful, a task known as knowledge discovery. In my thesis, I examine three key tasks of knowledge discovery: data alignment, which involves inferring relationships between entities of the same type; computing aggregate graph queries, which involves estimating recurring subgraphs in the absence of information; and model discovery, a data-driven way of discovering and combining rules to reason in a KG. These tasks are challenging because: (1) KGs contain a rich set of entities and relations that have very different characteristics and are also highly correlated; (2) KGs are large, containing millions of entities, and at the same time incomplete, missing many entities and relationships; and (3) training data is scarce, and negative instances for training various approaches are hard to generate. In this dissertation, I develop robust and scalable knowledge discovery algorithms to address these challenges. First, I develop a data alignment approach that identifies both entities that are exactly the same and variations of them. Variations are entities that are similar in most aspects but differ in a few. The proposed approach is scalable and unsupervised.
Using empirical evaluation in three different domains, I show the generality of this approach. Second, I propose a fine-grained data alignment approach to identify discriminating attributes between variations. The proposed approach can identify a rich variety of discriminating attributes for different entity types. The framework models the semantics of the attributes, enabling it to scale to a large number of attributes with little training data. Third, I develop a scalable framework to estimate global graph properties using complex graph queries. The proposed approach estimates these queries when there is missing information, such as node labels. I analyze two different approaches for this task, point estimate and expectation-based, both empirically and theoretically. Finally, I present a framework to perform model discovery in knowledge graphs. The framework can also generate explanations for the model’s predictions. To discover these models efficiently, I first propose a template-based rule mining technique that can efficiently search the space of rules. I then propose a new scoring function that enables the framework to learn the relative importance of these rules efficiently. Finally, I prove the stability of the generated explanations. Together, my work expands the scope of knowledge discovery tasks on knowledge graphs.
