DeepAssist: Deep Knowledge Grounding for Factual and Conversational Natural Language Interfaces
- Author(s): Yavuz, Semih
- Advisor(s): Yan, Xifeng
- et al.
Enabling humans to use natural language to interact with computers in pursuit of goals such as accessing factual information, finding restaurants, or holding engaging conversations with an AI agent has been one of the central goals of Artificial Intelligence. Many people use natural language interfaces (NLIs) such as Siri, Google Assistant, and Alexa in their daily lives. Furthermore, there are several equally promising but less explored domains, such as health, law, and customer service. Hence, developing more reliable, capable, and extensible NLIs has the potential to lead to the next generation of human-computer interaction technologies. Although promising results have been achieved in both academia and industry, several challenges remain before such NLIs can be fully realized. This thesis tackles the problem of building reliable NLIs by addressing the central challenges they face.
The contributions of this thesis are presented in three main parts. The first part spans the first three chapters and focuses on factual (single-turn) NLIs that aim to generate a concise answer to factual user queries such as "Who is the president of Canada?". We discuss how our proposed approaches to answer type inference and query revision for factual queries over a large knowledge base can help improve the performance of state-of-the-art NLIs on the factoid question answering task. In the third chapter, we present an in-depth analysis aimed at better understanding the kinds of language understanding capabilities required to solve current benchmarks.
In the second part, we investigate conversational NLIs that can generate responses to user queries in a multi-turn fashion. With the goal of making these responses more informative and engaging for users, the main contribution of this study is a principled neural architecture that generates responses grounded in relevant external knowledge through hierarchical attention and copy mechanisms.
Learning to generate coherent and engaging natural language from data is one of the most crucial capabilities for next-generation NLIs that can speak with users. In the last part of this thesis, we turn to a more foundational line of research, in which we propose and discuss a novel training objective for conditional language generation models.