Open Access Publications from the University of California

UC Santa Barbara

UC Santa Barbara Electronic Theses and Dissertations

Accessing Diverse Web Knowledge with Natural Language Interface


The Web is the primary source of information in our daily lives. In the Information Age, Web knowledge has exploded and is distributed across diverse forms such as text, tables, charts, graphs, and images. To help humans cope with this massive amount of Web information, building an intelligent natural language interface has become a primary interest in artificial intelligence. Such an interface needs to understand human input and ground its responses in Web knowledge to provide useful information. Technically, the interface needs two components: 1) natural language understanding: understanding the semantics of the natural language input to navigate to supporting evidence, and 2) natural language generation: grounding on the supporting evidence to generate a natural language response.

In the first part, we focus on the understanding component and use question answering as our evaluation task. Specifically, we cover three grounding scenarios: 1) how to ground on given hybrid data (structured + unstructured) to answer multi-hop questions, 2) how to search over the Web and then ground on the retrieved hybrid data to answer multi-hop questions, and 3) how to ground on visual data to answer compositional questions.

In the second part, we focus on the generation component and use data-to-text generation as our evaluation task. Specifically, we discuss two scenarios: 1) how to ground on structured tabular data to generate logically entailed claims, and 2) how to leverage unlabeled Web data for pre-training to improve existing data-to-text models.

In these scenarios, we design novel algorithms that dramatically improve existing models’ ability to generate consistent and coherent text.

Finally, we summarize the strengths, weaknesses, and implications of our work, and discuss plans for future research to push this direction forward.
