Building Reliable and Robust Natural Language Processing Models: Enhancing Understanding of Semantically Equivalent Texts
- Huang, Kuan-Hao
- Advisor(s): Chang, Kai-Wei
Abstract
Recently, research in the natural language processing (NLP) domain has achieved remarkable advancements. Machines have become increasingly intelligent, reaching human performance on several NLP benchmarks. Despite this progress, recent studies demonstrate that NLP systems are not as reliable and robust as we expect: they are sensitive to modifications of the input text at multiple levels, including the word, syntax, and language levels. Such modifications do not alter the meaning of the input text, yet they can make NLP models behave very differently, contrary to human expectations. This robustness issue poses challenges for deploying NLP models in real-world applications and has therefore become an important research question in recent years. In this thesis, we highlight the crucial role of understanding semantically equivalent texts in resolving the robustness issue of NLP models. We emphasize enhancing NLP models' comprehension of semantically equivalent texts, focusing specifically on improving the syntax-level and language-level robustness of NLP models. The first part of this thesis focuses on enhancing syntax-level robustness by disentangling semantics and syntax when learning text representations. We propose three different ways to separate syntax from semantics: learning with paraphrase pairs, learning with unannotated texts, and learning with abstract meaning representations. By adopting these methods, NLP models become less sensitive to syntax and more robust to syntactic perturbations. In the second part of this thesis, we improve language-level robustness by considering the zero-shot cross-lingual setting. We propose two methods to enhance zero-shot cross-lingual transfer: robust training and the use of generation-based models. The proposed approaches in this dissertation effectively improve the reliability and robustness of NLP models.
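To illustrate the first idea, learning with paraphrase pairs to separate syntax from semantics, the following is a minimal sketch of a disentanglement objective: paraphrases share meaning but differ in form, so their semantic vectors are pulled together (contrastive loss with in-batch negatives) while their syntactic vectors are pushed apart. All names here (DisentangledEncoder, sem_head, syn_head) and the toy bag-of-embeddings encoder are illustrative assumptions, not the thesis's actual architecture or loss.

```python
# Hypothetical sketch of semantics/syntax disentanglement with paraphrase pairs.
# The encoder and loss are illustrative stand-ins, not the thesis's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledEncoder(nn.Module):
    """Shared encoder with two projection heads: semantics and syntax."""
    def __init__(self, vocab_size=10000, dim=128):
        super().__init__()
        # Mean-pooled embeddings as a stand-in for a real sentence encoder.
        self.embed = nn.EmbeddingBag(vocab_size, dim)
        self.sem_head = nn.Linear(dim, dim)  # semantic subspace
        self.syn_head = nn.Linear(dim, dim)  # syntactic subspace

    def forward(self, token_ids):
        h = self.embed(token_ids)
        return self.sem_head(h), self.syn_head(h)

def disentangle_loss(model, sent_a, sent_b, temperature=0.1):
    """sent_a[i] and sent_b[i] are a paraphrase pair: same meaning, different syntax."""
    sem_a, syn_a = model(sent_a)
    sem_b, syn_b = model(sent_b)
    # Contrastive term: matched paraphrase is the positive, rest of batch are negatives.
    sem_a, sem_b = F.normalize(sem_a, dim=-1), F.normalize(sem_b, dim=-1)
    logits = sem_a @ sem_b.t() / temperature
    targets = torch.arange(sem_a.size(0))
    sem_loss = F.cross_entropy(logits, targets)
    # Dissimilarity term: penalize syntactic vectors of paraphrases for being similar.
    syn_sim = F.cosine_similarity(syn_a, syn_b).clamp(min=0).mean()
    return sem_loss + syn_sim

# Usage: a batch of 4 paraphrase pairs, each sentence as 8 token ids.
model = DisentangledEncoder()
a = torch.randint(0, 10000, (4, 8))
b = torch.randint(0, 10000, (4, 8))
loss = disentangle_loss(model, a, b)
loss.backward()
```

Under this objective, downstream tasks that read only the semantic head should be less sensitive to syntactic perturbations, which is the syntax-level robustness goal the abstract describes.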