eScholarship
Open Access Publications from the University of California

UCLA Electronic Theses and Dissertations

Commonsense-Guided Text Generation with Knowledge Grounding and Scoring

Abstract

This thesis investigates improving the world knowledge and commonsense reasoning abilities of Language Models (LMs) such as GPT-2 and T5 (Radford et al., 2019; Raffel et al., 2020) through the task of commonsense language generation on the CommonGen benchmark (Lin et al., 2020). We propose a framework that guides pretrained LMs toward more commonsensical sentences without updating the LMs’ parameters. First, we introduce an automatic commonsense metric grounded in ConceptNet (Speer et al., 2017) and inspired by ACCENT (Ghazarian et al., 2023). To compute it, we train a parser on few-shot GPT-3-annotated data to extract triplets of commonsense-related concepts from an input sentence. We then score the extracted triplets with COMET (Bosselut et al., 2019), using similarity scores to measure how well the sentence is grounded in ConceptNet, which we treat as an oracle of commonsense knowledge. Finally, we extend the Neurally-Decomposed Oracle of Meng et al. (2022), adding our commonsense metric, masked with the lexical constraint, to the signal used to train the auxiliary network, and we demonstrate that our framework guides LMs toward more commonsensical generations while satisfying lexical constraints.
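The triplet-scoring step of the metric can be sketched as follows. This is a minimal illustration, not the thesis's implementation: `embed` and `generate_tails` are hypothetical stand-ins for a real sentence-embedding model and a COMET-style tail generator, and the aggregation (best cosine similarity per triplet, averaged over triplets) is one plausible reading of the described similarity scoring.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """A (head, relation, tail) concept triplet extracted from a sentence."""
    head: str
    relation: str  # e.g. a ConceptNet relation such as "CapableOf"
    tail: str


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def commonsense_score(triplets, embed, generate_tails, top_k=5):
    """Score a sentence's triplets against a commonsense tail generator.

    For each extracted triplet, ask the generator (a COMET-like model in the
    thesis; a stub here) for plausible tails given (head, relation), and take
    the best cosine similarity between the sentence's actual tail and any
    generated tail. The sentence-level score is the mean over triplets.
    """
    scores = []
    for t in triplets:
        candidates = generate_tails(t.head, t.relation, k=top_k)
        best = max(
            (cosine(embed(t.tail), embed(c)) for c in candidates),
            default=0.0,
        )
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0
```

With a toy embedding table, a triplet whose tail matches a generated tail exactly scores 1.0, and a sentence with no extractable triplets scores 0.0; in the real framework this scalar is the signal (masked with the lexical constraint) fed to the auxiliary network.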
