eScholarship — Open Access Publications from the University of California

Mitigating Hallucinations in Large Language Models by Preprocessing Questions into Child-Comprehensible

Creative Commons Attribution 4.0 (CC BY 4.0) license
Abstract

Alongside the advancement of large language models (LLMs), attention to their limitations and potential risks has also increased. One common issue is hallucination, which occurs when LLMs generate inaccurate or irrelevant answers, especially for complex sentences. To address this issue, we propose a novel question preprocessing method inspired by how young children comprehend complex sentences. Our method consists of two modules: (1) hierarchical clause annotation (HCA)-based sentence decomposition, which breaks down complex sentences into one-verb-centered clauses, and (2) abstract meaning representation (AMR)-based clause rewriting, which reformulates the clauses, based on their AMR, into the child-comprehensible subject-verb-object (SVO) structure. We evaluate our method on the question-answering dataset TruthfulQA and show that it improves the truthfulness and informativeness of the widely used LLMs LLaMA-7B and LLaMA-2-7B-chat, preventing them from generating hallucinated answers. Moreover, our method is highly efficient, as it requires no pre-training, fine-tuning, or invocation of larger-scale models.
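To make the two-stage pipeline concrete, below is a minimal, hedged Python sketch of how a question might flow through it. The function names and the naive splitting/rewriting logic are hypothetical stand-ins: the paper's actual modules rely on hierarchical clause annotation (HCA) and AMR parsing and generation, neither of which is reproduced here.

```python
import re
from typing import List


def decompose_into_clauses(question: str) -> List[str]:
    """Stand-in for HCA-based decomposition: crudely split a complex
    question into one-verb-centered clauses at common clause boundaries."""
    # A real HCA module would use annotated clause hierarchies, not regexes.
    parts = re.split(r",\s*|\bwhich\b|\bthat\b|\bbecause\b|\bwhile\b", question)
    return [p.strip() for p in parts if p.strip()]


def rewrite_to_svo(clause: str) -> str:
    """Stand-in for AMR-based rewriting: the actual method parses each clause
    to an AMR graph and regenerates a subject-verb-object surface form."""
    # Placeholder: returns the clause unchanged instead of doing AMR
    # parsing and SVO regeneration.
    return clause


def preprocess_question(question: str) -> str:
    """Decompose, rewrite, and re-join a complex question before
    passing it to the LLM."""
    clauses = decompose_into_clauses(question)
    simple_sentences = [rewrite_to_svo(c) for c in clauses]
    return " ".join(simple_sentences)


if __name__ == "__main__":
    q = ("What happens to a country that depends on tourism, "
         "which collapses because travel is restricted?")
    print(preprocess_question(q))
```

The preprocessed, simplified question would then be supplied to the LLM in place of the original complex one; no model weights are modified, which is why the method needs no pre-training or fine-tuning.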
