eScholarship
Open Access Publications from the University of California

Incorporating Mental State into Contrastive Learning for Fine-grained Implicit Hate Speech Classification

Creative Commons 'BY' version 4.0 license
Abstract

Many people have suffered harm as a result of hate speech on social media. Most research has focused on coarse-grained detection of explicit hate speech and has disregarded fine-grained classification of implicit hate speech, even though the latter is crucial for combating hate speech more effectively. Although the language used in implicit hate speech may vary greatly, the mental states involved are usually the same. The similarities and differences between the mental states present in implicit hate speech are rarely examined. To close this gap, we create a module that infers mental states from implicit hate speech. Mental states here refer primarily to the speaker's intent and the reader's reaction. We then use the inferred mental states as positive samples in contrastive learning. This strategy pulls implicit hate speech with similar mental states toward similar representations and pushes apart examples with different mental states. Comprehensive experimental results show that the proposed method achieves superior classification performance and generalization.
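
The contrastive step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss, encoder, or temperature, so everything below is an assumption: it uses a generic InfoNCE-style objective, in which each post embedding is paired with the embedding of its own inferred mental state as the positive and the other mental states in the batch serve as negatives. The function names, the toy 2-dimensional vectors, and the temperature value are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(post_embs, state_embs, temperature=0.1):
    """InfoNCE-style loss (an assumption, not the paper's stated loss).

    post_embs[i] is a post embedding; state_embs[i] is the embedding of
    the mental state inferred for that post and acts as its positive.
    All other state_embs[j], j != i, act as in-batch negatives.
    """
    n = len(post_embs)
    total = 0.0
    for i in range(n):
        logits = [
            cosine(post_embs[i], state_embs[j]) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        total += log_denom - logits[i]
    return total / n


# Toy embeddings (hypothetical values, for illustration only).
posts = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each post is close to its own mental state
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each post is close to the other's mental state

loss_aligned = info_nce_loss(posts, aligned)
loss_swapped = info_nce_loss(posts, swapped)
```

In this toy setup the loss is lower when each post lies close to its own mental state than when the pairs are swapped, which is the pull-together, push-apart behavior the abstract describes.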
