Barsever, Dan

An Analysis of Deceptive Text Using Techniques of Machine Learning, Corpus Generation, and Online Crowdsourcing

2022

Advisor(s): Neftci, Emre

Creative Commons 'BY-NC-ND' version 4.0 license

Abstract

This research demonstrates how deep learning techniques can be combined with corpus generation and online crowdsourcing to better understand deceptive text. In this dissertation, I use state-of-the-art classifiers to examine the structure of deceptive text and determine which parts mark it as deceptive. I also extend the study of deception into new areas by adding large amounts of curated, realistic data to the existing knowledge base of deceptive text, and I offer a more complete understanding by examining deceptive text through multiple lenses. My research accomplishes this through three interrelated projects: (I) the construction of a new state-of-the-art classifier, together with modifications to its input that reveal what the classifier considers most informative in a classification; (II) the creation of a new corpus of deceptive text, the Motivated Deception Corpus, which uses gamification techniques to improve the quality of deceptive text samples by making them more realistic through competition; and (III) a human subject study on Amazon Mechanical Turk, in which I observe which samples humans consider deceptive or truthful and use a Cultural Consensus Theory model to identify what prompts a subject to decide one way or the other.


UC Irvine
