Abdali, Sara

Multilinear (Tensor) Algebra Framework for Misinformation Detection With Limited Supervision

2021

Abdali, Sara
Advisor(s): Papalexakis, Evangelos

Creative Commons 'BY' version 4.0 license

Abstract

Identifying misinformation is one of the most challenging problems in today's interconnected world. The vast majority of state-of-the-art misinformation detection methods are fully supervised and require a large number of high-quality human annotations. However, the availability of such annotations cannot be taken for granted, since producing them is costly, time-consuming, and hard to do at a pace that keeps up with the proliferation of misinformation. In this thesis, we explore scenarios where the number of annotations is limited. In such scenarios, we leverage a multilinear (tensor) framework across a variety of modalities; tensor methods have been shown to be interpretable and well suited to semi-supervised and unsupervised settings where few or no annotations are available. We propose a number of tensor-based techniques, organized in the following parts:

Content-based techniques for misinformation detection. We propose a novel strategy that combines tensor-based content modeling with semi-supervised learning on article embeddings and requires very few labels. Driven by the effectiveness of our tensor embeddings, we propose Vec2Node, a novel text augmentation framework that leverages tensor decomposition to generate synthetic samples by exploiting local and global information in text while reducing concept drift. Vec2Node leverages self-training from in-domain unlabeled data augmented with tensorized word embeddings. Finally, we propose a hybrid summarization framework that incorporates both extractive and abstractive techniques for capturing misinformative key phrases.

Ensemble techniques for multi-aspect detection of misinformation. We investigate how tapping into the diverse aspects that characterize a news article can compensate for the lack of labels. We propose two tensor-based techniques for ensemble learning: HiJoD, a 2-level decomposition framework that leverages article content, the context of social sharing behaviors, and host website/domain features; and K-Nearest Hyperplane Graph (KNH), which merges the aforementioned aspects to create a higher-order graphical representation of articles.

Vision-based techniques for misinformation detection. We propose to use a promising yet neglected feature: the overall look of the domain web page. We propose VizFake, which takes screenshots of news articles and classifies them with a tensor-decomposition-based semi-supervised technique. Finally, we propose a modified multilinear (tensor) method, a combination of linear and multilinear regressions, for presenting manipulated videos. Our method leverages only a handful of frames per video to detect Deepfakes.

Overall, this dissertation innovates in the field of misinformation detection, enabling work in label-scarce settings while leveraging multiple modalities. We envision that the body of work contained in this dissertation will serve as a blueprint for further multi-modal, label-scarce misinformation detection, in both research and practice.
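The general idea of tensor embeddings combined with semi-supervised learning can be illustrated with a small sketch. The abstract does not specify the exact algorithms, so everything below is an assumption made for illustration: a toy (term, term, article) tensor stands in for co-occurrence data, a plain CP decomposition fitted by alternating least squares produces article embeddings, and labels are spread over a cosine k-nearest-neighbor graph. The function names `cp_als` and `propagate_labels` are hypothetical and do not come from the dissertation.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R) -> (I*J x R)."""
    return (a[:, None, :] * b[None, :, :]).reshape(-1, a.shape[1])

def unfold(t, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the rest (C order)."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def cp_als(t, rank, n_iter=50, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor via alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in t.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            gram = (first.T @ first) * (second.T @ second)
            # Least-squares update of one factor with the other two held fixed.
            factors[mode] = unfold(t, mode) @ khatri_rao(first, second) @ np.linalg.pinv(gram)
    return factors

def propagate_labels(emb, labels, k=3, n_iter=50):
    """Spread the few known labels (-1 = unlabeled) over a cosine kNN graph."""
    unit = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)              # exclude self-matches
    n = len(emb)
    w = np.zeros((n, n))
    nbrs = np.argsort(-sim, axis=1)[:, :k]      # k most similar articles per row
    for i in range(n):
        w[i, nbrs[i]] = np.clip(sim[i, nbrs[i]], 0.0, None)
    w = np.maximum(w, w.T)                      # symmetrize the graph
    p = w / (w.sum(axis=1, keepdims=True) + 1e-12)
    idx = np.flatnonzero(labels >= 0)
    f = np.zeros((n, labels.max() + 1))
    f[idx, labels[idx]] = 1.0
    clamp = f.copy()
    for _ in range(n_iter):
        f = p @ f
        f[idx] = clamp[idx]                     # keep the known labels fixed
    return f.argmax(axis=1)

# Toy data (an assumption): 6 terms, 8 articles, two vocabulary blocks.
n_terms, n_articles = 6, 8
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
weight = np.array([1.0, 2.0, 3.0, 4.0, 2.0, 1.0, 4.0, 3.0])
tensor = np.zeros((n_terms, n_terms, n_articles))
for art in range(n_articles):
    block = np.zeros(n_terms)
    block[3 * group[art]: 3 * group[art] + 3] = 1.0
    tensor[:, :, art] = weight[art] * np.outer(block, block)

factors = cp_als(tensor, rank=2)
article_emb = factors[2]                        # one embedding row per article

labels = np.full(n_articles, -1)
labels[0], labels[4] = 0, 1                     # only two annotated articles
pred = propagate_labels(article_emb, labels)
print(pred)
```

In this toy setting, two labeled articles are enough to label the remaining six, because articles that share a vocabulary block receive nearly parallel embeddings and therefore become graph neighbors. Real news data would be noisier, and the rank, neighborhood size, and propagation scheme would all need tuning.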

UC Riverside
