Applications of Natural Language Processing for Predicting Self-Harm Risk
eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Abstract

Self-harm is a severe mental health condition that requires immediate attention. This research aims to predict individuals’ risk of self-harm from their social media history. The dataset and the broader task were originally developed by the eRisk lab at the Conference and Labs of the Evaluation Forum (CLEF). Analysis of the text corpus makes it possible to identify writing patterns that are highly correlated with self-harm. Several methods rooted in Natural Language Processing (NLP) are explored to this end, including sentiment analysis, random forest classification, and deep learning classification using BERT. The results show that adequate classification is attainable with these methods; the potential to improve predictive performance by incorporating additional processing steps and model features is also discussed.
