eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

An Application of Document Embedding Methods with Movie Plots

Abstract

This thesis explores and compares Natural Language Processing methods for measuring the similarity between movies based on their plot descriptions. Three document embedding methods are implemented: TF-IDF, Doc2Vec, and S-BERT. The results are evaluated using spectral clustering and normalized mutual information, and specific case studies are also presented. The objective is to identify the movies that are most similar in topic or theme based solely on plot content.
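
As a rough illustration of two pieces named in the abstract, the sketch below embeds a few plot descriptions with TF-IDF, compares them by cosine similarity, and scores agreement between two labelings with normalized mutual information. It is a minimal standard-library sketch, not the thesis's implementation: the plot strings are invented, the whitespace tokenizer, the smoothed IDF variant, and the arithmetic-mean normalization of NMI are all assumptions, and the spectral clustering step is omitted.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Naive lowercase/whitespace tokenization (an assumption for illustration).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF; one common variant, assumed here.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalized_mutual_information(labels_a, labels_b):
    """NMI of two labelings, normalized by the mean of their entropies."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    if h_a + h_b == 0:
        # Both labelings put everything in one cluster: treat as identical.
        return 1.0
    return mi / ((h_a + h_b) / 2.0)


# Hypothetical plot descriptions, invented for this example.
plots = [
    "a young wizard attends a school of magic and fights a dark wizard",
    "a wizard apprentice learns magic to defeat a dark lord",
    "a detective investigates a murder in a small town",
]
vectors = tfidf_vectors(plots)
sim_fantasy_pair = cosine_similarity(vectors[0], vectors[1])
sim_mixed_pair = cosine_similarity(vectors[0], vectors[2])
```

Here `sim_fantasy_pair` should come out higher than `sim_mixed_pair`, since the first two plots share several terms. In a fuller pipeline, the pairwise similarities would form the affinity matrix for a clustering step, and `normalized_mutual_information` would compare the resulting cluster assignments against some reference labeling.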
