eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Joint Image-Text Topic Detection and Tracking for Analyzing Social and Political News Events

Abstract

News plays a vital role in informing citizens, shaping public opinion, and influencing policy making. Analyzing how information flows through the news ecosystem is an important problem in social and political science research, but the sheer volume of news data overwhelms manual analysis. In this dissertation, we present an automatic topic detection and tracking method that can be used to analyze real-world events and their relationships from multimodal TV news data. We propose a Multimodal Topic And-Or Graph (MT-AOG) to jointly represent the textual and visual elements of news stories and their latent topic structures. An MT-AOG leverages a context-sensitive grammar that describes the hierarchical composition of news topics from semantic elements (the people involved, related places, and what happened) and models contextual relationships between elements in the hierarchy. We detect news topics through a cluster sampling process that groups stories about closely related events. Swendsen-Wang Cuts, an effective cluster sampling algorithm, is adopted to traverse the solution space and obtain optimal clustering solutions by maximizing a Bayesian posterior probability. The detected topics are then continuously tracked and updated with incoming news streams. We generate topic trajectories to show how topics emerge, evolve, and disappear over time. We conduct both qualitative and quantitative evaluations to show the effectiveness and efficiency of the proposed approach over existing methods. We further extend our work to the analysis of campaign communication in recent presidential elections.
Specifically, we apply fully automated coding to a massive collection of news and other campaign information to track which candidates are discussed on Twitter and in traditional television news coverage; which topics are discussed in relation to the candidates, and by which news outlets; and which candidates are treated most favorably across news outlets and media. Our methods, which rely on machine learning and digital visual processing, offer promising new tools for social and political science scholars hoping to study large-scale information datasets.
