eScholarship
Open Access Publications from the University of California

StoryMiner: An Automated and Scalable Framework for Story Analysis and Detection from Social Media

  • Author(s): Shahbazi, Behnam
  • Advisor(s): Parker, Stott; Roychowdhury, Vwani P; et al.
Abstract

The explosive growth of social media over the past decade, together with advancements in computational power, has paved the way for many large-scale sociological studies that were not possible before. Social media sites are now the primary source of data for many of our insights into society, from trending topics to the behavioral patterns of groups such as online shoppers or political parties. One particular area of interest is the analysis of events and interactions through their descriptions in social media posts. Inferring and analyzing real-world events from social media in a large-scale, automated way provides a platform for understanding real-world stories, which are not only influenced by public opinion but also heavily impact it. It is therefore necessary to design computational and statistical tools that automatically extract stories from social media. In this dissertation, we introduce StoryMiner, an automated and scalable machine learning framework rooted in narrative theory that identifies and tracks multi-scale narrative structures in large-scale social media text.

Building on narrative theory, StoryMiner derives stories and narrative structures by automatically 1) extracting and co-referencing the actants (entities such as people and objects) and their relationships from the text with a proposed Open Information Extraction system, 2) assigning named-entity types and importance scores to entities and relationships using character-level neural language architectures as well as traditional machine learning models, 3) using context-dependent word embeddings to aggregate actant-relationships into contextual story graphs in which the nodes are the actants and the edges are their relationships, and 4) enriching the story graphs with additional layers of information such as sentiments or the sequence order of relationships.
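
The story-graph representation described above (actants as nodes, relationships as edges) can be illustrated with a minimal Python sketch. This is not StoryMiner's implementation: the function names `build_story_graph` and `actant_importance` and the sample triples are hypothetical, exact string matching stands in for the embedding-based aggregation, and a simple mention count stands in for the learned importance scores.

```python
from collections import defaultdict


def build_story_graph(triples):
    """Aggregate (subject, relation, object) triples into a directed graph.

    Returns a dict mapping each (subject, object) edge to a dict of
    relation labels and how often each was observed.
    Assumption: identical strings denote the same actant or relation.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for subj, rel, obj in triples:
        graph[(subj, obj)][rel] += 1
    return {edge: dict(rels) for edge, rels in graph.items()}


def actant_importance(graph):
    """Score each actant by the number of relationship mentions it appears in.

    Assumption: a frequency count is only a stand-in for a learned score.
    """
    scores = defaultdict(int)
    for (subj, obj), rels in graph.items():
        mentions = sum(rels.values())
        scores[subj] += mentions
        scores[obj] += mentions
    return dict(scores)


# Hypothetical triples, as an upstream extraction step might emit them.
triples = [
    ("Alice", "bought", "phone"),
    ("Alice", "returned", "phone"),
    ("Alice", "bought", "phone"),
    ("Bob", "recommended", "phone"),
]

story_graph = build_story_graph(triples)
importance = actant_importance(story_graph)
```

Here `story_graph` maps each actant pair to its observed relationships, and `importance` ranks actants by how often they are mentioned. Additional layers such as sentiment or sequence order could be stored as further attributes on each edge.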

StoryMiner allows academic and industry researchers to extract structured knowledge from unstructured text to inform practical decisions. To demonstrate the benefits of our framework, we showcase three of its many possible applications: identification of differences in narrative structures between fake and real conspiracies, summarization of user product opinions from tweets, and reconstruction of plot summaries of famous novels from reader reviews on social reading sites such as Goodreads.
