The emergence of social media, together with advances in mobile technology and the
internet, has resulted in constant connectivity among users, enabling them to post, share, and engage with content published on the web. Studying such data about
users and their engagement with content can give insights into current and emerging trends in society. However, studying social media data comes with its own set of
unique challenges. Social media data is highly unstructured because the content is not
curated to adhere to any formal structure. This makes the process of analyzing the data
challenging. Social media data is also
highly volatile, since huge volumes of data are generated every second. In this thesis, we
propose machine-learning-based algorithms and methodologies to address these
challenges, and apply the algorithms to solve problems in the domains of public health and
journalism.
Chapter 1 proposes a new framework that combines text and user engagement
data to detect trends in social networks.
Chapter 2 studies social media data to predict the impact of news events. The
chatter on social media surrounding news events is accurately quantified, and is found
to be the most distinguishing feature between high-impact and low-impact events.
Chapter 3 uses topic modeling to discover attitudes and trends about drug abuse.