Statistical Models for Aspect-Level Sentiment Analysis
Sentiment analysis and opinion mining is the field of computational study of people’s
opinion expressed in written language or text. Sentiment analysis brings together various
research areas such as natural language processing, data mining and text mining,
and is fast becoming of major importance to organizations as they integrate online commerce
into their operations.
The input of the problem is a collection of written reviews about an object. The object could be a product or service and the goal is to discover people’s opinions expressed in those written reviews. The input reviews are in the form of free text and do not have any structure (people can write whatever they like however they want). Dealing with unstructured data is a challenging problem.
Sentiment analysis can be done in different levels, and the focus of this dissertation
in on aspect-level sentiment analysis. In aspect-level sentiment analysis there are two
tasks that need to be addressed. The first task is aspect identification which is the process
of discovering those attributes of the object that people are commenting on. These
attributes of the object are called aspects. The second task is sentiment identification
which is the process of discovering people’s opinions expressed about each one of the
aspects. Aforementioned tasks can be solved in 2 separate steps or can be solved simultaneously.
Early work on aspect-level sentiment analysis would solve it in 2 steps and recent techniques based on topic models address these 2 tasks simultaneously. In this thesis an automatic framework for discovering the aspects and their corresponding sentiments is proposed. This framework first identifies the aspects and then in the next step classifies each sentence containing one of the discovered aspects into either positive, neutral or negative sentiment classes. In the subsequent chapter this framework is used to give structure to input data which does not have any structure. Also a Bayesian model is proposed for overall satisfaction that accurately predicts the overall customer satisfaction and also the significance of each discovered aspect from the contributors perspectives. Hierarchical Bayesian frameworks are powerful tools that have recently attracted a lot of attention in the machine learning community. In this dissertation a new model based on hierarchical Bayesian models is proposed to simultaneously discover aspects and sentiments. This framework is based on probabilistic topic modeling techniques.