UC Davis Electronic Theses and Dissertations

Learning from Sequences in Education, History, and Natural Language

Abstract

One of the most powerful ideas in natural language processing is the distributional hypothesis, which states that words with similar distributions tend to have similar meanings. This insight spurred a large body of work on learning from word sequences, and as a result sequence-based learning is one of the main tools in natural language processing today. In fact, many recent breakthroughs in natural language processing (e.g., Word2Vec and BERT) learn by exploiting the sequential properties of natural language.

In this dissertation, we further explore learning from sequences and push the boundaries of what can be learned solely from sequences. We investigate different sequences in diverse settings, ranging from educational and historical sequences to sequences of morphologically rich languages. These investigations provide insights and answers to questions such as: 1) can we learn good representations of non-linguistic items from their sequences? 2) is it possible to create state-of-the-art natural language processing models by simply rethinking sequences? and 3) is learning an end-to-end named-entity disambiguation/entity linking system entirely from sequences feasible? Answers to these questions show the machine learning, natural language processing, and computational linguistics communities the potential that remains to be harnessed in sequences.
