eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

The Elements of Automatic Summarization

Abstract

This thesis is about automatic summarization, with experimental results on multi-document news topics: how to choose a series of sentences that best represents a collection of articles about one topic. I describe prior work and my own improvements on each component of a summarization system, including preprocessing, sentence valuation, sentence selection and compression, sentence ordering, and evaluation of summaries. The centerpiece of this work is an objective function for summarization that I call "maximum coverage". The intuition is that a good summary covers as many of the important facts or concepts in the original documents as possible. It turns out that this objective, while computationally intractable in general, can be solved efficiently for medium-sized problems and has reasonably good fast approximate solutions. Most importantly, the use of an objective function marks a departure from previous algorithmic approaches to summarization.
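The maximum coverage intuition in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: each sentence is represented as a set of concept strings, each concept carries a hypothetical weight, the summary has a word budget, and the names `coverage_value` and `greedy_max_coverage` are invented here. The greedy loop is one common fast approximation for coverage-style objectives; the abstract does not say which approximate or exact solvers the thesis uses.

```python
from typing import Dict, List, Set


def coverage_value(
    selected: List[int],
    sentence_concepts: List[Set[str]],
    weights: Dict[str, float],
) -> float:
    """Total weight of the distinct concepts covered by the selected sentences."""
    covered: Set[str] = set()
    for i in selected:
        covered |= sentence_concepts[i]
    # Each concept counts once, no matter how many sentences mention it.
    return sum(weights.get(c, 0.0) for c in covered)


def greedy_max_coverage(
    sentence_concepts: List[Set[str]],
    sentence_lengths: List[int],
    weights: Dict[str, float],
    budget: int,
) -> List[int]:
    """Greedy approximation (an illustrative assumption, not the thesis method).

    Repeatedly adds the sentence with the largest newly covered weight per
    word that still fits in the budget. Sentence lengths must be positive.
    """
    selected: List[int] = []
    covered: Set[str] = set()
    used = 0
    while True:
        best, best_score = None, 0.0
        for i, concepts in enumerate(sentence_concepts):
            if i in selected or used + sentence_lengths[i] > budget:
                continue
            # Only concepts not yet in the summary contribute to the gain.
            gain = sum(weights.get(c, 0.0) for c in concepts - covered)
            score = gain / sentence_lengths[i]
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break  # nothing fits, or nothing adds new coverage
        selected.append(best)
        covered |= sentence_concepts[best]
        used += sentence_lengths[best]
    return selected
```

Because the objective counts each concept only once, a sentence that repeats already covered concepts adds nothing, which is how redundancy is discouraged without a separate penalty term. The greedy choice is fast but can miss the best subset, so it should be read as an approximation only.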
