The Elements of Automatic Summarization
This thesis is about automatic summarization, with experimental results on multi-document news topics: how to choose a series of sentences that best represents a collection of articles about one topic. I describe prior work and my own improvements on each component of a summarization system, including preprocessing, sentence valuation, sentence selection and compression, sentence ordering, and evaluation of summaries. The centerpiece of this work is an objective function for summarization that I call "maximum coverage". The intuition is that a good summary covers as many important facts or concepts from the original documents as possible. It turns out that this objective, while computationally intractable in general, can be solved efficiently for medium-sized problems and has reasonably good fast approximate solutions. Most importantly, the use of an objective function marks a departure from previous algorithmic approaches to summarization.
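To make the maximum coverage intuition concrete, here is a minimal sketch of one fast approximate approach: a greedy selector that repeatedly adds the sentence covering the most not-yet-covered concept weight, subject to a length budget. This is an illustration under stated assumptions, not necessarily the solver used in this work. The function name `greedy_max_coverage`, the representation of a sentence as a (length, concept set) pair, and the per-concept weights are all hypothetical choices made for the example.

```python
def greedy_max_coverage(sentences, weights, budget):
    """Greedy approximation to a maximum coverage objective (illustrative).

    sentences: list of (length, set_of_concepts) pairs (assumed representation)
    weights:   dict mapping each concept to its importance (assumed given)
    budget:    maximum total length of the summary
    Returns (indices of selected sentences, total weight of covered concepts).
    """
    selected = []    # indices of chosen sentences, in selection order
    covered = set()  # concepts already covered by the summary
    used = 0         # total length used so far

    while True:
        best, best_gain = None, 0.0
        for i, (length, concepts) in enumerate(sentences):
            # Skip sentences already chosen or that would exceed the budget.
            if i in selected or used + length > budget:
                continue
            # A concept counts only once, so gain comes from new concepts only.
            gain = sum(weights.get(c, 0.0) for c in concepts - covered)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # nothing fits, or nothing adds coverage
            break
        selected.append(best)
        covered |= sentences[best][1]
        used += sentences[best][0]

    return selected, sum(weights.get(c, 0.0) for c in covered)
```

Because each concept is counted only once, a sentence that repeats already-covered material contributes nothing, which is how an objective of this kind discourages redundancy. The greedy choice carries no guarantee of an optimal summary; it is only a simple baseline against which an exact solution could be compared.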