Press "Enter" to skip to content

An Overview of Automatic Text Summarization

Kevin Jacobs looks at the state of the art with respect to automatic text summarization:

Automatic text summarization comes in two flavours: extractive summarization and abstractive summarization. Extractive summarization models take exact phrases from the reference documents and use them as a summary. One of the very first research papers on (extractive) text summarization is the work of Luhn [1]. TextRank [2] (based on the concepts used by the PageRank algorithm) is another widely used extractive summarization model.

In the era of deep learning, abstractive summarization became a reality. With abstractive summarization, a model generates a text instead of using literal phrases of the reference documents. One of the more recent works on abstractive summarization is PEGASUS [3] (a demo is available at HuggingFace).

Click through for a couple contemporary examples as well as a few pain points you can experience when using the current set of libraries and algorithms.