Extractive-abstractive summarization with pointer and coverage mechanism

Published: 18 May 2018


Neural sequence-to-sequence models have provided a viable new approach to abstractive text summarization. However, they suffer from low efficiency and accuracy on long text: they cannot handle very long input, they cannot reproduce factual details accurately, and they tend to repeat themselves. In this paper, we propose a hybrid extractive-abstractive model. In the extractive part, we construct a graph model and propose a hybrid sentence similarity measure that combines sentence vectors with Levenshtein distance. We then use this measure to rank and extract key sentences, and concatenate them into a shorter text that serves as the input to the summary generator. In the abstractive part, we make two improvements to the standard attentional sequence-to-sequence model. First, we use a pointer mechanism to copy words from the source text, which helps the seq2seq generator handle the out-of-vocabulary (OOV) problem. Second, we use a coverage mechanism to avoid repetition. We collect a financial news dataset and apply our model to the financial news summarization task, outperforming the state-of-the-art method by at least 4.7 ROUGE points.
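
The extractive step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: bag-of-words count vectors stand in for the sentence vectors, the Levenshtein distance is computed over tokens and normalized by the longer sentence, the mixing weight `alpha` is hypothetical, and a PageRank-style power iteration is assumed for ranking on the sentence graph. The function names are illustrative only.

```python
import math
from collections import Counter


def levenshtein(a, b):
    """Edit distance between two sequences (classic dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def hybrid_similarity(s1, s2, alpha=0.5):
    """Assumed hybrid measure: weighted mix of vector and edit similarity."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    # Assumption: bag-of-words counts stand in for learned sentence vectors.
    vec_sim = cosine(Counter(t1), Counter(t2))
    longest = max(len(t1), len(t2))
    # Token-level Levenshtein distance, normalized to a similarity in [0, 1].
    edit_sim = 1.0 - levenshtein(t1, t2) / longest if longest else 1.0
    return alpha * vec_sim + (1 - alpha) * edit_sim


def rank_sentences(sentences, alpha=0.5, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration on the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    weights = [[0.0 if i == j else hybrid_similarity(sentences[i], sentences[j], alpha)
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def extract_key_sentences(sentences, k=2, alpha=0.5):
    """Keep the k top-ranked sentences, concatenated in their original order."""
    scores = rank_sentences(sentences, alpha)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))
```

Calling `extract_key_sentences(sentences, k=3)` on a list of sentence strings returns the shortened text that would be passed to the abstractive generator. Keeping the selected sentences in source order is a design choice of this sketch, made so that the concatenated input stays readable.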


    Author Tags

    1. abstractive summarization
    2. coverage mechanism
    3. extractive summarization
    4. pointer mechanism


