A Two-Stage Transformer-Based Approach for Variable-Length Abstractive Summarization in Python
Abstract:

This study proposes a two-stage method for variable-length abstractive summarization. Unlike previous models, the proposed approach simultaneously achieves fluent and variable-length abstractive summarization. The proposed model consists of a text segmentation module and a two-stage Transformer-based summarization module. First, the text segmentation module uses a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and a bidirectional long short-term memory (LSTM) network to divide the input text into segments. An extractive model based on the BERT-based summarization model (BERTSUM) is then constructed to extract the most important sentence from each segment. To train the two-stage summarization model, the extracted sentences are first used to train the document summarization module in the second stage. Next, the segments are used to train the segment summarization module in the first stage, with its parameters updated using the loss scores of both the segment summarization module itself and the pre-trained second-stage document summarization module. Finally, collaborative training is applied to alternately train the segment summarization module and the document summarization module until convergence.
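
To illustrate the joint update of the first-stage segment summarization module against both loss scores, the following is a minimal PyTorch-style sketch. It is not the authors' implementation: the module classes, the soft-embedding hand-off between stages, and the loss weight alpha are assumptions introduced only to make the training idea concrete with small placeholder Transformer models.

```python
# Sketch: update the first-stage (segment) module using its own loss plus the
# loss of the frozen, pre-trained second-stage (document) module.
# All names and hyperparameters here are hypothetical placeholders.
import torch
import torch.nn as nn

class TinySummarizer(nn.Module):
    """Stand-in for a Transformer-based summarization module (placeholder)."""
    def __init__(self, vocab_size=1000, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))
        return self.lm_head(h)  # (batch, seq_len, vocab)

segment_model = TinySummarizer()   # first-stage module (being trained)
document_model = TinySummarizer()  # second-stage module (pre-trained, frozen here)
for p in document_model.parameters():
    p.requires_grad_(False)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(segment_model.parameters(), lr=1e-4)
alpha = 0.5  # relative weight of the two losses (assumed, not from the paper)

# Dummy batch: segment tokens, segment-level targets, document-level targets.
segments = torch.randint(0, 1000, (2, 32))
seg_targets = torch.randint(0, 1000, (2, 32))
doc_targets = torch.randint(0, 1000, (2, 32))

# First stage: summarize the segment and compute its own loss.
seg_logits = segment_model(segments)
seg_loss = criterion(seg_logits.flatten(0, 1), seg_targets.flatten())

# Second stage: feed a soft (differentiable) version of the first stage's
# output into the frozen document module, so the document-level loss also
# backpropagates into the segment module's parameters.
soft_embed = seg_logits.softmax(-1) @ document_model.embed.weight
doc_logits = document_model.lm_head(document_model.encoder(soft_embed))
doc_loss = criterion(doc_logits.flatten(0, 1), doc_targets.flatten())

# Update the segment module with both loss scores combined.
total_loss = alpha * seg_loss + (1 - alpha) * doc_loss
optimizer.zero_grad()
total_loss.backward()
optimizer.step()
```

In the collaborative training described above, the roles would then alternate: the document module is unfrozen and trained while the segment module is held fixed, repeating until convergence.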