Summarization of news articles
Type
Master's thesis
Program
Complex adaptive systems (MPCAS), MSc
Published
2019
Author
Beronius, Oscar
Abstract
In this work, two neural summarization models, Seq2seq and the Transformer, were implemented in several variations and evaluated on the task of abstractively summarizing
news articles. Seq2seq yielded poor results, likely because it was not flexible
enough to fit the data set. The Transformer yielded promising results, and the
quality of its output was found to depend heavily on the quality of
the input data, indicating that the implementation may be sound while performance
is bottlenecked by the data set. For future work, specifically in developing a
summarizer for clusters of documents, a recommended approach would be to combine
an abstractive summarizer such as the Transformer with extractive methods.
In that case, the Transformer could be further improved by pre-training it
with contextual representations such as Google's BERT, or by training it on additional data sets
such as CNN/Daily Mail. Finally, it was found that the evaluation metric used,
ROUGE, could not be considered complete for the given task; it is thus
advisable to explore additional evaluation metrics for summarization models.
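The abstract's reservation about ROUGE can be illustrated with a minimal sketch. The function below is not the thesis's implementation; it computes only ROUGE-1 (unigram overlap) to show why the metric is incomplete: because it ignores word order, a scrambled, unreadable candidate can score identically to a fluent one.

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> dict:
    """ROUGE-1: unigram-overlap precision, recall, and F1 (illustrative only)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Two candidates with the same unigrams but very different fluency
# (hypothetical example sentences, not data from the thesis):
ref = "the transformer produced fluent abstractive summaries"
fluent = "the transformer produced fluent summaries"
scrambled = "summaries fluent produced transformer the"
print(rouge_1(fluent, ref))
print(rouge_1(scrambled, ref))  # identical scores despite being unreadable
```

Higher-order variants such as ROUGE-2 and ROUGE-L mitigate this by rewarding bigram and longest-common-subsequence overlap, but none of them measure fluency or factual consistency directly, which motivates the abstract's call for additional metrics.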
Subject/keywords
Summarization, Summary, NLP, Transformer, Attention, Articles, Long, Multiple, Seq2seq, Abstractive