PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Multi-document summarization using A* search and discriminative training
Ahmet Aker, Trevor Cohn and Robert Gaizauskas
In: EMNLP 2010, 9-11 Oct 2010, Boston, Cambridge, MA, USA.


In this paper we address two key challenges for extractive multi-document summarization: the search problem of finding the best-scoring summary and the training problem of learning the best model parameters. We propose an A* search algorithm to find the best extractive summary up to a given length, which is both optimal and efficient to run. Further, we propose a discriminative training algorithm which directly maximises the quality of the best summary, rather than assuming a sentence-level decomposition as in earlier work. Our approach leads to significantly better results than earlier techniques across a number of evaluation metrics.
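
The search problem described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that a summary's score is the sum of independent per-sentence scores, that every sentence has a positive length in words, and that sentences are either included whole or skipped. The function names `upper_bound` and `astar_summary` are hypothetical. The heuristic is a fractional-knapsack relaxation, which never underestimates the score still attainable, so the first complete selection taken off the queue is optimal under these assumptions.

```python
import heapq
from typing import List, Sequence, Tuple


def upper_bound(scores: Sequence[float], lengths: Sequence[int],
                start: int, budget: int) -> float:
    """Optimistic estimate of the score still obtainable from sentences[start:].

    Uses the fractional-knapsack relaxation: sentences are taken in order of
    score per word, and the last one may be taken partially. Assumes every
    length is positive.
    """
    items = sorted(
        ((s, l) for s, l in zip(scores[start:], lengths[start:]) if s > 0),
        key=lambda item: item[0] / item[1],
        reverse=True,
    )
    bound = 0.0
    for s, l in items:
        if l <= budget:
            bound += s
            budget -= l
        else:
            bound += s * budget / l  # fractional part of the next sentence
            break
    return bound


def astar_summary(scores: Sequence[float], lengths: Sequence[int],
                  max_len: int) -> Tuple[float, List[int]]:
    """Return (best score, chosen sentence indices) within max_len words."""
    n = len(scores)
    # Heap entries are (-f, -g, next index, words used, chosen indices).
    # heapq is a min-heap, so scores are negated to pop the largest f first.
    heap = [(-upper_bound(scores, lengths, 0, max_len), 0.0, 0, 0, ())]
    while heap:
        _, neg_g, i, used, chosen = heapq.heappop(heap)
        if i == n:
            # Every sentence has been decided; the bound is exact here.
            return -neg_g, list(chosen)
        # Branch 1: skip sentence i.
        h = upper_bound(scores, lengths, i + 1, max_len - used)
        heapq.heappush(heap, (neg_g - h, neg_g, i + 1, used, chosen))
        # Branch 2: include sentence i if it fits in the remaining budget.
        if used + lengths[i] <= max_len:
            g = -neg_g + scores[i]
            new_used = used + lengths[i]
            h = upper_bound(scores, lengths, i + 1, max_len - new_used)
            heapq.heappush(heap, (-(g + h), -g, i + 1, new_used, chosen + (i,)))
    return 0.0, []
```

For example, `astar_summary([3.0, 4.0, 6.0, 2.0], [2, 3, 4, 1], 5)` selects sentences 2 and 3 for a total score of 8.0 within the five-word limit. A tighter bound would expand fewer states, at the cost of more work per state.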

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 8137
Deposited By: Trevor Cohn
Deposited On: 29 April 2011