PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Préparation des données et analyse des résultats de DEFT'05
Erick Alphonse, Ahmed Amrani, Jérôme Azé, Thomas Heitz, Amar-Djalil Mezaour and Mathieu Roche
In: DEFT'05 (DEfi Fouille de Textes), 10 June, Dourdan, France.

Abstract

The DEFT text-mining challenge consisted of removing non-relevant sentences from French corpora of political speeches. It took place in 2005 and brought together about thirty participants from eleven teams. This paper describes the preprocessing carried out on the corpora of F. Mitterrand and J. Chirac for this challenge: conversion to text format, sentence segmentation, classification of the speeches, insertion of F. Mitterrand's sentences into J. Chirac's speeches, and identification of dates and people's names. The results obtained by the eleven participating teams are also presented.

EPrint Type: Conference or Workshop Item (Talk)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
Information Retrieval & Textual Information Access
ID Code: 1993
Deposited By: Mathieu Roche
Deposited On: 11 January 2006