Préparation des données et analyse des résultats de DEFT'05 (Data preparation and analysis of the DEFT'05 results)

Abstract

The DEFT text-mining challenge consisted of removing non-relevant sentences from French corpora of political speeches. It took place in 2005 and brought together about thirty participants from eleven teams. This paper describes the preprocessing carried out on the corpora of F. Mitterrand and J. Chirac for this challenge: conversion to text format, sentence segmentation, classification of the speeches, insertion of F. Mitterrand's sentences into J. Chirac's speeches, and identification of dates and people's names. The results obtained by the eleven participating teams are also presented.