PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Etude de Mesures de Qualité pour Classer les Termes Extraits de Corpus Spécialisés
Mathieu Roche, Oriane Matte-Tailliez and Yves Kodratoff
In: INFORSID 2004, 25-28 May 2004, Biarritz, France.

Abstract

This paper compares several quality measures used to rank the terms extracted from a specialized corpus. We also describe the improvements we made to these measures. A term is considered relevant when a domain expert judges it to be a linguistic instance of a concept. We designed an experimental protocol to evaluate the quality measures on four corpora written in French and English. The results show that the relevance of the terms varies widely with the language and the application field. This variation is also linked to the grammatical tags of the words that make up the terms.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 650
Deposited By: Mathieu Roche
Deposited On: 29 December 2004