PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

EXIT: Un système itératif pour l'extraction de la terminologie du domaine à partir de corpus spécialisés [EXIT: An iterative system for extracting domain terminology from specialized corpora]
Mathieu Roche, Thomas Heitz, Oriane Matte-Tailliez and Yves Kodratoff
In: 7èmes Journées internationales d'Analyse statistique des Données Textuelles (JADT'04), 10-12 March 2004, Belgium.


This paper addresses the discovery of significant terminology in specialized texts. Our approach, based partly on statistical methods, extracts terms iteratively. In the first pass, only binary terms are sought. The binary terms detected in this pass are then incorporated into the corpus, and the process is repeated in order to detect very long terms, which often turn out to be the most significant ones, as our experience in molecular biology has clearly shown.
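
The iterative loop described in the abstract can be illustrated with a minimal Python sketch. It is not the EXIT implementation: the abstract does not say which statistical measure is used, so a raw co-occurrence count with a threshold (`min_count`) stands in for it here. The sketch also assumes that a binary term is a pair of adjacent tokens and that "including a term in the corpus" means rewriting the pair as a single token. The names `extract_terms`, `merge_pairs`, and `JOINER` are hypothetical.

```python
from collections import Counter

JOINER = "_"  # assumption: a merged term is written as one token, e.g. "heat_shock"


def merge_pairs(tokens, selected):
    """Rewrite a sentence, replacing each selected adjacent pair by one token."""
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in selected:
            merged.append(JOINER.join(pair))
            i += 2  # skip both words of the merged pair
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def extract_terms(sentences, iterations=3, min_count=2):
    """Iteratively detect binary terms and fold them back into the corpus.

    Assumption: a pair is kept as a term when it occurs at least
    `min_count` times; a real system would use a proper statistical
    measure and probably linguistic filters as well.
    """
    corpus = [list(s) for s in sentences]
    terms = []
    for _ in range(iterations):
        # Count every pair of adjacent tokens in the current corpus.
        counts = Counter(pair for sent in corpus for pair in zip(sent, sent[1:]))
        selected = {pair for pair, n in counts.items() if n >= min_count}
        if not selected:
            break  # nothing new to merge, so stop early
        corpus = [merge_pairs(sent, selected) for sent in corpus]
        # Record the merged tokens that actually appear in the rewritten corpus.
        for sent in corpus:
            for tok in sent:
                if JOINER in tok and tok not in terms:
                    terms.append(tok)
    return terms, corpus


sentences = [
    "the heat shock protein binds".split(),
    "a heat shock protein folds".split(),
    "heat shock occurs".split(),
]
terms, corpus = extract_terms(sentences)
print(terms)
```

On this toy input, the first pass merges "heat shock" into one token, and the second pass pairs that token with "protein", which is how a three-word term emerges from a procedure that only ever looks for pairs.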

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 640
Deposited By: Mathieu Roche
Deposited On: 29 December 2004