PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Mining texts by association rules discovery in a technical corpus
Mathieu Roche, Jérôme Azé, Oriane Matte-Tailliez and Yves Kodratoff
In: New Trends in Intelligent Information Processing and Web Mining (IIPWM'04), 17-20 May 2004, Zakopane, Poland.


The text mining tool proposed in this paper extracts association rules from a corpus, that is, a set of specialized and homogeneous texts. The tool is built in several steps, and the expert plays a fundamental role in each of them. The first step extracts the terms from the corpus and clusters them into classes by semantic similarity, associating each class with a concept that is meaningful to a field expert. Using the knowledge thus obtained, we build a table of concept frequencies in the texts of the corpus. Next, we discretize the values of this table, and finally we extract association rules among the concept occurrences.
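
The last three steps of the abstract (concept frequency table, discretization, rule extraction) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy corpus, the term-to-concept mapping, the presence threshold, and the support and confidence cut-offs are all assumptions made for illustration, and the rule search is restricted to single-antecedent rules for brevity.

```python
from collections import Counter
from itertools import permutations

def concept_frequencies(docs, term_to_concept):
    """Count concept occurrences per text using a term-to-concept mapping."""
    return [Counter(term_to_concept[t] for t in terms if t in term_to_concept)
            for terms in docs]

def discretize(table, threshold=1):
    """Keep, for each text, the concepts occurring at least `threshold` times."""
    return [{c for c, n in row.items() if n >= threshold} for row in table]

def association_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return rules a -> b as tuples (a, b, support, confidence)."""
    n = len(transactions)
    concepts = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(concepts, 2):
        n_a = sum(1 for t in transactions if a in t)
        n_ab = sum(1 for t in transactions if a in t and b in t)
        if n_a == 0:
            continue
        support = n_ab / n        # share of texts containing both a and b
        confidence = n_ab / n_a   # share of texts with a that also contain b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules

# Hypothetical data: each text is a list of already extracted terms, and the
# mapping stands in for the expert-validated classes (both are assumptions).
term_to_concept = {
    "neural network": "Model",
    "decision tree": "Model",
    "training set": "Data",
    "error rate": "Evaluation",
}
docs = [
    ["neural network", "training set", "error rate"],
    ["neural network", "error rate"],
    ["decision tree", "training set"],
    ["neural network", "training set", "error rate", "error rate"],
]

table = concept_frequencies(docs, term_to_concept)
transactions = discretize(table, threshold=1)
rules = association_rules(transactions)
for a, b, support, confidence in rules:
    print(f"{a} -> {b} (support={support:.2f}, confidence={confidence:.2f})")
```

A binary presence threshold is the simplest possible discretization; the paper may well use finer intervals, in which case each (concept, interval) pair would play the role of an item in the rule search.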

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 645
Deposited By: Mathieu Roche
Deposited On: 29 December 2004