Mining Probabilistic Automata: a New Way to Sequence Mining
Stéphanie Jacquemont, François Jacquenet and Marc Sebban
In: Third International Workshop on Mining Graphs, Trees and Sequences at ECML/PKDD 2005, 7 Oct 2005, Porto, Portugal.
We propose a new sequence mining algorithm that extracts constrained frequent patterns from a probabilistic deterministic finite state automaton (PDFA). Although PDFAs have received little attention in sequence mining, we show that a learned compact summary of the input sequences, rather than a costly exact representation, is well suited to this domain. We propose two kinds of constraints for extracting particular patterns from the PDFA, thereby reducing the search space. Experiments show the utility of such constraints in sequence mining.
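
As an illustration of the general idea only, the following minimal sketch shows how patterns might be read off a PDFA once it has been learned. It is not the algorithm of the paper: the toy automaton `TOY_PDFA`, the functions `prefix_probability` and `frequent_prefixes`, and the two constraints used here (a minimum probability `min_prob` and a maximum length `max_len`) are all assumptions chosen for the example. The sketch scores a pattern by the probability that a sequence generated by the automaton starts with it, which is a simple stand-in for frequency.

```python
from typing import Dict, List, Tuple

# Hypothetical toy PDFA: state -> {symbol: (next_state, probability)}.
# Probability mass not used by outgoing transitions is the stop probability.
PDFA = Dict[int, Dict[str, Tuple[int, float]]]

TOY_PDFA: PDFA = {
    0: {"a": (1, 0.6), "b": (0, 0.3)},  # stops with probability 0.1
    1: {"a": (1, 0.2), "b": (0, 0.5)},  # stops with probability 0.3
}


def prefix_probability(pdfa: PDFA, pattern: str, start: int = 0) -> float:
    """Probability that a generated sequence starts with `pattern`."""
    state, prob = start, 1.0
    for symbol in pattern:
        if symbol not in pdfa[state]:
            return 0.0  # no such transition: the pattern cannot occur
        state, p = pdfa[state][symbol]
        prob *= p
    return prob


def frequent_prefixes(
    pdfa: PDFA, min_prob: float, max_len: int, start: int = 0
) -> List[Tuple[str, float]]:
    """Enumerate patterns with prefix probability >= min_prob, length <= max_len.

    Extending a pattern can only lower its probability, so any branch that
    falls below min_prob is pruned without exploring its extensions.
    """
    results: List[Tuple[str, float]] = []
    stack: List[Tuple[str, int, float]] = [("", start, 1.0)]
    while stack:
        pattern, state, prob = stack.pop()
        if pattern:
            results.append((pattern, prob))
        if len(pattern) == max_len:
            continue  # assumed length constraint
        for symbol, (next_state, p) in pdfa[state].items():
            if prob * p >= min_prob:  # assumed probability constraint
                stack.append((pattern + symbol, next_state, prob * p))
    return sorted(results)


if __name__ == "__main__":
    for pattern, prob in frequent_prefixes(TOY_PDFA, min_prob=0.25, max_len=2):
        print(pattern, round(prob, 3))
```

Because the search walks the automaton rather than the original sequences, its cost depends on the size of the PDFA and on how aggressively the constraints prune, not on the number of input sequences.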