PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Rule meta-learning for trigram-based sequence processing
Sander Canisius, Antal van den Bosch and Walter Daelemans
In: Fourth Learning Language in Logic Workshop, Aug 2005, Bonn, Germany.


Predicting overlapping trigrams of class labels is a recently proposed method for improving performance on sequence labelling tasks. In this method, each sequence element is effectively classified three times, so some procedure is needed to post-process the overlapping classifications into a single output sequence. In this paper, we present a rule-based post-processing procedure that is learned automatically from training data. In combination with a memory-based learner predicting class trigrams, this meta-learned overlapping-trigram post-processor matches the performance of the handcrafted post-processing rule used in the original study on class trigrams. Moreover, on two domain-specific entity chunking tasks, the class trigram method with automatically learned post-processing rules compares favourably with recent probabilistic sequence labelling techniques, such as maximum-entropy Markov models and conditional random fields.
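
To illustrate why each element ends up with three classifications, the following minimal Python sketch combines overlapping trigram predictions into one label per token. It assumes a simple majority vote that falls back to the focus prediction when there is no majority; this is only an illustrative baseline, not the learned rule set presented in the paper and not necessarily the handcrafted rule of the original study. The function name `combine_trigrams`, the tuple layout and the `"_"` padding symbol are hypothetical.

```python
from collections import Counter


def combine_trigrams(trigrams):
    """Resolve overlapping class trigrams into one label per position.

    trigrams[i] is the (previous, focus, next) label triple predicted
    at position i. Assumption: majority vote with focus fallback.
    """
    n = len(trigrams)
    labels = []
    for i in range(n):
        # Vote 1: the focus label predicted at position i itself.
        votes = [trigrams[i][1]]
        # Vote 2: the "next" label predicted at position i - 1.
        if i > 0:
            votes.append(trigrams[i - 1][2])
        # Vote 3: the "previous" label predicted at position i + 1.
        if i < n - 1:
            votes.append(trigrams[i + 1][0])
        label, count = Counter(votes).most_common(1)[0]
        # Without at least two agreeing votes, trust the focus prediction.
        if count < 2:
            label = trigrams[i][1]
        labels.append(label)
    return labels


# Three tokens; "_" marks padding outside the sequence.
predicted = [("_", "B", "I"), ("B", "O", "O"), ("I", "O", "_")]
print(combine_trigrams(predicted))
```

In this example the focus prediction for the second token is `O`, but both neighbouring trigrams predict `I` for that position, so the vote overrules it. A learned post-processor would replace the fixed voting rule with rules induced from training data.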

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing
ID Code: 1386
Deposited By: Walter Daelemans
Deposited On: 28 November 2005