Feature construction for memory-based semantic role labeling of Catalan and Spanish
To improve the performance of a single-classifier memory-based semantic role labeling system for Catalan and Spanish, we construct new predictive features from the system's original multi-valued features. We split and binarize these features, and we construct further features by combining the most informative multi-valued ones. The new system is tested on in-domain and out-of-domain corpora, where it achieves state-of-the-art performance, with error reductions ranging from 6.93 to 20.59 over the original system. The improvements are due to new features constructed from the two most informative original features, namely the syntactic function and the preposition of the sibling phrase in focus.
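The two feature-construction operations described above, splitting and binarizing a multi-valued feature and combining two informative features into one, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `binarize` and `combine`, the `|` separator, the feature names `function` and `preposition`, and the toy instances are hypothetical and do not reproduce the system's actual feature encoding.

```python
def binarize(instances, feature, separator="|"):
    """Split a multi-valued feature and emit one 0/1 feature per observed value.

    Assumes (hypothetically) that a multi-valued feature is stored as a
    single string whose values are joined by `separator`.
    """
    # Collect every distinct value seen for this feature across all instances.
    values = sorted(
        {v for inst in instances for v in inst[feature].split(separator) if v}
    )
    rows = []
    for inst in instances:
        present = set(inst[feature].split(separator))
        rows.append({f"{feature}={v}": int(v in present) for v in values})
    return rows


def combine(instance, feat_a, feat_b):
    """Build one new symbolic feature by concatenating two existing ones."""
    return f"{instance[feat_a]}+{instance[feat_b]}"


# Toy instances (hypothetical values): syntactic function and preposition
# of the sibling phrase in focus; "_" marks the absence of a preposition.
instances = [
    {"function": "CC", "preposition": "en"},
    {"function": "SUJ", "preposition": "_"},
    {"function": "CD|CI", "preposition": "a"},
]

binary_rows = binarize(instances, "function")
combined = [combine(inst, "function", "preposition") for inst in instances]
```

In a memory-based (nearest-neighbour) classifier, a combined feature matches only when both of its components match, which is one plausible reason why pairing two informative features could help; the sketch does not model the classifier itself.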