PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

XML Structure Mapping: Application to the PASCAL/INEX 2006 XML Document Mining Track
Francis Maes, Ludovic Denoyer and Patrick Gallinari
In: INEX 2007 (2007).


We address the problem of learning to automatically map flat and semi-structured documents onto a mediated target XML schema. We propose a machine learning approach where the mapping between input and target documents is learned from examples. Complex transformations can be learned using only pairs of input and corresponding target documents. The model sequentially builds the target XML document by processing the input document node by node. We demonstrate the efficiency of our model on two structure mapping tasks that were introduced during the second year of the XML Mining Challenge at INEX 2006 with the goal of learning how to deal with XML heterogeneity.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
Information Retrieval & Textual Information Access
ID Code: 3661
Deposited By: Ludovic Denoyer
Deposited On: 14 February 2008