PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Cross-Language Retrieval Using Link-Based Language Models
Benjamin Roth and Dietrich Klakow
In: SIGIR 2010 (2010).

Abstract

We propose a cross-language retrieval model that is based solely on Wikipedia as a training corpus. The main contributions of our work are: (1) a translation model based on linked text in Wikipedia, together with an associated term weighting method; and (2) a combination scheme that interpolates the link translation model with retrieval based on Latent Dirichlet Allocation. On the CLEF 2000 data, we achieve an improvement over the best German-English system in the bilingual track (not statistically significant) and an improvement over a machine translation baseline (statistically significant).

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 8859
Deposited By: Grzegorz Chrupala
Deposited On: 21 February 2012