PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Spanning spaces: learning cross-lingual similarities
Jan Rupnik, Andrej Muhič and Primož Škraba
In: NIPS 2011, 16-17 Dec 2011, Sierra Nevada, Spain.


When analyzing multilingual text corpora, a practical problem is computing similarities between documents written in different languages. Given two documents in different languages, we use each document's monolingual similarity to an aligned set to compute a similarity across languages. We derive several algorithms and show how they relate to the choice of similarity function. We also present experimental results illustrating the approach.
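
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes cosine similarity as the monolingual similarity function, bag-of-words vectors as document representations, and a simple cosine comparison of the two resulting similarity profiles. The function names `cosine_rows` and `cross_lingual_similarity` and the toy matrices are hypothetical and chosen only for illustration.

```python
import numpy as np


def cosine_rows(query, matrix):
    """Cosine similarity between a 1-D query vector and each row of a matrix."""
    query = np.asarray(query, dtype=float)
    matrix = np.asarray(matrix, dtype=float)
    dots = matrix @ query
    denom = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    # Rows (or queries) with zero norm get similarity 0 instead of NaN.
    return np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)


def cross_lingual_similarity(doc_a, doc_b, aligned_a, aligned_b):
    """Compare two documents in different languages via an aligned set.

    aligned_a[i] and aligned_b[i] are assumed to be the same document
    written in language A and language B respectively.
    """
    # Monolingual step: each document is compared only within its own language.
    profile_a = cosine_rows(doc_a, aligned_a)
    profile_b = cosine_rows(doc_b, aligned_b)
    # Cross-lingual step (an assumption of this sketch): cosine of the profiles.
    denom = np.linalg.norm(profile_a) * np.linalg.norm(profile_b)
    return float(profile_a @ profile_b / denom) if denom > 0 else 0.0


# Toy aligned set: 3 document pairs, 4 terms in language A, 5 in language B.
aligned_a = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]])
aligned_b = np.array([[1, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 1]])

matched = cross_lingual_similarity(aligned_a[0], aligned_b[0], aligned_a, aligned_b)
mismatched = cross_lingual_similarity(aligned_a[0], aligned_b[2], aligned_a, aligned_b)
print(matched, mismatched)
```

In this toy setup the two vocabularies never interact directly; the documents are compared only through their similarity profiles over the aligned set, so a pair that resembles the same aligned documents scores higher than a pair that does not. Other choices of monolingual similarity or of profile comparison would give different cross-lingual similarities.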

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 8732
Deposited By: Jan Rupnik
Deposited On: 21 February 2012