PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Hierarchical Dirichlet Trees for Information Retrieval
Yee Whye Teh and Gholamreza Haffari
In: NAACL-HLT 2009, 31 May - 05 Jun 2009, Colorado, USA.

Abstract

We propose a probabilistic framework that uses trees over the vocabulary to capture similarities among terms in an information retrieval setting. This allows documents to be retrieved based not just on occurrences of specific query terms, but also on similarities between terms. Additionally, our generative model exhibits an effect similar to inverse document frequency. Experimentally, the resulting model produces improved retrieval results.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access
ID Code: 4693
Deposited By: Yee Whye Teh
Deposited On: 24 March 2009