PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A Hierarchical Pitman-Yor Language Model for Information Retrieval
Saeedeh Momtazi and Dietrich Klakow
In: SIGIR 2010, 19 July - 23 July 2010, Geneva, Switzerland.

Abstract

In this paper, we propose a new application of a Bayesian language model based on the Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distribution. The Pitman-Yor process creates a power-law distribution, which is one of the statistical properties of word frequency in natural language. Our experiments on Robust04 indicate that this model improves document retrieval performance compared to the commonly used Dirichlet prior and absolute discounting smoothing techniques.
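
For orientation, the two baseline smoothing techniques named in the abstract can be sketched in their standard textbook forms. This is a minimal illustration, not the paper's Pitman-Yor model: the function names, the toy documents, and the parameter values (`mu=2000.0`, `delta=0.7`) are assumptions chosen for the example and do not come from the paper.

```python
from collections import Counter


def dirichlet_prior(word, doc_counts, doc_len, p_coll, mu=2000.0):
    """Dirichlet prior smoothing: (c(w, d) + mu * p(w | C)) / (|d| + mu)."""
    return (doc_counts.get(word, 0) + mu * p_coll[word]) / (doc_len + mu)


def absolute_discounting(word, doc_counts, doc_len, p_coll, delta=0.7):
    """Absolute discounting: subtract delta from each seen count and
    give the freed probability mass to the collection model."""
    discounted = max(doc_counts.get(word, 0) - delta, 0.0) / doc_len
    # len(doc_counts) is the number of distinct terms in the document.
    backoff_mass = delta * len(doc_counts) / doc_len
    return discounted + backoff_mass * p_coll[word]


# Hypothetical toy collection, used only to exercise the functions.
documents = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
]
collection_counts = Counter(w for d in documents for w in d)
collection_len = sum(collection_counts.values())
p_coll = {w: c / collection_len for w, c in collection_counts.items()}

doc = documents[0]
doc_counts = Counter(doc)
doc_len = len(doc)

for word in ("cat", "dog"):
    print(
        word,
        dirichlet_prior(word, doc_counts, doc_len, p_coll),
        absolute_discounting(word, doc_counts, doc_len, p_coll),
    )
```

Both estimators give non-zero probability to "dog", which does not occur in the first document, by backing off to the collection model; each sums to one over the collection vocabulary.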

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 8866
Deposited By: Diana Schreyer
Deposited On: 21 February 2012