A Hierarchical Pitman-Yor Language Model for Information Retrieval
Saeedeh Momtazi and Dietrich Klakow
In: SIGIR 2010, 19 July - 23 July 2010, Geneva, Switzerland.
In this paper, we propose a new application of a Bayesian language
model based on the Pitman-Yor process to information retrieval. This
model is a generalization of the Dirichlet distribution. The
Pitman-Yor process produces a power-law distribution, which is one of
the statistical properties of word frequencies in natural language.
Our experiments on Robust04 indicate that this model improves document
retrieval performance compared to the commonly used Dirichlet prior
and absolute discounting smoothing techniques.
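
To make the relationship between these smoothing techniques concrete, the following is a minimal sketch and not the implementation used in the experiments. It assumes a single-level estimator in which each distinct word in a document is discounted once, a common approximation for Pitman-Yor predictive probabilities. The function name `smoothed_prob`, the parameter names `mu` (concentration) and `delta` (discount), and the toy document and collection model are all hypothetical. Under this assumption, setting `delta = 0` gives Dirichlet prior smoothing, and setting `mu = 0` gives absolute discounting.

```python
from collections import Counter


def smoothed_prob(word, doc_counts, collection_prob, mu=0.0, delta=0.0):
    """Smoothed estimate of P(word | document).

    Assumed form (illustrative, one discount per distinct word):
        (max(c(w, d) - delta, 0) + (mu + delta * n_distinct) * P(w | C))
        / (|d| + mu)
    Requires 0 <= delta <= 1 and |d| + mu > 0.
    """
    doc_len = sum(doc_counts.values())
    n_distinct = len(doc_counts)
    discounted = max(doc_counts.get(word, 0) - delta, 0.0)
    # Probability mass handed to the collection (background) model.
    backoff_mass = mu + delta * n_distinct
    return (discounted + backoff_mass * collection_prob[word]) / (doc_len + mu)


# Toy data (hypothetical): one short document and a background model.
doc_counts = Counter(["language", "model", "language", "retrieval"])
collection_prob = {"language": 0.4, "model": 0.3, "retrieval": 0.2, "process": 0.1}

settings = {
    "dirichlet_prior": {"mu": 2.0, "delta": 0.0},
    "absolute_discounting": {"mu": 0.0, "delta": 0.5},
    "pitman_yor_style": {"mu": 2.0, "delta": 0.5},
}

# Probability of every vocabulary word under each setting.
results = {
    name: {w: smoothed_prob(w, doc_counts, collection_prob, **params)
           for w in collection_prob}
    for name, params in settings.items()
}
```

In this sketch the discount removes a fixed amount from every observed word and moves it to the collection model, so documents with more distinct words give more weight to unseen words. The concentration parameter instead adds the same pseudo-count mass regardless of how varied the document is.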