A Hierarchical Pitman-Yor Language Model for Information Retrieval

Abstract

In this paper, we propose a new application of a Bayesian language model based on the Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distribution. The Pitman-Yor process produces a power-law distribution, which is one of the statistical properties of word frequencies in natural language. Our experiments on Robust04 indicate that this model improves document retrieval performance compared to the commonly used Dirichlet prior and absolute discounting smoothing techniques.
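To make the relationship between these smoothing techniques concrete, the following is a minimal sketch, not necessarily the exact formulation used in the paper. It assumes the standard Chinese-restaurant form of the Pitman-Yor predictive probability, with a discount parameter, a strength parameter, and per-word "table" counts. The function names, parameter defaults, toy document, and collection probabilities are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def dirichlet_prior(word, doc_counts, collection_prob, mu=1000.0):
    """Dirichlet prior smoothing: (c(w,d) + mu * p(w|C)) / (|d| + mu)."""
    doc_len = sum(doc_counts.values())
    return (doc_counts.get(word, 0) + mu * collection_prob[word]) / (doc_len + mu)


def absolute_discounting(word, doc_counts, collection_prob, delta=0.7):
    """Absolute discounting: subtract delta from each seen count and
    give the freed mass to the collection model."""
    doc_len = sum(doc_counts.values())
    unique_terms = len(doc_counts)
    discounted = max(doc_counts.get(word, 0) - delta, 0.0)
    return (discounted + delta * unique_terms * collection_prob[word]) / doc_len


def pitman_yor(word, doc_counts, table_counts, collection_prob,
               discount=0.5, strength=1000.0):
    """Pitman-Yor predictive probability in Chinese-restaurant form.

    table_counts[w] is the number of tables serving word w; here it is an
    input supplied by the caller (an assumption of this sketch), whereas a
    full model would infer it.
    """
    doc_len = sum(doc_counts.values())
    total_tables = sum(table_counts.values())
    numerator = (doc_counts.get(word, 0)
                 - discount * table_counts.get(word, 0)
                 + (strength + discount * total_tables) * collection_prob[word])
    return numerator / (strength + doc_len)


# Toy data (hypothetical): one short document and a five-word collection model.
doc_counts = Counter("language model for retrieval language model".split())
collection_prob = {"language": 0.2, "model": 0.2, "for": 0.3,
                   "retrieval": 0.1, "process": 0.2}
# Simplifying assumption: one table per distinct word type in the document.
table_counts = {w: 1 for w in doc_counts}

for w in collection_prob:
    print(w,
          dirichlet_prior(w, doc_counts, collection_prob),
          absolute_discounting(w, doc_counts, collection_prob),
          pitman_yor(w, doc_counts, table_counts, collection_prob))
```

Under these assumptions, setting the discount to 0 reduces the Pitman-Yor estimate to Dirichlet prior smoothing with `mu` equal to the strength, and setting the strength to 0 with one table per word type reduces it to absolute discounting with `delta` equal to the discount. This is one way to read the claim that the model generalizes the two baselines.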