A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction
Phil Blunsom and Trevor Cohn
In: ACL-HLT 2011, 19-24 Jun 2011, Portland, OR, USA.
In this work we address the problem of
unsupervised part-of-speech induction
by bringing together several strands of
research into a single model. We develop a
novel hidden Markov model incorporating
sophisticated smoothing using a hierarchical
Pitman-Yor process prior, providing an
elegant and principled means of incorporating
lexical characteristics. Central to our
approach is a new type-based sampling
algorithm for hierarchical Pitman-Yor models
in which we track fractional table counts.
In an empirical evaluation we show that our
model consistently outperforms the current
state-of-the-art across 10 languages.
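
To make the smoothing idea concrete, the following is a minimal sketch of the
standard Pitman-Yor predictive probability with back-off to a parent
distribution, written so that table counts may be fractional. It is an
illustration only, not the authors' implementation: the function name, the
argument names, and the toy counts are all assumptions introduced here.

```python
def pyp_predictive(count_w, tables_w, count_total, tables_total,
                   discount, strength, base_prob):
    """Predictive probability of outcome w in one Pitman-Yor restaurant.

    count_w / count_total   : customers for w / for all outcomes
    tables_w / tables_total : tables for w / for all outcomes (may be fractional)
    discount, strength      : Pitman-Yor hyperparameters (0 <= discount < 1)
    base_prob               : probability of w under the parent (back-off) level
    All names here are hypothetical, chosen for this sketch.
    """
    if count_total == 0:
        # An empty restaurant defers entirely to the parent distribution.
        return base_prob
    # Mass kept by w after discounting each of its tables.
    seated = max(count_w - discount * tables_w, 0.0)
    # Mass passed to the parent distribution.
    backoff = strength + discount * tables_total
    return (seated + backoff * base_prob) / (strength + count_total)


# Toy example (assumed numbers): three tags, uniform parent distribution.
counts = {"A": 3, "B": 7, "C": 0}
tables = {"A": 1.5, "B": 2.5, "C": 0.0}  # fractional table counts
probs = {
    tag: pyp_predictive(counts[tag], tables[tag],
                        sum(counts.values()), sum(tables.values()),
                        discount=0.5, strength=1.0, base_prob=1.0 / 3)
    for tag in counts
}
```

In a hierarchical model, base_prob would itself come from a call on the parent
restaurant (for example, a transition distribution backing off to a shorter
context), so that the unseen tag "C" above still receives probability mass
from the level above.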