Improved Unsupervised POS Induction through Prototype Discovery
Omri Abend, Roi Reichart and Ari Rappoport
We present a novel fully unsupervised algorithm
for part-of-speech (POS) induction from plain text,
motivated by the cognitive notion of prototypes.
The algorithm first identifies landmark
clusters of words, which serve as the
cores of the induced POS categories. The
remaining words are then mapped
to these clusters. We use morphological
and distributional representations
computed in a fully unsupervised manner.
We evaluate our algorithm on English and
German, achieving the best reported results
for this task.
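
The two-stage idea summarized above (landmark clusters first, then mapping of the remaining words) can be illustrated with a minimal Python sketch. This is not the procedure used in the paper: the neighbor-count representation, the greedy grouping heuristic, and the names `context_vectors`, `induce`, `n_frequent` and `threshold` are all assumptions made for illustration, and morphological features are omitted entirely.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Toy distributional representation: counts of left/right neighbours."""
    vecs = defaultdict(Counter)
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        for i in range(1, len(padded) - 1):
            vecs[padded[i]][("L", padded[i - 1])] += 1
            vecs[padded[i]][("R", padded[i + 1])] += 1
    return vecs


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(words, vecs):
    """Summed context vector of a cluster (scale does not affect cosine)."""
    total = Counter()
    for w in words:
        total.update(vecs[w])
    return total


def induce(sentences, n_frequent=4, threshold=0.5):
    vecs = context_vectors(sentences)
    freq = Counter(w for s in sentences for w in s)
    frequent = [w for w, _ in freq.most_common(n_frequent)]

    # Stage 1 (assumed heuristic): greedily group the most frequent words
    # into landmark clusters by similarity to existing cluster centroids.
    landmarks = []
    for w in frequent:
        best = max(
            landmarks,
            key=lambda c: cosine(vecs[w], centroid(c, vecs)),
            default=None,
        )
        if best is not None and cosine(vecs[w], centroid(best, vecs)) >= threshold:
            best.append(w)
        else:
            landmarks.append([w])

    # Stage 2: map every remaining word to its most similar landmark cluster.
    assignment = {w: i for i, c in enumerate(landmarks) for w in c}
    centroids = [centroid(c, vecs) for c in landmarks]
    for w in vecs:
        if w not in assignment:
            assignment[w] = max(
                range(len(centroids)),
                key=lambda i: cosine(vecs[w], centroids[i]),
            )
    return landmarks, assignment
```

Calling `induce` on a list of tokenized sentences returns the landmark clusters together with a mapping from every word to the index of a landmark cluster. In this sketch the number of induced categories is fixed by the first stage, since the second stage only assigns words to existing clusters.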