PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

A Generalized Framework for Revealing Analogous Themes across Related Topics
Zvika Marx, Ido Dagan and Eli Shamir
In: HLT/EMNLP 2005, 6-8 Oct 2005, Vancouver, B.C., Canada.

Abstract

This work addresses the task of identifying thematic correspondences across subcorpora focused on different topics. We introduce an unsupervised algorithmic framework based on distributional data clustering, which generalizes earlier preliminary work on this task. Empirical results reveal interesting commonalities across different religions. We evaluate the results by measuring the overlap of our clusters with clusters compiled manually by experts. The tested variants of our framework are shown to outperform alternative methods applicable to the task.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Computational, Information-Theoretic Learning with Statistics; Natural Language Processing
ID Code: 1774
Deposited By: Ido Dagan
Deposited On: 28 November 2005