PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Estimating Likelihoods for Topic Models
Wray Buntine
In: ACML 2009, 3-5 Nov 2009, Nanjing, China.

Abstract

Topic models are a discrete analogue to principal component analysis and independent component analysis that model topics at the word level within a document. They have many variants such as NMF, PLSI and LDA, and are used in many fields such as genetics, text and the web, image analysis and recommender systems. However, only recently have reasonable methods for estimating the likelihood of unseen documents, for instance to perform testing or model comparison, become available. This paper explores a number of recent methods, and improves their theory, performance, and testing.

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 5482
Deposited By: Wray Buntine
Deposited On: 29 October 2009