Temporal analysis of text data using latent variable models
Lasse Mølgaard, Jan Larsen and Cyril Goutte
In: 2009 IEEE International Workshop on Machine Learning for Signal Processing, 02-04 Sep 2009, Grenoble, France.
Detecting and tracking temporal data is an important task in multiple applications. In this paper we study temporal text mining methods for Music Information Retrieval. We compare two ways of detecting the temporal latent semantics of a corpus extracted from Wikipedia: a stepwise Probabilistic Latent Semantic Analysis (PLSA) approach and a global multiway PLSA method. The analysis indicates that the global method is able to identify relevant trends that are difficult to obtain with a step-by-step approach. Furthermore, we show that inspecting PLSA models with different numbers of factors may reveal the stability of temporal clusters, making it possible to choose the relevant number of factors.
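For readers unfamiliar with PLSA, the following is a minimal sketch of the standard two-way model P(d, w) = sum_z P(z) P(d|z) P(w|z), fitted with EM on a document-by-word count matrix. It is an illustration only and not the authors' implementation: the function name `plsa`, its parameters, and the toy data are assumptions introduced here, and the multiway variant studied in the paper is not shown.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit a basic PLSA model with EM (illustrative sketch, hypothetical helper).

    counts: array of shape (n_docs, n_words) with non-negative counts.
    Returns P(z) of shape (K,), P(d|z) of shape (K, D), P(w|z) of shape (K, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialisation, each row normalised to a probability distribution.
    p_z = np.full(n_topics, 1.0 / n_topics)
    p_d_z = rng.random((n_topics, n_docs))
    p_d_z /= p_d_z.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: responsibilities P(z | d, w), shape (K, D, W).
        joint = p_z[:, None, None] * p_d_z[:, :, None] * p_w_z[:, None, :]
        resp = joint / (joint.sum(axis=0, keepdims=True) + eps)

        # M-step: re-estimate the factors from expected counts.
        weighted = counts[None, :, :] * resp
        totals = weighted.sum(axis=(1, 2))
        p_d_z = weighted.sum(axis=2) / (totals[:, None] + eps)
        p_w_z = weighted.sum(axis=1) / (totals[:, None] + eps)
        p_z = totals / totals.sum()

    return p_z, p_d_z, p_w_z


# Toy data (assumed for illustration): 4 documents, 6 words, two word groups.
toy_counts = np.array([
    [5, 4, 3, 0, 0, 0],
    [4, 5, 2, 0, 1, 0],
    [0, 0, 1, 4, 5, 3],
    [0, 1, 0, 5, 4, 4],
], dtype=float)
p_z, p_d_z, p_w_z = plsa(toy_counts, n_topics=2)
```

In a stepwise setting, one would presumably fit a model like this to each time window separately and then relate the factors across windows, while changing `n_topics` corresponds to varying the number of factors; both readings are assumptions about how the sketch maps onto the abstract, not details taken from the paper.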