A Segmented Topic Model based on the Two-parameter Poisson-Dirichlet Process
Lan Du, Wray Buntine and H Jin
In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2010, Barcelona, Spain.
Conference mining has become an important problem for academic recommendation. Previous approaches mined conferences using either network connectivity or the semantics-based intrinsic structure of the words across documents (document-level (DL) modeling), while ignoring the semantics-based intrinsic structure of the words across conferences. In this paper, we address this problem by modeling at the conference level (CL), which captures the semantics-based intrinsic structure of the words present in conferences (richer semantics). We propose a generalized topic modeling approach based on Latent Dirichlet Allocation (LDA), named Conference Mining (ConMin). With it, we can discover topically related conferences, conference correlations, and temporal topic trends of conferences. Experimental results show that the proposed approach significantly outperforms the baseline approach in discovering topically related conferences and finding conference correlations, because of its ability to produce less sparse topics.