PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Topic-specific link analysis using independent components for information retrieval
Wray Buntine, Jaakko Löfström, Sami Perttu and Kimmo Valtonen
In: AAAI-05 Workshop: Link Analysis, 9-10 July 2005, Pittsburgh, Pennsylvania.


Semantic component analysis (LSA, PLSA, discrete PCA, etc.) has had mixed success in information retrieval. Previous experiments have shown that a high-fidelity language model does not imply good-quality retrieval. Here we combine link analysis with discrete PCA (a semantic component method) to develop an auxiliary score for information retrieval, which is used to post-filter documents retrieved via regular Tf.Idf methods. To do this, we use a topic-specific version of link analysis in which the topics are developed automatically via discrete PCA. To evaluate the resulting topic- and link-based scoring, a demonstration has been built using Wikipedia, the public domain encyclopedia on the web.
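The abstract does not spell out the scoring itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that topic-specific link analysis can be approximated by a personalized PageRank whose teleport distribution follows each document's weight on one topic (standing in for a discrete PCA component), and that post-filtering means re-ranking Tf.Idf candidates by a weighted blend of the two scores. The function names, the damping factor, the blend weight `alpha`, and the toy data are all hypothetical.

```python
def topic_pagerank(links, topic_weight, damping=0.85, iters=100):
    """Personalized PageRank whose teleport step favours on-topic documents.

    links: dict mapping a document id to the list of ids it links to.
    topic_weight: dict mapping a document id to its (assumed) weight on one
    topic, e.g. a component proportion from a topic model.
    """
    docs = sorted(set(links) | {t for outs in links.values() for t in outs})
    total = sum(topic_weight.get(d, 0.0) for d in docs)
    if total <= 0:
        raise ValueError("topic has no weight on any document")
    teleport = {d: topic_weight.get(d, 0.0) / total for d in docs}

    rank = dict(teleport)  # start from the topic distribution
    for _ in range(iters):
        flow = {d: 0.0 for d in docs}
        dangling = 0.0  # rank held by documents without out-links
        for d in docs:
            outs = links.get(d, [])
            if outs:
                share = rank[d] / len(outs)
                for t in outs:
                    flow[t] += share
            else:
                dangling += rank[d]
        # Dangling mass is redistributed according to the topic distribution.
        rank = {
            d: (1 - damping) * teleport[d]
            + damping * (flow[d] + dangling * teleport[d])
            for d in docs
        }
    return rank


def rerank(candidates, tfidf_score, link_score, alpha=0.5):
    """Re-order Tf.Idf candidates by a blend of Tf.Idf and link scores.

    alpha = 0 keeps the pure Tf.Idf order; alpha = 1 uses only link scores.
    Both scores are scaled by their maximum over the candidate set.
    """
    if not candidates:
        return []
    max_tfidf = max(tfidf_score[d] for d in candidates) or 1.0
    max_link = max(link_score.get(d, 0.0) for d in candidates) or 1.0

    def combined(d):
        return ((1 - alpha) * tfidf_score[d] / max_tfidf
                + alpha * link_score.get(d, 0.0) / max_link)

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical toy data: four pages, one topic concentrated on page "c".
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
topic_weight = {"a": 0.1, "b": 0.1, "c": 0.7, "d": 0.1}
tfidf = {"a": 1.0, "b": 1.0, "d": 1.2}

scores = topic_pagerank(links, topic_weight)
reranked = rerank(["a", "b", "d"], tfidf, scores, alpha=0.5)
```

In this sketch the link score acts purely as a second-stage signal: the candidate set is still determined by Tf.Idf, and `alpha` controls how far the topic-specific link score may reorder it.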

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation; Information Retrieval & Textual Information Access
ID Code: 990
Deposited By: Wray Buntine
Deposited On: 19 June 2005