Topic-specific link analysis using independent components for information retrieval
Wray Buntine, Jaakko Löfström, Sami Perttu and Kimmo Valtonen
In: AAAI-05 Workshop: Link Analysis, 9-10 July 2005, Pittsburgh, Pennsylvania.
There has been mixed success in applying semantic component analysis
(LSA, PLSA, discrete PCA, etc.) to information retrieval. Previous
experiments have shown that high-fidelity language models do not imply
good-quality retrieval. Here we combine link analysis with discrete
PCA (a semantic component method) to develop an auxiliary score for
information retrieval that is used in post-filtering documents
retrieved via regular Tf.Idf methods. For this, we use a
topic-specific version of link analysis based on topics developed
automatically via discrete PCA methods. To evaluate the resulting
topic- and link-based scoring, a demonstration has been built using
Wikipedia, the public domain encyclopedia on the web.
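
As a rough illustration of the pipeline described in the abstract, the sketch below reranks Tf.Idf hits with a topic-weighted link score. It is not the authors' scoring function: it substitutes standard personalized PageRank for the topic-specific link analysis, and it assumes the per-document topic proportions (doc_topics) and the query's topic weights (query_topics) have already been produced by some discrete PCA model. The function names, the damping value, and the linear mixing weight are all hypothetical.

```python
def topic_pagerank(links, topic_weight, damping=0.85, iters=50):
    """Personalized PageRank whose teleport step favors on-topic pages.

    links: dict mapping each page to the list of pages it links to
           (every link target must also appear as a key).
    topic_weight: dict mapping each page to its non-negative weight
                  for one topic (assumed to come from discrete PCA).
    """
    nodes = list(links)
    total = sum(topic_weight[n] for n in nodes)
    teleport = {n: topic_weight[n] / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iters):
        new = {n: (1 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            out = links[src]
            if out:
                share = damping * rank[src] / len(out)
                for dst in out:
                    new[dst] += share
            else:
                # Dangling page: spread its mass over the teleport distribution.
                for n in nodes:
                    new[n] += damping * rank[src] * teleport[n]
        rank = new
    return rank


def rerank(tfidf_scores, query_topics, doc_topics, links, mix=0.5):
    """Post-filter Tf.Idf hits with an auxiliary topic-and-link score.

    tfidf_scores: dict of retrieved document -> Tf.Idf score.
    query_topics: dict of topic id -> weight of that topic in the query.
    doc_topics:   dict of document -> {topic id: proportion}.
    mix:          hypothetical weight given to the auxiliary score.
    """
    aux = {d: 0.0 for d in links}
    for topic, q_weight in query_topics.items():
        weights = {d: doc_topics[d].get(topic, 0.0) for d in links}
        if sum(weights.values()) == 0:
            continue  # no page carries this topic
        pr = topic_pagerank(links, weights)
        for d in links:
            aux[d] += q_weight * pr[d]
    combined = {d: (1 - mix) * s + mix * aux[d] for d, s in tfidf_scores.items()}
    return sorted(combined, key=combined.get, reverse=True)
```

With two documents tied on Tf.Idf, the one that on-topic pages link to is ranked first. In practice the two scores sit on different scales, so the simple linear mix here would need normalizing or tuning.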