Comparison of Independent Component Analysis and Singular Value Decomposition in Word Context Analysis
In earlier studies we have shown that independent component analysis (ICA) can automatically extract meaningful linguistic features. These emergent syntactic and semantic features are based on an analysis of words in their contexts in large corpora. We have also shown that there is a reasonably strong correlation between the emergent features and the traditional features and categories defined by linguists. In this article, we introduce a new measure for comparing the emergent features with the traditionally defined ones. We apply this measure to compare the emergent features produced by singular value decomposition (SVD) with those produced by ICA. We conclude that the ICA-based features correspond to human intuitions much more closely than the SVD-based features, not only in a visual inspection but also in a systematic and principled comparison.
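
As a rough illustration of what "words in their contexts" can mean in practice, the following minimal sketch builds a small word-by-context count matrix of the kind that SVD or ICA could then be applied to. The toy sentence, the window size of one, the choice of context words, and the function names `context_counts` and `to_matrix` are all assumptions made for this example; they are not taken from the experiments summarized here.

```python
from collections import Counter, defaultdict


def context_counts(tokens, context_words, window=1):
    """Count how often each context word occurs within `window`
    positions of every token in the sequence (hypothetical helper)."""
    context_set = set(context_words)
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            # Skip the focus word itself; count only chosen context words.
            if j != i and tokens[j] in context_set:
                counts[word][tokens[j]] += 1
    return counts


def to_matrix(counts, words, context_words):
    """Arrange the counts as rows (analyzed words) by columns (context words)."""
    return [[counts[w][c] for c in context_words] for w in words]


# Toy corpus, assumed purely for illustration.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
context_words = ["the", "sat", "on"]
words = ["cat", "dog", "mat"]

matrix = to_matrix(context_counts(tokens, context_words), words, context_words)
for word, row in zip(words, matrix):
    print(word, row)
```

In this toy data, "cat" and "dog" end up with identical context rows, which is the kind of distributional similarity a decomposition method can pick up. With a real corpus the matrix would be far larger, and its rows would be passed to an SVD or ICA routine to obtain the emergent features.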