PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Dependency detection with similarity constraints
Leo Lahti, Samuel Myllykangas, Sakari Knuutila and Samuel Kaski
In: Proc. MLSP 2009, IEEE International Workshop on Machine Learning for Signal Processing (2009), IEEE, pp. 89-94.

Abstract

Unsupervised two-view learning, or detection of dependencies between two paired data sets, is typically done by some variant of canonical correlation analysis (CCA). CCA searches for a linear projection for each view, such that the correlation between the projections is maximized. The solution is invariant to any invertible linear transformation of either or both of the views; for tasks with small sample sizes such flexibility implies overfitting, which is even worse for more flexible nonparametric or kernel-based dependency discovery methods. We develop variants which reduce the degrees of freedom by assuming constraints on the similarity of the projections in the two views. A particular example is provided by a cancer gene discovery application where chromosomal distance affects the dependencies between gene copy number and activity levels. Similarity constraints are shown to improve the detection of known cancer genes.
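
The two properties of ordinary CCA described in the abstract (projections chosen to maximize correlation, and invariance of the solution to invertible linear transformations of a view) can be illustrated with a minimal NumPy sketch. This is textbook unconstrained CCA, not the similarity-constrained variants developed in the paper. The function name `cca_first_pair`, the small ridge term `reg`, and the synthetic data are assumptions made only for this illustration.

```python
import numpy as np

def cca_first_pair(X, Y, reg=1e-8):
    """First canonical correlation and projection vectors for paired views.

    X and Y hold one sample per row. `reg` is a small ridge term added for
    numerical stability (an assumption of this sketch, not part of CCA).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with its Cholesky factor: K = Lx^{-1} Cxy Ly^{-T}
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    K = np.linalg.solve(Lx, Cxy)
    K = np.linalg.solve(Ly, K.T).T
    # Singular values of K are the canonical correlations
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    # Map the leading singular vectors back to the original coordinates
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return s[0], wx, wy

# Synthetic paired views sharing one latent signal z (illustrative data only)
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))
X = z @ rng.normal(size=(1, 3)) + 0.5 * rng.normal(size=(n, 3))
Y = z @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(n, 4))

rho, wx, wy = cca_first_pair(X, Y)

# Correlation actually achieved by the returned projections
achieved = np.corrcoef(X @ wx, Y @ wy)[0, 1]

# Apply an invertible linear map to the first view: the leading
# canonical correlation should be unchanged up to numerical error
A = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)
rho_transformed, _, _ = cca_first_pair(X @ A, Y)

print(rho, achieved, rho_transformed)
```

Because the leading correlation is unchanged when a view is multiplied by any invertible matrix, the unconstrained problem has many free parameters relative to a small sample, which is the flexibility that the similarity constraints are meant to reduce.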

EPrint Type: Book Section
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 6292
Deposited By: Samuel Kaski
Deposited On: 08 March 2010