Matrix centering and low-rank approximation
Low-rank approximation, better known in the machine learning literature as principal component analysis, is a general data modeling tool. Data centering, on the other hand, is a common preprocessing step. This paper studies the combination of low-rank approximation with data centering. Three types of matrix means and corresponding centering methods are distinguished: column, row, and total centering. We prove that the two-stage procedure of 1) computing the column (or row) mean of the data and 2) computing the low-rank approximation of the centered data matrix yields a solution of the original problem of simultaneous column (or row) centering and low-rank approximation. The same is not true, however, in the case of total centering. Two local optimization methods are proposed for this case. The first performs a one-dimensional search, and the second is an alternating least squares type algorithm. Both algorithms can be used for weighted and structured low-rank approximation problems with total centering.
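The two-stage procedure for column centering can be illustrated with a minimal NumPy sketch. It is not the paper's implementation, and it rests on assumptions that the abstract does not state: data points are stored as the columns of D, the "column mean" is the average of those columns, the approximation error is measured in the Frobenius norm, and the low-rank approximation is computed by a truncated SVD. The function names `truncated_svd`, `center_then_approximate`, and `cost` are hypothetical.

```python
import numpy as np


def truncated_svd(A, r):
    """Best rank-r approximation of A in the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]


def center_then_approximate(D, r):
    """Stage 1: subtract the mean of the columns. Stage 2: truncated SVD."""
    # Assumption: data points are the columns of D, so the mean is a q x 1 vector.
    c = D.mean(axis=1, keepdims=True)
    D_hat = truncated_svd(D - c, r)
    return c, D_hat


def cost(D, c, r):
    """Approximation error for a given offset c, with the optimal rank-r fit of D - c."""
    R = D - c
    return np.linalg.norm(R - truncated_svd(R, r), "fro")


# Synthetic data with a nonzero offset (illustrative only).
rng = np.random.default_rng(0)
D = rng.standard_normal((5, 40)) + 3.0

c, D_hat = center_then_approximate(D, r=2)
base = cost(D, c, 2)

# Randomly perturbed offsets should never give a smaller error than the mean.
perturbed = [cost(D, c + 0.1 * rng.standard_normal(c.shape), 2) for _ in range(20)]
```

Comparing `base` with the entries of `perturbed` gives a quick numerical sanity check of the claim for column centering. It is not a proof, and it does not address total centering, for which the abstract states that the two-stage procedure is not optimal.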