Analysis of High-Dimensional Data with Partial Least Squares and Boosting
PhD thesis, TU Berlin.
The crucial task in the statistical analysis of high-dimensional data is to model relationships between a large number p of variables based on a small number n of observations. The high dimensionality of the data often forms an obstacle: for p > n, traditional statistical techniques fail to produce satisfactory results. Furthermore, the structure of the data can be complex. In this work, we investigate high-dimensional and complex data with the help of two methods: Partial Least Squares and Boosting for functional data.
Partial Least Squares (PLS) models the relationship between different blocks of variables in terms of so-called latent variables. In the case of more than two blocks, the PLS techniques are also called path models and can be seen as a generalization of Canonical Correlation Analysis. The mathematical properties of PLS are for the most part not yet established. For example, it is not known whether the PLS algorithms converge numerically, nor, if they do converge, whether they produce solutions of a sensible optimization criterion. In this work, we establish a sound mathematical framework for the description of PLS path models. We show that for a large part of the PLS algorithms, there is indeed no twice-differentiable optimization problem that they solve. Furthermore, we show on simulated data that another part of the PLS algorithms can converge to a merely local solution of an optimization problem.
PLS can also be used to solve regression problems. In this case, it leads to a substantial reduction of the dimension of the data, which, ideally, leads to better prediction rules. In this work, we present an extension of PLS using penalization techniques. This method is then used to estimate generalized additive models (GAMs). This approach turns out to be a good alternative to traditional GAM methods in the case of high-dimensional data. Based on the well-known relationship between PLS and the conjugate gradient technique, we prove that penalized PLS is equal to a preconditioned conjugate gradient technique. Subsequently, we exploit the connections between PLS and linear algebra to investigate empirically the so-called shrinkage properties of PLS. In addition, we derive an unbiased estimate of the degrees of freedom of PLS.
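The relationship between PLS and the conjugate gradient technique mentioned above can be checked numerically on a small example. The following is a minimal sketch, not the implementation used in the thesis: it covers only unpenalized PLS with a single response, and the helper names (pls1_fitted, cg_coefficients, max_gap), the simulated data, and the chosen sizes are all assumptions made for illustration. It compares the fitted values of PLS with m latent components against those obtained from m conjugate gradient steps on the normal equations, started at zero.

```python
import random

def matvec(A, v):
    """Return A v for a matrix A stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def tmatvec(A, v):
    """Return A^T v."""
    return [sum(row[j] * vi for row, vi in zip(A, v)) for j in range(len(A[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fitted(X, y, m):
    """Fitted values of NIPALS-style PLS1 with m latent components."""
    Xk = [row[:] for row in X]              # deflated copy of X
    fitted = [0.0] * len(y)
    for _ in range(m):
        w = tmatvec(Xk, y)                  # weight vector (left unnormalized)
        t = matvec(Xk, w)                   # latent component
        tt = dot(t, t)
        p = [v / tt for v in tmatvec(Xk, t)]    # loading vector
        q = dot(t, y) / tt                  # regression of y on the component
        fitted = [f + q * ti for f, ti in zip(fitted, t)]
        # Deflate X so that later components are orthogonal to t.
        Xk = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(Xk, t)]
    return fitted

def cg_coefficients(X, y, m):
    """m conjugate gradient steps on X^T X b = X^T y, started at b = 0."""
    b = [0.0] * len(X[0])
    r = tmatvec(X, y)                       # residual of the normal equations
    d = r[:]                                # first search direction
    for _ in range(m):
        Ad = tmatvec(X, matvec(X, d))
        alpha = dot(r, r) / dot(d, Ad)
        b = [bi + alpha * di for bi, di in zip(b, d)]
        r_new = [ri - alpha * ai for ri, ai in zip(r, Ad)]
        beta = dot(r_new, r_new) / dot(r, r)
        d = [rn + beta * di for rn, di in zip(r_new, d)]
        r = r_new
    return b

def max_gap(n=20, p=5, m=3, seed=0):
    """Largest difference between PLS and CG fitted values on simulated data."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(row) + rng.gauss(0, 0.5) for row in X]
    # Center the columns of X and the response, as is customary for PLS.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    X = [[x - mu for x, mu in zip(row, means)] for row in X]
    ybar = sum(y) / n
    y = [v - ybar for v in y]
    pls = pls1_fitted(X, y, m)
    cg = matvec(X, cg_coefficients(X, y, m))
    return max(abs(a - b) for a, b in zip(pls, cg))

gap = max_gap()
print(gap)
```

The two fits should agree up to rounding error, because both minimize the residual sum of squares over the same Krylov subspace generated by X^T X and X^T y. Comparing fitted values rather than coefficients keeps the sketch free of matrix inversions.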
Boosting has its roots in the machine learning community. The basic idea is to combine several simple models in such a way that their combination leads to better prediction rules. In this work, we develop Boosting algorithms for complex data structures. Our focus is on data that are (discrete) measurements of curves. The established Boosting methods implicitly assume that the observed variables lie in a finite-dimensional vector space. We show that an extension of Boosting to infinite-dimensional function spaces is straightforward. Furthermore, we illustrate how to detect relevant features of the investigated functions and how to produce simple and interpretable models. This is done by applying wavelet or Fourier transformations to the data and by then applying suitable Boosting algorithms.
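The idea of transforming the curves and then boosting on the transformed data can be sketched as follows. This is an illustration under stated assumptions, not the algorithm developed in the thesis: it uses componentwise L2Boosting as one possible "suitable" Boosting algorithm, cosine coefficients as the Fourier features, and simulated curves in which only the second frequency influences the response. The function names (fourier_features, l2_boost, simulate), the step size nu, and the number of steps are hypothetical choices.

```python
import math
import random

def fourier_features(curve, n_freq):
    """Cosine coefficients of a curve sampled on an equidistant grid over [0, 1)."""
    T = len(curve)
    return [2.0 / T * sum(x * math.cos(2 * math.pi * k * j / T)
                          for j, x in enumerate(curve))
            for k in range(1, n_freq + 1)]

def l2_boost(F, y, steps=100, nu=0.1):
    """Componentwise L2Boosting on centered features.

    Returns the coefficient vector and the final mean squared residual.
    """
    n, p = len(F), len(F[0])
    offset = sum(y) / n
    resid = [v - offset for v in y]
    cols = []
    for j in range(p):
        c = [row[j] for row in F]
        mu = sum(c) / n
        cols.append([v - mu for v in c])    # center each feature
    coef = [0.0] * p
    for _ in range(steps):
        best_j, best_g, best_gain = 0, 0.0, -1.0
        for j, c in enumerate(cols):
            cc = sum(v * v for v in c)
            cr = sum(v * r for v, r in zip(c, resid))
            gain = cr * cr / cc             # reduction of the residual sum of squares
            if gain > best_gain:
                best_j, best_g, best_gain = j, cr / cc, gain
        # Take a small step in the direction of the best single feature.
        coef[best_j] += nu * best_g
        resid = [r - nu * best_g * v for r, v in zip(resid, cols[best_j])]
    mse = sum(r * r for r in resid) / n
    return coef, mse

def simulate(n=50, T=32, n_freq=4, seed=1):
    """Simulated curves; only the second frequency drives the response."""
    rng = random.Random(seed)
    curves, y = [], []
    for _ in range(n):
        a = [rng.gauss(0, 1) for _ in range(n_freq)]
        curve = [sum(a[k] * math.cos(2 * math.pi * (k + 1) * j / T)
                     for k in range(n_freq))
                 for j in range(T)]
        curves.append(curve)
        y.append(2.0 * a[1] + rng.gauss(0, 0.1))
    return curves, y

curves, y = simulate()
F = [fourier_features(c, 4) for c in curves]
coef, mse = l2_boost(F, y)
print([round(c, 2) for c in coef], round(mse, 4))
```

Because each step updates a single coefficient, the fitted model stays sparse in the transformed domain, and the frequencies that are selected most often indicate which features of the curves are relevant. In this simulation the coefficient of the second frequency should dominate.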