New Methods in Machine Learning: Theory and Applications
For a dozen years, the French machine learning community has met at the annual "Conférence francophone sur l'Apprentissage Automatique" (CAp). Besides the well-renowned international events on the topic, this conference is the main venue for French-speaking researchers to share their contributions in the field of machine learning. The liveliness of the community and its participation in various prominent initiatives explain why CAp is fortunate to receive high-quality papers, some of which have also been presented and published at some of the most highly rated international meetings.

This book, written entirely in English, contains extended versions of the most appealing papers of the 2005 edition of CAp, held in Nice, France. These high-quality contributions show the wide diversity of the topics addressed by CAp researchers; among the subjects covered one can find:

(a) a result on the identifiability of the Naive Bayes classifier from asymmetrical semi-supervised data, i.e., data consisting only of positive and unlabeled examples; the usefulness of this result is illustrated through simulations on the prediction of disulfide connectivity in proteins;

(b) a Bayesian approach to the clustering of short time series, based on an original graphical model close to HMMs within which prior knowledge can be smoothly incorporated;

(c) a covering number-based theoretical analysis of the learning of Bayesian networks, providing a new learning strategy for the parameters and a novel structural complexity measure;

(d) Kernel Basis Pursuit, a new kernel method for regression; this method makes it possible to automatically leverage the availability of different kernels, while optimizing a sound criterion balancing goodness-of-fit and model complexity;

(e) a comparative study of learning techniques in grammatical inference, namely the classical techniques used to infer regular grammars and those devoted to the inference of categorial grammars; and

(f) insights and theoretical results from statistical learning theory in the context of genetic programming, which specifically highlight the influence of bloat on universal consistency.