PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Universal Consistency and Bloat in GP: Some theoretical considerations about Genetic Programming from a Statistical Learning Theory viewpoint
Sylvain Gelly, Olivier Teytaud, Nicolas Bredeche and Marc Schoenauer
RIA Number 20, pp. 805-827, 2005.


In this paper, we analyze Genetic Programming (GP) from the viewpoint of Statistical Learning Theory, in the setting of symbolic regression. First, we study Universal Consistency, i.e. the property that the error of the solution minimizing the empirical error converges to the best possible error as the number of examples goes to infinity. Second, we examine the uncontrolled growth of program length (i.e. bloat), a well-known problem in GP. The results show that (1) several kinds of code bloat can be identified and (2) universal consistency can be obtained while avoiding bloat under some conditions. We conclude by describing an ad hoc method that simultaneously avoids bloat and ensures universal consistency.
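
For readers unfamiliar with the term, universal consistency can be written out as below. This is the standard textbook formulation, given only as a reading aid; it is not taken from the paper, and the symbols (the loss \ell, the risks L and \hat L_n, the optimum L^*, the estimator \hat f_n and the search class \mathcal{F}_n) are notation assumed here for illustration.

```latex
% Assumed notation, not from the paper. Requires amsmath and amssymb.
%   (X_1,Y_1),...,(X_n,Y_n): i.i.d. examples from an unknown distribution P
%   \ell: loss function; \mathcal{F}_n: class of candidate programs searched
%   with n examples (it may depend on n, e.g. through a bound on program length)
\[
  L(f) = \mathbb{E}_{(X,Y)\sim P}\,\ell\bigl(f(X),Y\bigr), \qquad
  \hat L_n(f) = \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(X_i),Y_i\bigr), \qquad
  L^* = \inf_{f} L(f)
\]
% Empirical error minimizer, and the consistency requirement:
\[
  \hat f_n \in \operatorname*{arg\,min}_{f \in \mathcal{F}_n} \hat L_n(f),
  \qquad
  L(\hat f_n) \xrightarrow[n \to \infty]{} L^*
  \quad \text{for every distribution } P
\]
```

The convergence is usually understood in probability (weak consistency) or almost surely (strong consistency); the word "universal" refers to the requirement that it hold for every distribution P.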

EPrint Type: Article
Subjects: Learning/Statistics & Optimisation; Theory & Algorithms
ID Code: 2418
Deposited By: Sylvain Gelly
Deposited On: 22 November 2006