PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Toward behavioral modeling of a grid system: mining the logging and bookkeeping files
Xiangliang Zhang, Michele Sebag and Cecile Germain
In: IEEE ICDM 2007 workshop - DSMM 07, Oct 28-31, 2007, Omaha, Nebraska, USA.

Abstract

Grid systems are complex heterogeneous systems, and modeling them is a highly challenging goal. This paper models the jobs handled by the EGEE grid by mining the Logging and Bookkeeping files. The goal is to discover meaningful job clusters, going beyond the coarse categories of "successfully terminated jobs" and "other jobs". The presented approach is a three-step process: i) data slicing is used to alleviate job heterogeneity and enable discriminant learning; ii) constructive induction proceeds by learning discriminant hypotheses from each data slice; iii) double clustering is applied to the representation built by constructive induction, and the clusters are fully validated using the stability criteria proposed by Meila (2006). Lastly, the job clusters are submitted to the experts, and some meaningful interpretations are found.
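
The three-step process summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy job records, the slice identifiers, the difference-of-class-means discriminant standing in for the learned hypotheses, and a plain k-means standing in for the double clustering step, which the abstract does not detail. The stability validation after Meila (2006) is not reproduced.

```python
import random
from collections import defaultdict


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def learn_hypothesis(slice_jobs):
    """Stand-in discriminant (assumption): difference of class means.

    Returns (w, b), or None if the slice lacks one of the two classes.
    """
    ok = [x for x, y in slice_jobs if y == 1]
    other = [x for x, y in slice_jobs if y == 0]
    if not ok or not other:
        return None
    m_ok, m_other = mean(ok), mean(other)
    w = [a - b for a, b in zip(m_ok, m_other)]
    # Place the decision boundary at the midpoint between the class means.
    b = -sum(wi * (a + c) / 2 for wi, a, c in zip(w, m_ok, m_other))
    return w, b


def score(hypothesis, x):
    """Signed score of job x under one linear hypothesis."""
    w, b = hypothesis
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - c) ** 2 for a, c in zip(point, centers[j])),
    )


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means, used here only as a placeholder clustering step."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center when a cluster receives no points.
        centers = [mean(groups[j]) if groups[j] else centers[j] for j in range(k)]
    return [nearest(p, centers) for p in points]


def pipeline(jobs, k):
    # Step i: data slicing, grouping jobs by a (hypothetical) slice id.
    slices = defaultdict(list)
    for slice_id, x, y in jobs:
        slices[slice_id].append((x, y))
    # Step ii: constructive induction, one hypothesis per usable slice.
    hypotheses = [h for _, s in sorted(slices.items())
                  if (h := learn_hypothesis(s)) is not None]
    # Each job is re-described by its score under every hypothesis.
    features = [[score(h, x) for h in hypotheses] for _, x, _ in jobs]
    # Step iii: clustering on the constructed representation.
    return features, kmeans(features, k)


# Toy records (invented): (slice id, two numeric attributes, 1 = success).
jobs = [
    ("s1", [1.0, 0.2], 1), ("s1", [0.9, 0.1], 1),
    ("s1", [0.1, 0.8], 0), ("s1", [0.2, 0.9], 0),
    ("s2", [0.8, 0.3], 1), ("s2", [1.1, 0.0], 1),
    ("s2", [0.0, 1.0], 0), ("s2", [0.3, 0.7], 0),
]
features, clusters = pipeline(jobs, k=2)
print(features)
print(clusters)
```

In this sketch each job ends up described by one score per slice-level hypothesis, which is the sense in which the representation is "constructed" before clustering; a real study would replace both stand-ins with the learners and clustering procedure described in the paper.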

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 3684
Deposited By: Xiangliang Zhang
Deposited On: 14 February 2008