PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Generation of incomplete test-data using Bayesian networks
Olivier François and Philippe Leray
In: IEEE IJCNN, International Joint Conference on Neural Networks, Orlando, USA (2007).


We introduce a new method, based on the Bayesian network formalism, for automatically generating incomplete datasets. The method can be configured either randomly, to generate varied datasets with a given global percentage of missing data, or manually, to control many parameters. [1] proposed three types of missing data: MCAR (missing completely at random), MAR (missing at random) and NMAR (not missing at random). The proposed approach can successfully generate all MCAR data mechanisms and most MAR data mechanisms. NMAR data generation is very difficult to manage automatically, but we propose some hints that cover some NMAR situations.
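
To make the three missing-data types concrete, the following is a minimal illustrative sketch in Python. It is not the Bayesian-network method described in the abstract. It assumes a toy dataset of binary pairs (x, y) in which only y can go missing, and it uses the usual reading of the three terms: under MCAR the probability of a missing value does not depend on the data, under MAR it depends only on observed values (here x), and under NMAR it depends on the value being hidden (here y). The function name make_incomplete, the rate parameter and the specific probabilities are hypothetical choices made for illustration.

```python
import random


def make_incomplete(rows, mechanism, rate=0.3, seed=0):
    """Return a copy of rows, a list of (x, y) pairs, with some y set to None.

    Hypothetical helper for illustration only; x is always left observed.
    """
    rng = random.Random(seed)
    out = []
    for x, y in rows:
        if mechanism == "MCAR":
            # Missingness is independent of both x and y.
            p = rate
        elif mechanism == "MAR":
            # Missingness depends only on the always-observed variable x.
            p = 2 * rate if x == 1 else 0.0
        elif mechanism == "NMAR":
            # Missingness depends on the very value that gets hidden.
            p = 2 * rate if y == 1 else 0.0
        else:
            raise ValueError(f"unknown mechanism: {mechanism!r}")
        out.append((x, None if rng.random() < p else y))
    return out


# Toy complete dataset: 1000 independent binary pairs.
data_rng = random.Random(42)
rows = [(data_rng.randint(0, 1), data_rng.randint(0, 1)) for _ in range(1000)]

mcar = make_incomplete(rows, "MCAR")
mar = make_incomplete(rows, "MAR")
nmar = make_incomplete(rows, "NMAR")

for name, data in (("MCAR", mcar), ("MAR", mar), ("NMAR", nmar)):
    missing = sum(1 for _, y in data if y is None)
    print(name, "missing fraction:", missing / len(data))
```

In this sketch the MAR case can be diagnosed from the incomplete data alone, since the missing entries occur only where x equals 1, whereas the NMAR case cannot, because the information that explains the gaps is exactly what has been removed. This is consistent with the abstract's remark that NMAR generation is the hardest case to handle automatically.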

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Learning/Statistics & Optimisation
ID Code: 3770
Deposited By: Philippe Leray
Deposited On: 21 February 2008