A new imputation method for incomplete binary data
In data analysis problems where the data are represented by vectors of real numbers, some data points will often have “missing values”, meaning that one or more entries of the vector describing the data point are not observed. In this paper, we propose a new approach to the imputation of missing binary values. The technique employs a “similarity measure” introduced by Anthony and Hammer (2006). We compare its performance experimentally with that of techniques based on the usual Hamming distance measure and on multiple imputation.
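
To make the Hamming-distance baseline concrete, the following is a minimal sketch of one common way such an imputation could be done: copy the missing entries from the complete data point that is nearest in Hamming distance over the observed coordinates. This is an illustrative assumption, not necessarily the exact baseline used in the experiments, and it is not the proposed similarity-based technique. The names `hamming_on_observed` and `impute_nearest_hamming` are hypothetical, and missing entries are represented by `None`.

```python
from typing import List, Optional

# A binary data point; None marks a missing (unobserved) entry.
Row = List[Optional[int]]


def hamming_on_observed(a: Row, b: Row) -> int:
    """Count positions where both rows are observed and disagree."""
    return sum(
        1
        for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def impute_nearest_hamming(target: Row, complete_rows: List[Row]) -> List[int]:
    """Fill the missing entries of `target` from the nearest complete row.

    Assumption: ties are broken by taking the first nearest row in
    `complete_rows`; other tie-breaking rules are equally plausible.
    """
    if not complete_rows:
        raise ValueError("need at least one complete row")
    nearest = min(complete_rows, key=lambda r: hamming_on_observed(target, r))
    # Keep observed entries; copy the rest from the nearest complete row.
    return [n if t is None else t for t, n in zip(target, nearest)]


complete = [[0, 0, 0, 0], [1, 1, 1, 0], [1, 1, 0, 1]]
incomplete = [1, 1, None, 0]
imputed = impute_nearest_hamming(incomplete, complete)
print(imputed)
```

Here the incomplete point agrees with the second complete row on every observed coordinate, so its missing third entry is filled with that row's value.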