The Discrete Basis Problem
Pauli Miettinen, Taneli Mielikäinen, Aristides Gionis, Gautam Das and Heikki Mannila
In: PKDD 2006, September 18-22, Berlin, Germany.
Matrix decomposition methods represent a data matrix as
a product of two smaller matrices: one containing basis vectors that
represent meaningful concepts in the data, and another describing how
the observed data can be expressed as combinations of the basis vectors.
Decomposition methods have been studied extensively, but many of them
return real-valued matrices. If the original data is binary, real-valued
basis vectors are hard to interpret. We describe a matrix decomposition
formulation, the Discrete Basis Problem. The problem seeks
a Boolean decomposition of a binary matrix, thus allowing the user to
easily interpret the basis vectors. We show that the problem is computationally
difficult and give a simple greedy algorithm for solving it.
We present experimental results for the algorithm. The method gives
intuitively appealing basis vectors. On the other hand, continuous
decomposition methods often give better reconstruction accuracy. We
discuss the reasons for this behavior.
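
To make the notion of a Boolean decomposition concrete, the following minimal sketch builds a small binary matrix from two binary basis vectors using a Boolean matrix product (where 1 + 1 = 1) and counts the entries on which the reconstruction disagrees with the data. The function names, the toy matrices, and the choice of counting mismatched entries as the error are assumptions made for illustration. The sketch shows only the product and the error count; it is not the greedy algorithm mentioned in the abstract.

```python
def boolean_product(usage, basis):
    """Boolean matrix product of two 0/1 matrices given as lists of rows.

    Entry (i, j) is 1 if at least one basis vector l is used by row i
    (usage[i][l] == 1) and contains column j (basis[l][j] == 1).
    """
    k = len(basis)
    n_cols = len(basis[0])
    return [
        [int(any(row[l] and basis[l][j] for l in range(k))) for j in range(n_cols)]
        for row in usage
    ]


def reconstruction_error(data, approx):
    """Number of entries on which two binary matrices disagree (assumed error measure)."""
    return sum(a != b for row_a, row_b in zip(data, approx) for a, b in zip(row_a, row_b))


# Hypothetical toy example: two basis vectors over four columns.
basis = [
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
# Each data row selects which basis vectors it uses.
usage = [
    [1, 0],
    [0, 1],
    [1, 1],
]
# Observed binary data; the last row differs from the OR of both basis vectors.
data = [
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
]

approx = boolean_product(usage, basis)
error = reconstruction_error(data, approx)
print(approx)
print(error)
```

Because the product uses logical OR instead of addition, a row that uses both basis vectors stays binary, and each basis vector can be read directly as a set of columns. A real-valued decomposition is free to use fractional and negative entries, which is consistent with the abstract's remark that continuous methods often reconstruct the data more accurately.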