PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Algorithms for Discovering Bucket Orders from Data
Aristides Gionis, Heikki Mannila, Kai Puolamäki and Antti Ukkonen
In: ACM SIGKDD 2006, 20-23 Aug 2006, Philadelphia, PA, USA.

Abstract

Ordering and ranking items of different types are important tasks in various applications, such as query processing and scientific data mining. A total order for the items can be misleading, since there are groups of items that have practically equal ranks. We consider bucket orders, i.e., total orders with ties. They can be used to capture the essential order information without overfitting the data: they form a useful concept class between total orders and arbitrary partial orders. We address the question of finding a bucket order for a set of items, given pairwise precedence information between the items. We also discuss methods for computing the pairwise precedence data. We describe simple and efficient algorithms for finding good bucket orders. Several of the algorithms have a provable approximation guarantee, and they scale well to large datasets. We provide experimental results on artificial and real data that show the usefulness of bucket orders and demonstrate the accuracy and efficiency of the algorithms.
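
To make the notion of a bucket order concrete, the following minimal sketch represents a bucket order as a list of buckets and derives a pairwise precedence table from it. The names `bucket_order_matrix` and `l1_distance`, the 1 / 0.5 / 0 encoding, and the L1 score are illustrative assumptions, not code or definitions taken from the paper.

```python
def bucket_order_matrix(buckets):
    """Map each ordered pair (u, v) to 1.0 if u's bucket precedes v's,
    0.5 if they share a bucket (a tie), and 0.0 otherwise.

    `buckets` is a list of lists; earlier lists rank first (assumed encoding).
    """
    position = {x: k for k, bucket in enumerate(buckets) for x in bucket}
    matrix = {}
    for u in position:
        for v in position:
            if position[u] < position[v]:
                matrix[u, v] = 1.0
            elif position[u] > position[v]:
                matrix[u, v] = 0.0
            else:
                matrix[u, v] = 0.5
    return matrix


def l1_distance(buckets, precedence):
    """Sum of absolute differences between the bucket order's table and
    observed pairwise precedence values (one plausible fit score)."""
    matrix = bucket_order_matrix(buckets)
    return sum(abs(matrix[pair] - precedence[pair]) for pair in matrix)


# Hypothetical example: "b" and "c" are tied between "a" and "d".
example = [["a"], ["b", "c"], ["d"]]
table = bucket_order_matrix(example)
```

Under this encoding, a bucket order that matches the observed precedence data exactly has distance zero, and merging two items into one bucket replaces a hard 1 / 0 decision with a tie at 0.5.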

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 2573
Deposited By: Kai Puolamäki
Deposited On: 22 November 2006