PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Impact of the Initialization in Tree-Based Fast Similarity Search Techniques
Aureo Serrano, Luisa Mico and Jose Oncina
In: First International Workshop on Similarity-Based Pattern Recognition, 28-30 Sep 2011, Venice, Italy.

Abstract

Many fast similarity search techniques rely on pivots (specially selected points in the data set). From these points, specific structures (indexes) are built that speed up the search at query time. Pivot selection techniques are usually incremental, with the first pivot chosen at random. This article explores several techniques for choosing the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity. Moreover, most pivot-based tree indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones.
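As a rough illustration of the ideas summarized in the abstract, the sketch below shows incremental pivot selection with a configurable first pivot, followed by a nearest-neighbour search that uses the triangle inequality to skip distance computations. It is not the authors' method: it uses a flat pivot table instead of the tree index studied in the paper, a farthest-first selection rule, and Euclidean distance, all of which are assumptions made for the example. The names dist, select_pivots, build_table and nn_search are hypothetical.

```python
def dist(a, b):
    """Euclidean distance between two equal-length tuples (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def select_pivots(points, k, first=0):
    """Incremental farthest-first pivot selection (an assumed rule).

    `first` is the index of the initial pivot. Each later pivot is the point
    whose minimum distance to the pivots chosen so far is largest.
    """
    pivots = [first]
    min_d = [dist(p, points[first]) for p in points]
    while len(pivots) < k:
        nxt = max(range(len(points)), key=lambda i: min_d[i])
        pivots.append(nxt)
        min_d = [min(m, dist(p, points[nxt])) for m, p in zip(min_d, points)]
    return pivots


def build_table(points, pivots):
    """Precompute table[j][i] = distance from pivot j to point i."""
    return [[dist(points[p], x) for x in points] for p in pivots]


def nn_search(points, pivots, table, query):
    """Return (index, distance, number of distance computations) of a nearest neighbour."""
    q_to_p = [dist(query, points[p]) for p in pivots]
    computed = len(pivots)
    best_i, best_d = min(zip(pivots, q_to_p), key=lambda t: t[1])
    pivot_set = set(pivots)
    for i, p in enumerate(points):
        if i in pivot_set:
            continue
        # Triangle inequality: |d(q, pivot) - d(pivot, p)| <= d(q, p).
        lower = max(abs(qd - table[j][i]) for j, qd in enumerate(q_to_p))
        if lower >= best_d:
            continue  # p cannot beat the current best, so skip its distance
        d = dist(query, p)
        computed += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, computed
```

Calling select_pivots with different values of first on the same data and comparing the third value returned by nn_search gives a simple way to observe how the initial pivot affects the number of distance computations in this simplified setting.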

EPrint Type: Conference or Workshop Item (Poster)
Project Keyword: UNSPECIFIED
Subjects: Theory & Algorithms
ID Code: 8616
Deposited By: Luisa Mico
Deposited On: 15 February 2012