PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

The influence of weighting the K-occurrences on hubness-aware classification methods
Nenad Tomašev and Dunja Mladenić
In: IS 2011, 8-12 Oct 2011, Ljubljana, Slovenia.

Abstract

Hubness is a phenomenon present in many high-dimensional data sets. It is related to the skewness in the distribution of k-occurrences, i.e., occurrences of data points in k-neighbor sets of other data points. Several hubness-aware methods that focus on exploiting this phenomenon have recently been proposed. In this paper, we examine the potential impact of weighting the k-occurrences, by taking into account the distance between the respective data points, on hubness-aware nearest-neighbor methods, more specifically hw-kNN, h-FNN and HIKNN. We show that such distance-based weighting can be both advantageous and detrimental and that it influences different methods in different ways.
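
To make the notion of k-occurrences concrete, the following is a minimal sketch, not the authors' implementation. It counts how often each point appears in the k-neighbor sets of the other points and measures the skewness of the resulting distribution. The function names `k_occurrences` and `skewness`, the use of Euclidean distance, and the weight `1 / (1 + d)` in the weighted variant are all assumptions made for illustration; the abstract does not state which distance-based weighting the paper uses.

```python
import numpy as np

def k_occurrences(X, k, weighted=False):
    """Return the k-occurrence score of every row of X (hypothetical helper).

    Unweighted: number of k-neighbor sets in which each point appears.
    Weighted: each occurrence contributes 1 / (1 + d), where d is the
    distance to the point whose neighbor set it is in. This weighting is
    an illustrative assumption, not the scheme from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    # Indices of the k nearest neighbors of each point.
    nn = np.argsort(dist, axis=1)[:, :k]
    scores = np.zeros(n)
    for i in range(n):
        if weighted:
            w = 1.0 / (1.0 + dist[i, nn[i]])
        else:
            w = np.ones(k)
        # Credit each neighbor of point i with one (weighted) occurrence.
        np.add.at(scores, nn[i], w)
    return scores

def skewness(v):
    """Third standardized moment; large positive values indicate hubs."""
    c = np.asarray(v, dtype=float) - np.mean(v)
    return (c ** 3).mean() / (c ** 2).mean() ** 1.5
```

For example, `skewness(k_occurrences(X, k=5))` on a data matrix `X` gives one rough indicator of hubness: a strongly positive value means a few points (hubs) occur in many neighbor sets while most points occur in few.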

EPrint Type: Conference or Workshop Item (Paper)
Project Keyword: Project Keyword UNSPECIFIED
Subjects: Natural Language Processing; Information Retrieval & Textual Information Access
ID Code: 8726
Deposited By: Jan Rupnik
Deposited On: 21 February 2012