Exploring the hubness-related properties of oceanographic sensor data
In this paper we examine how the high dimensionality of oceanographic sensor data affects the applicability of nearest-neighbor machine learning methods. We focus on one particular consequence of the curse of dimensionality: hubness, the tendency of a few points (hubs) to occur very frequently in the k-nearest-neighbor lists of other points. We measure the hubness of oceanographic data and show how it can be used to visualize and detect both prototypical sensors/locations and ambiguous, potentially erroneous ones. We then define a simple classification problem on the data and show that recently developed hubness-aware classification methods may help to overcome some of the hubness-related issues in sensor data.
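To make the notion of hubness concrete, the following minimal sketch computes the k-occurrence count N_k of each point (how many times it appears among the k nearest neighbors of other points) and the skewness of the resulting distribution, a commonly used indicator of hubness. It is an illustration under stated assumptions, not the procedure used in this paper: the function names `k_occurrences` and `skewness` are hypothetical, Euclidean distance is assumed, and the data are synthetic Gaussian vectors standing in for sensor measurements.

```python
import math
import random


def k_occurrences(points, k):
    """Count how often each point appears among the k nearest neighbors of the other points."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Euclidean distances from point i to every other point (assumed metric).
        dists = [(math.dist(points[i], points[j]), j) for j in range(n) if j != i]
        dists.sort()  # ties are broken by the smaller index
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Third standardized moment; a large positive value suggests strong hubness."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for sensor feature vectors: 200 points in 50 dimensions.
    data = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(200)]
    counts = k_occurrences(data, k=5)
    print("skewness of N_5:", skewness(counts))
    print("largest N_5 (candidate hub):", max(counts))
    print("points with N_5 = 0 (never a neighbor):", counts.count(0))
```

Points with unusually large N_k are candidate hubs, while points that never occur as a neighbor lie at the other extreme of the distribution. The brute-force neighbor search is quadratic in the number of points, which is acceptable for a small illustration but would likely need an index structure for larger collections.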