A nearest neighbor data structure for graphics hardware
Nearest neighbor search is a core computational task in database systems and throughout data analysis. It is also a major computational bottleneck, and hence an enormous body of research has been devoted to data structures and algorithms for accelerating the task. Recent advances in graphics hardware provide tantalizing speedups on a variety of tasks and suggest an alternate approach to the problem: simply run brute force search on a massively parallel system. In this paper we marry the approaches with a novel data structure that can effectively make use of parallel systems such as graphics cards. The architectural complexities of graphics hardware (the high degree of parallelism, the small amount of memory relative to instruction throughput, and the single instruction, multiple data design) present significant challenges for data structure design. Furthermore, the brute force approach applies perfectly to graphics hardware, leading one to question whether an intelligent algorithm or data structure can even hope to outperform this basic approach. Despite these challenges and misgivings, we demonstrate that our data structure, termed a Random Ball Cover, provides significant speedups over the GPU-based brute force approach.