Auto-tuning similarity search algorithms on multi-core architectures

Date
2013
Authors
Gedik, B.
Source Title
International Journal of Parallel Programming
Print ISSN
0885-7458
Volume
41
Issue
5
Pages
595 - 620
Language
English
Type
Article
Abstract

Large high-dimensional datasets have become ubiquitous; video and image repositories, financial data, and sensor data are just a few examples. Many applications that use such datasets must retrieve the data items similar to a given query item, i.e., the nearest neighbors (NN or k-NN) of that item. Another common query, multi k-NN, retrieves multiple sets of nearest neighbors for different query items on the same data. As commodity multi-core CPUs become more widespread and less expensive, developing parallel algorithms for these search problems has become increasingly important. While the core nearest neighbor search problem is relatively easy to parallelize, tuning it for optimal performance is challenging, because the performance-specific algorithmic parameters, or "tuning knobs", are inter-related and also depend on the data and query workloads. In this paper, we present (1) a detailed study of the tuning knobs and their contributions to query throughput for parallelized versions of the two most common classes of high-dimensional multi-NN search algorithms, linear scan and tree traversal, and (2) an offline auto-tuner that sets these knobs by iteratively measuring actual query execution times for a given workload and dataset. We show experimentally that our auto-tuner reaches near-optimal performance and significantly outperforms un-tuned versions of parallel multi-NN algorithms for real video repository data on a variety of multi-core platforms. © 2013 Springer Science+Business Media New York.
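As a rough illustration of the idea in the abstract, the sketch below pairs a linear-scan multi k-NN routine with an offline tuner that picks a knob value by timing real query runs. It is not the paper's implementation: the knob `block_size` (how many queries share one pass over the data), the function names, and the exhaustive search over a candidate list are all assumptions made for illustration, and the code is single-threaded, whereas the paper tunes parallel algorithms and searches iteratively.

```python
import heapq
import random
import time


def multi_knn(data, queries, k, block_size):
    """Linear-scan multi k-NN: one pass over the data per block of queries.

    block_size is a hypothetical tuning knob (not taken from the paper).
    Returns, for each query, the indices of its k nearest points,
    ordered from nearest to farthest (squared Euclidean distance).
    """
    results = []
    for start in range(0, len(queries), block_size):
        block = queries[start:start + block_size]
        # One bounded heap per query; distances are negated so that the
        # heap root is always the current farthest of the k candidates.
        heaps = [[] for _ in block]
        for i, point in enumerate(data):
            for heap, query in zip(heaps, block):
                d = sum((a - b) ** 2 for a, b in zip(point, query))
                if len(heap) < k:
                    heapq.heappush(heap, (-d, i))
                elif -d > heap[0][0]:
                    heapq.heapreplace(heap, (-d, i))
        for heap in heaps:
            # Sorting negated distances in reverse yields nearest first.
            results.append([i for _, i in sorted(heap, reverse=True)])
    return results


def auto_tune(data, queries, k, candidates, repeats=3):
    """Offline tuner sketch: time each candidate knob value, keep the fastest.

    This is a plain exhaustive search over `candidates`, swapped in for
    the paper's iterative tuner, whose search strategy is not described
    in the abstract.
    """
    best_knob, best_time = None, float("inf")
    for block_size in candidates:
        elapsed = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            multi_knn(data, queries, k, block_size)
            elapsed = min(elapsed, time.perf_counter() - t0)
        if elapsed < best_time:
            best_knob, best_time = block_size, elapsed
    return best_knob, best_time


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.random() for _ in range(16)] for _ in range(2000)]
    queries = [[rng.random() for _ in range(16)] for _ in range(32)]
    knob, seconds = auto_tune(data, queries, k=5, candidates=[1, 4, 16, 32])
    print(f"fastest block_size: {knob} ({seconds:.4f} s)")
```

The fastest setting depends on the machine, the dataset, and the query workload, which is why the tuner measures real runs instead of predicting a value.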

Keywords
Auto-tuning, Nearest neighbor search, Parallelization, Algorithmic parameters, Autotuning, Multi-core platforms, Multicore architectures, Near-optimal performance, Parallelized version, Iterative methods, Knobs, Learning algorithms, Optimization, Program processors, Tuners, Computer architecture