Accelerating massive queries of approximate nearest neighbor search on high-dimensional data
Publication in refereed journal

Altmetrics Information

Other information
AbstractApproximate nearest neighbor (ANN) search on high-dimensional data is a fundamental operation in many applications. In this paper, we study massive queries of ANN (MQ-ANN) search, which deals with a large number of queries simultaneously. To improve the throughput, we combine the parallel capacity of multi-core CPUs and the filtering power of the state-of-the-art index methods, i.e., proximity graphs. However, there are no solutions that exploit proximity graphs to handle MQ-ANN in parallel, except the one called query view, which simply assigns each query to a hardware thread but suffers from numerous cache misses. As the first attempt, we design efficient methods for MQ-ANN with proximity graphs and propose a novel scheduling mechanism called bridge view, which shares the same data access across multiple queries in order to reduce cache misses. Moreover, we extend our method to deal with MQ-ANN on large-scale data sets (e.g. 10^8 points). Finally, we conduct extensive experiments on real data sets to demonstrate the advantages of our method. According to our experimental results, bridge view significantly outperforms query view in various settings. In particular, bridge view with 8 hardware threads even outperforms query view with 24 hardware threads.
Acceptance Date26/04/2023
All Author(s) ListYingfan Liu, Chaowei Song, Hong Cheng, Xiaofang Xia, Jiangtao Cui
Journal nameKnowledge and Information Systems (KAIS): An International Journal
Volume Number65
Issue Number10
Pages4185 - 4212
LanguagesEnglish-United States
KeywordsMassive queries, Approximate nearest neighbor search, High-dimensional data, Proximity graphs, Parallel algorithms

Last updated on 2024-02-01 at 16:00