Indexing large metric spaces for similarity search queries

Article Properties
Cite
Bozkaya, Tolga, and Meral Ozsoyoglu. “Indexing Large Metric Spaces for Similarity Search Queries.” ACM Transactions on Database Systems, vol. 24, no. 3, 1999, pp. 361-404, https://doi.org/10.1145/328939.328959.
Bozkaya, T., & Ozsoyoglu, M. (1999). Indexing large metric spaces for similarity search queries. ACM Transactions on Database Systems, 24(3), 361-404. https://doi.org/10.1145/328939.328959
Bozkaya T, Ozsoyoglu M. Indexing large metric spaces for similarity search queries. ACM Transactions on Database Systems. 1999;24(3):361-404.
Journal Categories
Science > Mathematics > Instruments and machines > Electronic computers > Computer science
Science > Mathematics > Instruments and machines > Electronic computers > Computer science > Computer software
Science > Science (General) > Cybernetics > Information theory
Technology > Electrical engineering > Electronics > Nuclear engineering > Electronics > Computer engineering > Computer hardware
Description

How can we efficiently find similar items in vast datasets? This paper explores distance-based index structures for similarity queries in large metric spaces, focusing on applications where distance computations are expensive. The research addresses the challenge of finding approximate matches to a query item within a large collection of data, particularly when calculating distances between objects is computationally intensive. The authors elaborate on an approach that uses reference points (vantage points) to hierarchically partition the data space into spherical shell-like regions. The paper introduces the multivantage point tree (mvp-tree), which uses more than one vantage point to partition the space at each level. The mvp-tree also stores precomputed distances between data points and vantage points, which lets a search discard candidates without computing new distances. In experiments, mvp-trees outperformed vp-trees by 20% to 80% across varying query ranges and distance distributions. Further experiments explored the impact of varying the number of vantage points and the use of precomputed distances. The results suggest that using a large number of vantage points together with precomputed distances provides more efficient filtering during search, making the mvp-tree a valuable tool for similarity search in large metric spaces.
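
To make the vantage-point idea concrete, here is a minimal Python sketch of the simpler baseline structure, a vp-tree with one vantage point per node, and a range query that prunes subtrees with the triangle inequality. It is not the paper's mvp-tree and not the authors' code: the names `VPNode`, `build`, and `range_search`, the random choice of vantage point, and the median split are all assumptions made for illustration.

```python
import random
from statistics import median


class VPNode:
    """One node of an illustrative vp-tree (hypothetical layout)."""

    def __init__(self, vantage, radius, inside, outside):
        self.vantage = vantage  # reference (vantage) point of this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree with d(vantage, x) <= radius
        self.outside = outside  # subtree with d(vantage, x) > radius


def build(points, dist, rng=None):
    """Recursively split points into an inner ball and an outer shell."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    # Assumption: pick the vantage point at random.
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [dist(vantage, p) for p in points]
    radius = median(dists)
    inside = [p for p, d in zip(points, dists) if d <= radius]
    outside = [p for p, d in zip(points, dists) if d > radius]
    return VPNode(vantage, radius,
                  build(inside, dist, rng), build(outside, dist, rng))


def range_search(node, query, r, dist, out=None):
    """Collect every indexed point x with dist(query, x) <= r."""
    if out is None:
        out = []
    if node is None:
        return out
    d = dist(query, node.vantage)  # the only distance computed at this node
    if d <= r:
        out.append(node.vantage)
    # Triangle inequality: the inner ball can hold a match only if d - r <= radius.
    if d - r <= node.radius:
        range_search(node.inside, query, r, dist, out)
    # Likewise, the outer shell can hold a match only if d + r > radius.
    if d + r > node.radius:
        range_search(node.outside, query, r, dist, out)
    return out
```

A call such as `range_search(build(points, math.dist), query, 0.2, math.dist)` returns the same points as a linear scan, but skips any subtree that the two pruning tests rule out. The mvp-tree described above builds on the same tests by using several vantage points per node and by keeping precomputed point-to-vantage distances for additional filtering.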

Published in ACM Transactions on Database Systems, this paper aligns with the journal's focus on efficient data management and retrieval techniques. The research on indexing large metric spaces for similarity queries directly addresses core topics within database systems. By presenting the mvp-tree structure and demonstrating its performance advantages, the paper offers a practical solution for handling similarity searches in large datasets.

References
Citations
Citations Analysis
The first research to cite this article was titled "Best-match retrieval for structured images" and was published in 2001. The most recent citation comes from a 2023 study titled "Best-match retrieval for structured images". This article reached its peak citation count in 2008, with 7 citations. It has been cited in 52 different journals, 5% of which are open access. Among related journals, the Proceedings of the VLDB Endowment cited this research the most, with 5 citations. The chart below illustrates the annual citation trends for this article.
Citations of this article by year