Indexing large metric spaces for similarity search queries

Article Properties

Language	English
DOI	10.1145/328939.328959
Publication Date	1999/09/01
Journal	ACM Transactions on Database Systems
Indian UGC (Journal)
References	24
Citations	79
Tolga Bozkaya Oracle Corp., Redwood Shores, CA
Meral Ozsoyoglu Case Western Reserve Univ., Cleveland, OH

Cite

Bozkaya, Tolga, and Meral Ozsoyoglu. “Indexing Large Metric Spaces for Similarity Search Queries.” ACM Transactions on Database Systems, vol. 24, no. 3, 1999, pp. 361-404, https://doi.org/10.1145/328939.328959.

Bozkaya, T., & Ozsoyoglu, M. (1999). Indexing large metric spaces for similarity search queries. ACM Transactions on Database Systems, 24(3), 361-404. https://doi.org/10.1145/328939.328959

Bozkaya T, Ozsoyoglu M. Indexing large metric spaces for similarity search queries. ACM Transactions on Database Systems. 1999;24(3):361-404.

Journal Categories

Science: Mathematics: Instruments and machines: Electronic computers. Computer science

Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software

Science: Science (General): Cybernetics: Information theory

Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware

Description

How can similar items be found efficiently in very large datasets? This paper studies distance-based index structures for similarity queries on large metric spaces, targeting applications in which each distance computation is expensive. A similarity query retrieves the items in a large collection that are close matches to a given query item; because every distance evaluation is costly, the index should answer the query with as few distance computations as possible. The authors build on an approach that uses reference points (vantage points) to partition the data space hierarchically into spherical shell-like regions. They introduce the multivantage point tree structure (mvp-tree), which uses more than one vantage point to partition the space at each level. The mvp-tree also stores precomputed distances between data points and vantage points, so that many candidates can be ruled out at query time without computing new distances.

In experiments, mvp-trees outperformed vp-trees, which use a single vantage point at each level, by 20% to 80%. Further experiments varied the number of vantage points and the use of precomputed distances. The results suggest that a larger number of vantage points, combined with precomputed distances, gives more effective filtering during search, making the mvp-tree a useful structure for similarity search in large metric spaces.
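
The partitioning and filtering ideas described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' mvp-tree: it uses a single vantage point per node (as in a vp-tree), picks the vantage point naively, and keeps precomputed vantage-point distances only in the leaves. The names `build`, `range_search`, `CountingMetric`, and `LEAF_SIZE` are hypothetical and do not come from the paper.

```python
import math
import random

LEAF_SIZE = 8  # hypothetical leaf capacity, chosen only for illustration


class CountingMetric:
    """Wraps a distance function and counts how often it is called."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.fn(a, b)


def build(points, dist):
    """Recursively build a simplified vantage-point tree over `points`."""
    if not points:
        return None
    vp, rest = points[0], points[1:]  # naive vantage-point choice (assumption)
    # Distances to the vantage point are computed once, at build time.
    tagged = sorted(((dist(vp, p), p) for p in rest), key=lambda t: t[0])
    if len(tagged) <= LEAF_SIZE:
        # Leaves keep the precomputed distances next to each point.
        return {"vp": vp, "leaf": tagged}
    mid = len(tagged) // 2
    mu = tagged[mid][0]  # median distance: boundary between the two shells
    return {
        "vp": vp,
        "mu": mu,
        "inner": build([p for _, p in tagged[:mid]], dist),  # d(vp, p) <= mu
        "outer": build([p for _, p in tagged[mid:]], dist),  # d(vp, p) >= mu
    }


def range_search(node, q, r, dist, out):
    """Append to `out` every indexed point x with dist(q, x) <= r."""
    if node is None:
        return
    d = dist(q, node["vp"])
    if d <= r:
        out.append(node["vp"])
    if "leaf" in node:
        for dx, x in node["leaf"]:
            # Triangle inequality: |d(q, vp) - d(vp, x)| <= d(q, x), so a
            # point failing the first test is skipped without a new distance.
            if abs(d - dx) <= r and dist(q, x) <= r:
                out.append(x)
        return
    # Visit a shell only if the query ball can intersect it.
    if d - r <= node["mu"]:
        range_search(node["inner"], q, r, dist, out)
    if d + r >= node["mu"]:
        range_search(node["outer"], q, r, dist, out)


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(2000)]
    metric = CountingMetric(math.dist)  # Euclidean distance, Python 3.8+
    tree = build(pts, metric)
    metric.calls = 0  # count only query-time distance computations
    hits = []
    range_search(tree, (0.5, 0.5), 0.05, metric, hits)
    print(len(hits), "matches using", metric.calls, "distance computations")
```

Comparing the printed number of distance computations with the collection size gives a rough sense of how much work the shell pruning and the leaf-level filter avoid. An mvp-tree, as described above, would go further by using several vantage points per node and by keeping precomputed distances to more than one vantage point.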

Published in ACM Transactions on Database Systems, this paper aligns with the journal's focus on efficient data management and retrieval techniques. The research on indexing large metric spaces for similarity queries directly addresses core topics within database systems. By presenting the mvp-tree structure and demonstrating its performance advantages, the paper offers a practical solution for handling similarity searches in large datasets.

Category	Category Repetition
Science: Mathematics: Instruments and machines: Electronic computers. Computer science	51
Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software	30
Science: Science (General): Cybernetics: Information theory	29
Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware	25
Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics	15

Database	Last update
UGC	December 2024
DOAJ	December 2024
Crossref	May 2024