Authors:

Meng ChenKai ZhangZhenying HeYinan JingX. Sean Wang

Paper:

https://arxiv.org/abs/2408.08933

Introduction

Approximate Nearest Neighbor Search (ANNS) is a critical component in various applications, such as recommendation systems and large language model-based applications. With the rise of multimodal neural models, cross-modal ANNS has become essential for retrieving similar items across different modalities (e.g., using text to find similar images). However, existing ANNS approaches struggle with cross-modal queries due to the inherent distribution gap between embeddings from different modalities. This paper introduces RoarGraph, a projected bipartite graph designed to address these inefficiencies and significantly improve cross-modal ANNS performance.

Related Work

Background on ANNS

ANNS aims to find the approximate nearest neighbors in a dataset for a given query, trading off between search speed and accuracy. Traditional methods include partition-based, quantization-based, and hashing-based approaches, with graph-based methods representing the state-of-the-art performance.

Out-of-Distribution ANNS

Cross-modal ANNS queries are often out-of-distribution (OOD) relative to the base data, leading to inefficiencies in existing ANNS methods. The distribution gap between different modalities causes OOD queries to deviate significantly from the base data, making it challenging for traditional ANNS indexes to perform efficiently.

Limitations of Existing Approaches

Existing ANNS methods, such as HNSW and IVF, are designed for single-modal scenarios and perform poorly with OOD queries. RobustVamana, a graph index designed for OOD-ANNS, improves performance but still falls short in handling the distribution gap effectively.

Research Methodology

Key Insights from OOD Workloads

The paper identifies that OOD queries are spatially distant from the base data, and their k-nearest neighbors are also widely distributed. This breaks the assumptions of existing ANNS approaches, which expect queries to be close to the base data and their nearest neighbors to be clustered.

Proposed Solution: RoarGraph

RoarGraph is designed to map distributed vectors that are nearest neighbors to queries into closely connected neighbors within a graph index. The construction process involves:
1. Building a bipartite graph by mapping the similarity relationship between queries and base data.
2. Projecting the bipartite graph onto the base data using neighborhood-aware projection.
3. Enhancing the graph’s connectivity to ensure reachability and efficient search paths.

Experimental Design

Datasets

The experiments use three modern cross-modal datasets: Text-to-Image, LAION, and WebVid. These datasets include text, image, and video frame embeddings, providing a comprehensive evaluation of RoarGraph’s performance.

Evaluation Metrics

The performance is evaluated using recall@k and queries per second (QPS). Recall measures the accuracy of retrieval, while QPS assesses the search speed.

Baseline Methods

The paper compares RoarGraph with state-of-the-art graph-based ANNS methods, including HNSW, NSG, τ-MNG, and RobustVamana.

Results and Analysis

Search Speed vs. Recall

RoarGraph consistently outperforms existing methods across all datasets, achieving up to 3.56× faster search speed at a 90% recall rate for OOD queries. This demonstrates RoarGraph’s efficiency in handling cross-modal ANNS tasks.

Routing Hops vs. Recall

RoarGraph requires significantly fewer hops during the search, reducing the length of the search path and improving efficiency.

Ablation Study

The study validates the effectiveness of RoarGraph’s components, showing that the projected graph and connectivity enhancement techniques contribute to its superior performance.

Effects of Query Set Size for Indexing

RoarGraph maintains high performance even with a reduced query set size, making it flexible and practical for various scenarios.

Robustness to In-Distribution Query

RoarGraph also performs well with in-distribution queries, ensuring its applicability to different query types.

Index Size and Construction Overhead

RoarGraph’s index size is comparable to other methods, and its construction time is reasonable, especially when using a smaller query set.

Overall Conclusion

RoarGraph addresses the inefficiencies of existing ANNS approaches in handling cross-modal queries by leveraging query distribution to guide graph construction. Extensive experiments demonstrate its superior performance, making it a valuable solution for cross-modal ANNS tasks. The proposed method not only improves search speed and accuracy but also maintains robustness to different query types, offering a flexible and practical approach for real-world applications.

Share.

Comments are closed.

Exit mobile version