Authors:

Kaushik Rangadurai, Siyang Yuan, Minhui Huang, Yiqun Liu, Golnaz Ghasemiesfeh, Yunchen Pu, Xinfeng Xie, Xingfeng He, Fangzhou Xu, Andrew Cui, Vidhoon Viswanathan, Yan Dong, Liang Xiong, Lin Yang, Liang Wang, Jiyan Yang, Chonglin Sun

Paper:

https://arxiv.org/abs/2408.06653

Introduction

In the realm of machine learning and recommender systems, the retrieval stage is crucial for narrowing down millions of candidate ads to a few thousand relevant ones. This paper introduces the Hierarchical Structured Neural Network (HSNN), a novel approach designed to address the limitations of traditional Embedding Based Retrieval (EBR) systems. HSNN leverages sophisticated interactions and model architectures to enhance the retrieval process while maintaining sub-linear inference costs.

Related Work

Clustering Methods

Clustering algorithms are broadly categorized into hierarchical and partitional methods. Hierarchical clustering, such as agglomerative clustering, starts with each data point as its own cluster and iteratively merges the closest pairs. Partitional methods like K-means aim to minimize the sum of squared distances between data points and their closest cluster centers. Other techniques include expectation maximization (EM), spectral clustering, and non-negative matrix factorization (NMF).
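
Concretely, the K-means objective can be written as

$$
\min_{c_1,\dots,c_K} \sum_{i=1}^{N} \min_{k \in \{1,\dots,K\}} \lVert x_i - c_k \rVert^2
$$

where the $c_k$ are the cluster centers and each point $x_i$ is charged to its nearest center.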

Embedding Based Retrieval

EBR systems have been widely adopted in search and recommendation systems. These systems use approximate nearest neighbor (ANN) algorithms to retrieve top relevant candidates efficiently. Techniques like tree-based algorithms, locality-sensitive hashing (LSH), and hierarchical navigable small world graphs (HNSW) are commonly used.

Generative Retrieval

Generative retrieval has emerged as a new paradigm for document retrieval, drawing parallels to ad retrieval where ads are treated as documents and users as queries. This approach uses a learnt codebook similar to the hierarchical cluster IDs proposed in HSNN.

Joint Optimization

Joint optimization systems, where item hierarchies and large-scale retrieval systems are optimized together, are closely related to our work. Methods like Deep Retrieval and tree-based models optimize both the tree structure and model parameters jointly.

Motivation

The motivation for HSNN arises from the limitations of the current production baseline:

  1. Limited Interaction: The Two Tower architecture restricts user-ad interactions.
  2. Post-Training Clustering: Clustering occurs after training, unaware of retrieval optimization criteria.
  3. Embedding Updates: Frequent updates to embeddings necessitate new cluster indices, leading to delays.

Model Design

Features and Model Architecture

The Two Tower model, widely used in the industry, outputs fixed-size representations for users and ads, relying on dot-product interactions. HSNN enhances this by introducing interaction features and sophisticated MergeNet architectures.
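
As a minimal PyTorch sketch (not the production architecture; dimensions and layer counts are illustrative assumptions), the baseline's single dot-product interaction looks like this:

```python
import torch
import torch.nn as nn

class TwoTowerScorer(nn.Module):
    """Baseline sketch: user and ad are encoded independently, and the
    only cross-side interaction is one dot product between embeddings."""
    def __init__(self, user_dim, ad_dim, emb_dim=64):
        super().__init__()
        self.user_tower = nn.Sequential(
            nn.Linear(user_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))
        self.ad_tower = nn.Sequential(
            nn.Linear(ad_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, emb_dim))

    def forward(self, user_x, ad_x):
        u = self.user_tower(user_x)
        a = self.ad_tower(ad_x)
        return (u * a).sum(dim=-1)  # one dot product per (user, ad) pair
```

Because the two towers never see each other's raw features, all user-ad interaction must be squeezed through that single dot product, which is the limitation HSNN's MergeNet is meant to lift.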

Disjoint Optimization of Clustering and Retrieval Model

The current system architecture involves separate components for retrieval and clustering, leading to inefficiencies. HSNN proposes a jointly-optimized clustering and retrieval model to improve overall performance.

System Architecture

The system architecture of HSNN involves data loading, clustering, indexing, and serving stages. The process ensures that cluster information is up-to-date and available at serving time.

Proposed Method

Hierarchical Structured Neural Network (HSNN)

Modeling

HSNN introduces three decoupled towers (user, ad, and interaction) and a MergeNet to capture higher-order interactions. The clustering modules use the Learning To Cluster (LTC) algorithm to generate cluster embeddings.
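
A rough sketch of the decoupled-tower idea, again with illustrative dimensions rather than the paper's actual configuration:

```python
import torch
import torch.nn as nn

class HSNNScorer(nn.Module):
    """Sketch of the decoupled-tower design: user, ad, and interaction
    features each get their own tower, and a MergeNet MLP crosses their
    outputs to capture higher-order interactions."""
    def __init__(self, user_dim, ad_dim, inter_dim, emb_dim=64):
        super().__init__()
        self.user_tower = nn.Sequential(nn.Linear(user_dim, emb_dim), nn.ReLU())
        self.ad_tower = nn.Sequential(nn.Linear(ad_dim, emb_dim), nn.ReLU())
        self.inter_tower = nn.Sequential(nn.Linear(inter_dim, emb_dim), nn.ReLU())
        self.merge_net = nn.Sequential(
            nn.Linear(3 * emb_dim, emb_dim), nn.ReLU(), nn.Linear(emb_dim, 1))

    def forward(self, user_x, ad_x, inter_x):
        merged = torch.cat(
            [self.user_tower(user_x), self.ad_tower(ad_x), self.inter_tower(inter_x)],
            dim=-1)
        return self.merge_net(merged).squeeze(-1)  # higher-order interaction score
```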

Indexing

The retrieval model is split into five parts for serving: user tower, ad tower, interaction tower, cluster model, and over-arch model. Cluster IDs and embeddings are indexed for efficient retrieval.
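
A simplified picture of the offline indexing pass might look as follows; `ad_tower` and `cluster_model` stand in for the served model parts, and the hard `assign` step is a hypothetical helper:

```python
from collections import defaultdict

def build_cluster_index(ads, ad_tower, cluster_model):
    """Hypothetical offline indexing pass: embed every ad, assign it to
    its nearest learned cluster, and store both the cluster ID and the
    embedding for serving-time lookup."""
    index = defaultdict(list)  # cluster_id -> [(ad_id, ad_emb), ...]
    for ad_id, ad_features in ads:
        ad_emb = ad_tower(ad_features)
        cluster_id = cluster_model.assign(ad_emb)  # hardened (argmax) assignment
        index[cluster_id].append((ad_id, ad_emb))
    return index
```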

Serving

The serving process involves fetching user embeddings, scoring clusters, and ranking ads based on cluster scores. This approach reduces the number of inferences, improving efficiency.
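
Serving could then be sketched as below, reusing the `index` from the previous snippet and substituting a dot product for the full scoring model:

```python
import torch

def retrieve(user_emb, centroids, index, budget=1000):
    """Sketch of sub-linear serving: score each cluster centroid once
    (instead of scoring every ad), then emit ads from the best-scoring
    clusters until the retrieval budget is filled."""
    cluster_scores = centroids @ user_emb  # one inference per cluster, not per ad
    ranked_clusters = torch.argsort(cluster_scores, descending=True)
    results = []
    for c in ranked_clusters.tolist():
        for ad_id, _ in index[c]:
            results.append(ad_id)
            if len(results) == budget:
                return results
    return results
```

With K clusters and N ads, the scoring cost drops from O(N) model calls to O(K), which is where the sub-linear inference cost comes from.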

Learning To Cluster (LTC)

LTC is a gradient-descent-based clustering algorithm that co-trains cluster assignments and centroid embeddings. It incorporates curriculum learning, cluster collapse prevention, and hierarchical clustering.
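
A minimal differentiable-clustering sketch, assuming dot-product similarity and softmax assignments (the paper's exact formulation may differ):

```python
import torch
import torch.nn as nn

class LearnToCluster(nn.Module):
    """Sketch of a gradient-trainable clustering layer: centroids are
    learnable parameters, and each item receives a softmax-weighted
    (hence differentiable) assignment over them."""
    def __init__(self, num_clusters, emb_dim):
        super().__init__()
        self.centroids = nn.Parameter(torch.randn(num_clusters, emb_dim))

    def forward(self, item_emb, temperature=1.0):
        sims = item_emb @ self.centroids.t()                # [batch, num_clusters]
        assign = torch.softmax(sims / temperature, dim=-1)  # soft assignment
        cluster_emb = assign @ self.centroids               # differentiable mix
        return assign, cluster_emb
```

Because both the assignment and the centroids sit inside the computation graph, the retrieval loss can shape the clusters directly, unlike post-training K-means.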

Curriculum Learning

Curriculum learning trains the model from easier to harder tasks, using a scheduling strategy to adjust the softmax temperature over iterations.
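
One common way to realize such a schedule, shown here as an assumed exponential decay rather than the paper's exact strategy:

```python
def temperature_schedule(step, total_steps, t_start=10.0, t_end=0.1):
    """Assumed exponential annealing: begin with a high temperature
    (soft, near-uniform assignments -- an easier objective) and cool
    toward a low temperature (near-hard assignments)."""
    ratio = step / max(1, total_steps)
    return t_start * (t_end / t_start) ** ratio
```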

Cluster Distribution

To prevent cluster collapse, techniques such as a FLOPs regularizer and random replacement are employed, ensuring a balanced cluster distribution.
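
A sketch of a FLOPs-style penalty on the soft assignments from the LTC sketch above:

```python
import torch

def flops_regularizer(assign):
    """FLOPs-style load-balancing penalty: for soft assignments of shape
    [batch, num_clusters], penalize the squared mean usage of each cluster.
    Since rows sum to 1, the total mean usage is fixed, so the sum of
    squares is minimized when usage is uniform across clusters."""
    mean_usage = assign.mean(dim=0)  # [num_clusters]
    return (mean_usage ** 2).sum()
```

Adding this term to the training loss discourages the degenerate solution where a handful of clusters absorb all items.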

Introducing Hierarchy

HSNN introduces hierarchy through Residual Quantization, using multiple Ad Cluster Towers to handle large numbers of items.
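
Reusing the `LearnToCluster` sketch from above, residual quantization can be illustrated like this (the number of levels and clusters are assumptions):

```python
import torch
import torch.nn as nn

class ResidualClusterStack(nn.Module):
    """Sketch of residual quantization with stacked cluster towers: each
    level clusters what the previous levels failed to explain, so K^L
    leaf codes are reachable with only L * K centroids."""
    def __init__(self, levels, num_clusters, emb_dim):
        super().__init__()
        self.levels = nn.ModuleList(
            [LearnToCluster(num_clusters, emb_dim) for _ in range(levels)])

    def forward(self, item_emb, temperature=1.0):
        residual, codes = item_emb, []
        for level in self.levels:
            assign, cluster_emb = level(residual, temperature)
            codes.append(assign.argmax(dim=-1))  # this level's cluster ID
            residual = residual - cluster_emb    # pass on what is left to explain
        return torch.stack(codes, dim=-1), residual
```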

Ablation Studies

Interaction Arch and Features

Ablation studies show the importance of interaction features and sophisticated interaction towers in improving model performance.

Co-Training

Co-training the clustering with the retrieval model significantly improves clustering performance compared with clustering applied after training, across the clustering techniques evaluated.

LTC Algorithm Ablation

The LTC algorithm’s components, such as curriculum learning and cluster collapse prevention, contribute to its effectiveness.

Experiments

Online Results

HSNN demonstrates substantial improvements in relevance and efficiency metrics over traditional EBR systems. The co-trained approach and use of interaction features contribute to these gains.

Deployment Lessons

Cluster Collapse

Techniques like the FLOPs regularizer and random replacement ensure an even cluster distribution, preventing cluster collapse.

Staleness of Cluster Centroids

Frequent updates to centroid embeddings prevent staleness, maintaining model accuracy over time.

Conclusion and Next Steps

HSNN introduces a learnable hierarchical clustering module that enhances the retrieval stage in recommendation systems. Future work will explore more complex interactions and personalization through user clustering.

Acknowledgements

The authors thank their colleagues for their contributions and support in developing and deploying HSNN.


This blog post provides a detailed interpretation of the paper “Hierarchical Structured Neural Network for Retrieval,” highlighting the motivation, design, methodology, and experimental results of the proposed HSNN system.
