SPI: Query-Depth-Adaptive Indexing for Streaming RAG in Vector Databases
The article introduces Semantic Pyramid Indexing (SPI), an indexing framework for vector databases that targets retrieval-augmented generation (RAG) over streaming data. SPI supports incremental updates and adapts retrieval depth to query complexity. It organizes embeddings into multiple semantic resolution levels and uses an uncertainty-aware controller for efficient query processing, which enables progressive coarse-to-fine approximate nearest neighbor (ANN) search. The article reports a 1.4–2.3× reduction in average retrieval latency with competitive Recall@10 on datasets such as MS MARCO and Natural Questions. SPI integrates with existing backends such as FAISS and Qdrant, which makes it relevant to practitioners who need to optimize query performance in dynamic environments.
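To make the coarse-to-fine idea concrete, here is a minimal, standard-library-only sketch of a multi-resolution index with an adaptive stopping rule. It is not the article's implementation. The class name `PyramidIndex`, the use of random projections to build the coarse levels, and the score-margin test that stands in for the uncertainty-aware controller are all assumptions made for illustration, as are the parameters `level_dims`, `shortlist`, and `margin_threshold`. A real deployment would presumably delegate each level to a backend such as FAISS or Qdrant rather than scan lists in Python.

```python
import math
import random


def _normalize(v):
    # Scale to unit length; fall back to 1.0 to avoid dividing by zero.
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class PyramidIndex:
    """Toy multi-resolution index (hypothetical). Level 0 is the coarsest;
    the last level stores the full normalized vector."""

    def __init__(self, dim, level_dims=(4, 16), seed=0):
        rng = random.Random(seed)
        # Assumption: random Gaussian projections define the coarse levels.
        self.projections = [
            [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(d)]
            for d in level_dims
        ]
        self.levels = [[] for _ in range(len(level_dims) + 1)]

    def _embed(self, vec):
        full = _normalize(vec)
        coarse = [
            _normalize([_dot(row, full) for row in proj])
            for proj in self.projections
        ]
        return coarse + [full]

    def add(self, vec):
        """Incremental insert: append to every level, no rebuild. Returns the new id."""
        for store, rep in zip(self.levels, self._embed(vec)):
            store.append(rep)
        return len(self.levels[-1]) - 1

    def search(self, query, k=3, shortlist=20, margin_threshold=0.05):
        """Return (top-k ids, depth reached). Descends to a finer level only
        while the coarse ranking looks ambiguous."""
        reps = self._embed(query)
        candidates = list(range(len(self.levels[-1])))
        depth, ranked = 0, []
        for depth, (store, q) in enumerate(zip(self.levels, reps)):
            ranked = sorted(((_dot(q, store[i]), i) for i in candidates), reverse=True)
            is_last = depth == len(self.levels) - 1
            # Uncertainty proxy (assumption): gap between the k-th and (k+1)-th scores.
            if len(ranked) > k:
                margin = ranked[k - 1][0] - ranked[k][0]
            else:
                margin = float("inf")
            if is_last or margin >= margin_threshold:
                break  # confident enough, or nothing finer to consult
            # Otherwise keep a shortlist and re-rank it at the next, finer level.
            candidates = [i for _, i in ranked[:shortlist]]
        return [i for _, i in ranked[:k]], depth


if __name__ == "__main__":
    rng = random.Random(1)
    index = PyramidIndex(dim=32)
    for _ in range(200):
        index.add([rng.gauss(0, 1) for _ in range(32)])
    ids, depth = index.search([rng.gauss(0, 1) for _ in range(32)], k=5)
    print("top-5 ids:", ids, "stopped at level:", depth)
```

In this sketch, `margin_threshold` is the only knob that trades latency for accuracy: a threshold of `0.0` always stops at the coarsest level, while `float("inf")` forces every query down to the full-resolution level whenever the index holds more than `k` vectors. An uncertainty-aware controller like the one the article describes would replace this fixed threshold with a per-query estimate.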