ai-digest.dev
last updated 13 h ago
RAG · arXiv cs.CL · 8 d ago

When More Documents Hurt RAG: Mitigating Vector Search Dilution with Domain-Scoped, Model-Agnostic Retrieval

The paper introduces MASDR-RAG (Multi-Agent Scoped Domain Retrieval for RAG), which addresses vector search dilution: the loss of retrieval quality that occurs when a retrieval-augmented generation (RAG) system scales to a large, heterogeneous document collection. The authors report accuracy falling from 75% to below 40% as the corpus grew from 54 to 1,128 documents. Restricting retrieval to a domain using organizational metadata raised precision at rank 10 (P@10) from 0.77 to 0.86 across multiple LLM backbones and corpora. The practical guideline for AI practitioners working with RAG systems is to scope before synthesis: narrow the candidate pool first, then generate.
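To make the idea concrete, here is a minimal sketch of metadata scoping ahead of a vector search, together with a P@k calculation. It is not the authors' implementation: the `domain` field, the two-dimensional toy vectors, and the helper names `retrieve` and `precision_at_k` are all assumptions made for illustration, and the numbers it produces have nothing to do with the paper's results.

```python
import math
from typing import Any, Dict, List, Optional, Sequence, Set


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query_vec: Sequence[float],
    docs: List[Dict[str, Any]],
    k: int = 10,
    domain: Optional[str] = None,
) -> List[str]:
    """Return the ids of the top-k documents by cosine similarity.

    If `domain` is given, documents outside that domain are dropped
    before ranking (the hypothetical "scope first" step).
    """
    pool = [d for d in docs if domain is None or d["domain"] == domain]
    ranked = sorted(pool, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k]]


def precision_at_k(retrieved_ids: List[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved ids that are relevant."""
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / k


# Toy corpus (made up): a finance document sits very close to the HR query.
docs = [
    {"id": "hr-1", "domain": "hr", "vec": [1.0, 0.0]},
    {"id": "hr-2", "domain": "hr", "vec": [0.9, 0.1]},
    {"id": "fin-1", "domain": "finance", "vec": [1.0, 0.05]},
    {"id": "fin-2", "domain": "finance", "vec": [0.2, 1.0]},
]
query = [1.0, 0.0]
relevant = {"hr-1", "hr-2"}

unscoped = retrieve(query, docs, k=2)
scoped = retrieve(query, docs, k=2, domain="hr")

print(unscoped, precision_at_k(unscoped, relevant, k=2))
print(scoped, precision_at_k(scoped, relevant, k=2))
```

In this toy setup the unscoped search lets a near-duplicate from another domain displace a relevant document, while the scoped search never considers it. A real system would presumably apply the same filter through its vector store's metadata filtering rather than a Python list comprehension.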
