RAG · arXiv cs.AI · 15 d ago

ELVA: Exploring Ranking-Driven Universal Multimodal Retrieval

The paper introduces ELVA, a rule-based reinforcement learning framework for Universal Multimodal Retrieval (UMR). It targets grain blindness by treating each negative sample according to its similarity to the positive samples. ELVA extends Reinforcement Learning with Verifiable Rewards (RLVR) to retrieval tasks, and the paper also introduces MRBench, a benchmark for evaluating multi-grain queries. ELVA reports state-of-the-art performance on standard retrieval benchmarks and a 13.1% improvement on MRBench, which makes it relevant to practitioners who need retrieval models that handle multi-grain queries.
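To make the idea of a rule-based, verifiable reward for ranking concrete, here is a minimal sketch in Python. It is not ELVA's reward, which this summary does not specify. It assumes the policy outputs an ordered list of candidate IDs and scores that list by the reciprocal rank of the known positive. The function name `reciprocal_rank_reward` and the candidate IDs are hypothetical, and the similarity-aware treatment of negatives described above is not modeled.

```python
from typing import Sequence


def reciprocal_rank_reward(ranking: Sequence[str], positive_id: str) -> float:
    """Illustrative rule-based reward (an assumption, not the paper's formula).

    Returns 1/rank of the positive candidate in the predicted ranking,
    or 0.0 if the positive candidate does not appear at all.
    """
    for position, candidate_id in enumerate(ranking, start=1):
        if candidate_id == positive_id:
            return 1.0 / position
    return 0.0


if __name__ == "__main__":
    # Hypothetical candidate IDs; "img_7" is the labeled positive.
    predicted = ["img_3", "img_7", "img_1"]
    print(reciprocal_rank_reward(predicted, "img_7"))
```

Such a reward is verifiable because it depends only on the predicted ordering and the ground-truth label, with no learned reward model involved.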

retrieval · contrastive-learning · multimodal · relevance 0.00 · engagement 0.00
