Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning
This article introduces Retrieval-Augmented Reinforcement Fine-Tuning (RA-RFT), a framework that improves language models' reasoning by teaching them to reason by analogy. RA-RFT first uses gold-relevance distillation to train a retriever that ranks contexts by their expected reasoning benefit rather than by semantic similarity alone, then applies reinforcement fine-tuning with the retrieved contexts. On mathematical reasoning benchmarks, RA-RFT raises average accuracy on AIME 2025 by 7.1 points for Qwen3-1.7B and by 2.8 points for Qwen3-4B, suggesting that reasoning-aware retrieval can improve model performance on complex reasoning tasks.
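To make the distinction between similarity-based and benefit-based ranking concrete, here is a minimal sketch in Python. It is not the article's implementation: the `Candidate` fields, the definition of benefit as the gain in solve rate over a no-context baseline, the softmax temperature, and all numbers are illustrative assumptions, since the summary does not specify how gold relevance is computed or distilled.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A retrievable context (hypothetical structure, for illustration only)."""
    name: str
    similarity: float       # semantic similarity to the query, e.g. cosine
    solve_rate_with: float  # fraction of sampled solutions correct when this context is given


def reasoning_benefit(cand: Candidate, baseline_solve_rate: float) -> float:
    # Assumed proxy for "expected reasoning benefit":
    # how much the context raises the solve rate over using no context.
    return cand.solve_rate_with - baseline_solve_rate


def soft_targets(benefits: list[float], temperature: float = 0.1) -> list[float]:
    # Softmax over benefits; a retriever could be trained to match this
    # distribution (one possible distillation target, assumed here).
    peak = max(benefits)  # subtract the max for numerical stability
    exps = [math.exp((b - peak) / temperature) for b in benefits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up numbers: the most similar context is not the most helpful one.
baseline = 0.25
candidates = [
    Candidate("A", similarity=0.92, solve_rate_with=0.25),
    Candidate("B", similarity=0.71, solve_rate_with=0.75),
    Candidate("C", similarity=0.55, solve_rate_with=0.50),
]

by_similarity = [c.name for c in sorted(candidates, key=lambda c: c.similarity, reverse=True)]
by_benefit = [
    c.name
    for c in sorted(candidates, key=lambda c: reasoning_benefit(c, baseline), reverse=True)
]
targets = soft_targets([reasoning_benefit(c, baseline) for c in candidates])

print("ranked by similarity:", by_similarity)
print("ranked by benefit:   ", by_benefit)
```

In this toy data, context A is the closest match semantically but adds nothing over the baseline, so a benefit-aware ranking places B first. The `targets` list puts most of its probability mass on B, which is the kind of signal a reasoning-aware retriever would be trained to reproduce under these assumptions.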