Training
Reward-SQL: Boosting Text-to-SQL via Stepwise Execution-Aware Reasoning and Process-Supervised Rewards
The paper introduces Reward-SQL, a framework that improves Text-to-SQL performance through stepwise, execution-aware reasoning and process-supervised rewards. It uses a divide-and-conquer strategy that breaks a query into structured Common Table Expressions (CTEs), and it adds a process reward model (PRM) that brings execution-aware scoring into both reinforcement learning training and inference. The reported experiments show significant improvements over baseline models, which makes the approach relevant to practitioners who need to generate complex SQL queries.
text-to-sql, reasoning, rewards
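To make the idea of CTE-based decomposition with step-level, execution-aware scoring more concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It is not the paper's implementation: the real PRM is a learned model, whereas the hypothetical `score_steps` helper below only checks whether each CTE prefix executes. The `orders` table, the helper names, and the binary 1.0/0.0 scores are all assumptions made for illustration.

```python
import sqlite3


def build_query(steps, upto):
    """Assemble a WITH query from the first `upto` CTE steps,
    selecting everything from the last included step."""
    ctes = ",\n".join(f"{name} AS ({body})" for name, body in steps[:upto])
    last_name = steps[upto - 1][0]
    return f"WITH {ctes}\nSELECT * FROM {last_name}"


def score_steps(conn, steps):
    """Hypothetical stand-in for a process reward: score each step 1.0
    if the query built from steps 1..i executes, else 0.0.
    A learned PRM would replace this heuristic."""
    scores = []
    for i in range(1, len(steps) + 1):
        try:
            conn.execute(build_query(steps, i)).fetchall()
            scores.append(1.0)
        except sqlite3.Error:
            scores.append(0.0)
    return scores


def demo():
    # Toy schema and data, assumed purely for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "ann", 30.0), (2, "bob", 5.0), (3, "ann", 20.0)],
    )
    # Question: which customers spent more than 40 in total?
    good = [
        ("totals", "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"),
        ("big", "SELECT customer FROM totals WHERE total > 40"),
    ]
    bad = [
        good[0],
        ("big", "SELECT customer FROM totals WHERE totl > 40"),  # misspelled column
    ]
    rows = conn.execute(build_query(good, 2)).fetchall()
    return score_steps(conn, good), score_steps(conn, bad), rows


if __name__ == "__main__":
    print(demo())
```

Because every step is a named CTE, each prefix of the chain is itself an executable query, so a failure can be attributed to a specific step rather than to the query as a whole. Under the same assumptions, step scores like these could be averaged to rank several candidate queries at inference time.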