Research
RoSE: Round-robin Synthetic Data Evaluation for Selecting LLM Generators without Human Test Sets
The paper introduces Round-robin Synthetic data Evaluation (RoSE), a proxy metric for selecting the most effective LLM generator of synthetic training data without relying on human-annotated test sets. RoSE scores each candidate LLM by training a smaller model on that LLM's outputs and measuring the smaller model's performance across multiple tasks. It identifies the best generator more reliably than traditional intrinsic heuristics, with results within 0.76 percentage points of the optimal baseline. The approach is particularly relevant for practitioners working with low-resource languages, where human test sets are scarce, because it offers a selection signal that aligns closely with downstream performance.
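The selection loop described above can be illustrated with a minimal sketch. The summary does not say which data the small model is evaluated on; this sketch assumes, going by the "round-robin" in the name, that a model trained on one candidate's synthetic data is scored on the synthetic data of every other candidate, and that the scores are averaged. The function names (`train_small_model`, `accuracy`, `rose_scores`, `select_generator`) and the word-vote classifier are hypothetical stand-ins, not the paper's implementation.

```python
from collections import Counter, defaultdict
from statistics import mean


def train_small_model(examples):
    """Toy stand-in for the small model: each word votes for the labels it co-occurred with."""
    votes = defaultdict(Counter)
    for text, label in examples:
        for word in text.lower().split():
            votes[word][label] += 1
    # Majority label of the training set, used when no word of the input is known.
    fallback = Counter(label for _, label in examples).most_common(1)[0][0]

    def predict(text):
        tally = Counter()
        for word in text.lower().split():
            tally.update(votes.get(word, {}))
        return tally.most_common(1)[0][0] if tally else fallback

    return predict


def accuracy(predict, examples):
    """Fraction of (text, label) pairs that the model labels correctly."""
    return mean(1.0 if predict(text) == label else 0.0 for text, label in examples)


def rose_scores(synthetic_sets):
    """Assumed round-robin scoring: train on one candidate's data,
    evaluate on every other candidate's data, then average.

    synthetic_sets maps a candidate name to a list of (text, label) pairs
    and must contain at least two candidates.
    """
    scores = {}
    for name, train_set in synthetic_sets.items():
        model = train_small_model(train_set)
        scores[name] = mean(
            accuracy(model, other_set)
            for other_name, other_set in synthetic_sets.items()
            if other_name != name
        )
    return scores


def select_generator(synthetic_sets):
    """Return the highest-scoring candidate and the full score table."""
    scores = rose_scores(synthetic_sets)
    return max(scores, key=scores.get), scores
```

Calling `select_generator` with a dictionary of per-candidate synthetic datasets returns the preferred candidate together with every candidate's score. In practice the toy classifier would be replaced by whatever small model is fine-tuned for the task, and accuracy by the task's own metric.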
llm, synthetic-data, evaluation