Coding
Open-SWE-Traces: Advancing Dual-Mode Multilingual Distillation for Software Engineering Agents
The article introduces Open-SWE-Traces, a dataset of 207,489 agentic trajectories across nine programming languages, sourced from real-world pull requests. The trajectories are produced by hybrid-reasoning synthesis, using MiniMax-M2.5 for explicit reasoning and Qwen3.5-122B for high-quality traces, and are used to train models such as Qwen3-30B-A3B. The best model fine-tuned on this dataset achieved a resolve rate of 61.7% on SWE-bench Verified, which suggests the dataset can strengthen open-source LLMs on software engineering tasks.
software engineering, agents, dataset