ai-digest.dev
Research · arXiv cs.AI · 12 d ago

ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues

ReproRepo is a new framework for scaling reproducibility audits of machine learning research. It uses GitHub issues raised by humans as supervision for identifying reproduction blockers. In an evaluation on 1,149 recent papers, LLM agents, particularly Codex with GPT-5.5, surfaced relevant reproducibility issues, identifying at least one related blocker roughly 90% of the time. The framework gives practitioners a reusable way to assess how well LLM agents handle real-world reproducibility scenarios, a persistent challenge in validating scientific research.

llm · reproducibility · github