Research
ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues
ReproRepo is a new framework for scaling reproducibility audits. It uses human-raised GitHub issues as supervision for identifying reproduction blockers in machine learning research. In an evaluation on 1,149 recent papers, LLM agents, particularly Codex with GPT-5.5, surfaced relevant reproducibility issues, identifying at least one related blocker with a success rate of approximately 90%. The framework gives practitioners a reusable tool for assessing LLM agents on real-world reproducibility tasks, a persistent challenge in validating scientific research.
llm reproducibility github