Safety
Gaming AI-Assisted Peer Reviews Poses New Risks to the Scientific Community
A study posted as arXiv:2606.10159v1 identifies a vulnerability in AI-assisted peer review systems: superficially rephrasing a manuscript's abstract can significantly improve review outcomes. The reported attack success rate is approximately 38% for Gemini 3 Flash, and 50% in cases where the original review suggested rejection. Because the manipulation requires minimal time and cost, it raises concerns about the integrity of AI evaluations, since inflated review scores could bias editorial decisions toward acceptance regardless of scientific merit. The findings underscore the need for robust testing and oversight of AI tools in scientific evaluation to mitigate manipulation risks.
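To make the two reported figures concrete, here is a minimal sketch of how an attack success rate could be computed, both overall and restricted to submissions whose original review leaned toward rejection. It rests on assumptions not confirmed by the summary: that a "success" means the rephrased abstract received a strictly higher score than the original, and that "suggests rejection" means a score below some acceptance threshold. The names `ReviewPair`, `attack_success_rate`, and `accept_threshold` are hypothetical, and the scores are toy values invented for illustration, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class ReviewPair:
    """Hypothetical record: AI review scores before and after rephrasing."""
    original_score: int   # score for the unmodified abstract
    rephrased_score: int  # score for the superficially rephrased abstract


def attack_success_rate(pairs, accept_threshold=None):
    """Fraction of pairs where rephrasing strictly raised the score.

    Assumption: "success" = higher score after rephrasing.
    If accept_threshold is given, only pairs whose original score is
    below it (original review leaned toward rejection) are counted.
    """
    if accept_threshold is not None:
        pairs = [p for p in pairs if p.original_score < accept_threshold]
    if not pairs:
        return 0.0
    successes = sum(p.rephrased_score > p.original_score for p in pairs)
    return successes / len(pairs)


# Toy scores, invented purely for illustration (not from the paper).
pairs = [
    ReviewPair(3, 5),
    ReviewPair(6, 6),
    ReviewPair(4, 6),
    ReviewPair(7, 7),
    ReviewPair(2, 2),
    ReviewPair(8, 8),
]

overall = attack_success_rate(pairs)
rejected_only = attack_success_rate(pairs, accept_threshold=5)
print(f"overall: {overall:.2f}, originally rejected: {rejected_only:.2f}")
```

Conditioning on the original recommendation matters because papers already rated highly have little room to improve, so a rate computed only over initially rejected papers can differ substantially from the overall rate.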
peer review, ai, manipulation