IterCAD: An Iterative Multimodal Agent for Visually-Grounded CAD Generation and Editing
The paper introduces IterCAD, a multimodal agent framework for iterative Computer-Aided Design (CAD) generation and editing that addresses the limitations of existing one-shot methods. Its multi-turn interaction model covers tasks such as Drawing-to-Code and Text-to-Code. A data synthesis pipeline generates compliant engineering drawings, and progressive supervised fine-tuning (SFT) combined with geometry-aware reinforcement learning improves code executability. The authors also propose the IterCAD-Bench evaluation suite and the Chamfer Distance Tolerance-Recall metric for assessing code validity and geometric precision. In closed-loop iterative refinement, IterCAD is reported to outperform existing methods.
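
The summary names the Chamfer Distance Tolerance-Recall metric without defining it. The sketch below shows one plausible reading, not the paper's definition: it assumes the metric is the fraction of samples whose symmetric Chamfer distance to the reference geometry falls within a tolerance, with non-executable code (represented here as `None`) counted as a miss. The function names `chamfer_distance` and `tolerance_recall`, the point-cloud representation, and the unsquared sum-of-means form of the distance are all assumptions made for illustration.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed form: mean nearest-neighbour Euclidean distance from a to b,
    plus the same from b to a. Other variants (squared, averaged) exist.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances via broadcasting, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def tolerance_recall(predictions, references, tolerance):
    """Fraction of samples whose Chamfer distance is within `tolerance`.

    A prediction of None stands for code that failed to execute and is
    counted as a miss (an assumption, so that validity and geometric
    precision are captured in one number).
    """
    if len(references) == 0:
        raise ValueError("references must not be empty")
    hits = 0
    for pred, ref in zip(predictions, references):
        if pred is not None and chamfer_distance(pred, ref) <= tolerance:
            hits += 1
    return hits / len(references)
```

Under this reading, sweeping `tolerance` over several values would give a recall curve, so a method that produces valid but imprecise geometry can be distinguished from one that fails to execute at all.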