Agents
A History-Aware Visually Grounded Critic for Computer Use Agents
The paper introduces HiViG, a History-aware Visually Grounded framework that equips Computer Use Agents (CUAs) with a multimodal critic, which evaluates actions against past interactions and visual context. HiViG outperforms existing critics, improving success rates by 5.8% for Qwen3-VL-32B and 9.0% for Gemini-3-Flash across various benchmarks. For practitioners, the framework addresses common decision-making limitations in complex GUI environments, supporting more effective long-horizon planning and real-time error interception.
computer use agents, gui, critic models