Agents
VISTA: A Versatile Interactive User Simulation Toolkit for Agent Evaluation
VISTA (Versatile Interactive user Simulation Toolkit for Agent evaluation) is a proposed toolkit for evaluating interactive agents that addresses limitations in existing evaluation frameworks. It introduces a hybrid user simulator that supports both UI and API interactions, along with six metrics for assessing realism, capability coverage, and interaction effectiveness. For practitioners, the toolkit offers a more comprehensive evaluation method that helps identify agent capabilities and failure modes across varied interactive environments.
evaluation, user-simulation, agent