Agents · arXiv cs.AI · 15 d ago

IHBench: Evaluating Post-Interruption Recovery in Voice Agents with Structured Workflows

IHBench (Interruption Handling Benchmark) evaluates how well voice agents recover after being interrupted during structured workflows, covering 10 enterprise domains. The benchmark tests 27 audio-language model configurations, including models from OpenAI and Google as well as open-weight models, across six types of interruptions, and it scores both task fulfillment and recovery quality. Closed-weight models outperform open-weight models at handling interruptions and degrade more slowly as conversation length increases. The results suggest that interruption recovery deserves attention from practitioners building robust voice agents.

voice agents · benchmark · interruption handling
