Agents
PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions
PhoneHarness is a new mixed-action benchmark and execution environment for evaluating phone-use agents on real-world workflows that combine GUI, CLI, and tool actions. It routes actions deterministically and records auditable execution traces, so task completion is judged by observable side effects rather than by final outputs alone. With a reported pass rate of 75.0% on the benchmark, the results suggest that effective phone automation requires more than traditional GUI control, and that reliable action routing and verifiability are central to mobile agent performance.
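To make the routing and verification idea concrete, here is a minimal sketch of how a mixed-action harness might dispatch actions, log a trace, and check side effects. It is not PhoneHarness's actual API: the names `Action`, `Harness`, `run`, and `verify`, the three action kinds, and the dictionary standing in for device state are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    kind: str     # hypothetical labels: "gui", "cli", or "tool"
    payload: str  # hypothetical free-form argument for the handler


@dataclass
class Harness:
    # A plain dict stands in for device state (assumption, not a real device).
    state: Dict[str, str] = field(default_factory=dict)
    # Every executed action is appended here so the run can be audited later.
    trace: List[dict] = field(default_factory=list)

    def _gui(self, payload: str) -> str:
        self.state["screen"] = payload
        return f"opened screen {payload}"

    def _cli(self, payload: str) -> str:
        key, _, value = payload.partition("=")
        self.state[key] = value
        return f"set {key}"

    def _tool(self, payload: str) -> str:
        self.state["last_tool"] = payload
        return f"called {payload}"

    def run(self, action: Action) -> str:
        # Deterministic routing: the action kind alone selects the handler.
        handlers: Dict[str, Callable[[str], str]] = {
            "gui": self._gui,
            "cli": self._cli,
            "tool": self._tool,
        }
        if action.kind not in handlers:
            raise ValueError(f"unknown action kind: {action.kind}")
        result = handlers[action.kind](action.payload)
        self.trace.append({
            "step": len(self.trace) + 1,
            "kind": action.kind,
            "payload": action.payload,
            "result": result,
        })
        return result


def verify(harness: Harness, expected: Dict[str, str]) -> bool:
    # Pass only if every expected side effect is present in the final state,
    # regardless of what the agent claims as its final output.
    return all(harness.state.get(k) == v for k, v in expected.items())


def demo() -> Harness:
    harness = Harness()
    for action in (
        Action("gui", "settings"),
        Action("cli", "wifi=on"),
        Action("tool", "send_message"),
    ):
        harness.run(action)
    return harness
```

Calling `verify(demo(), {"wifi": "on"})` checks the simulated state rather than any text the agent returns, which is the distinction the summary draws between side effects and final outputs.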
phone-agents, workflow