Agents
Can LLMs Be CEOs? Benchmarking Strategic Resource Reallocation with Multi-Role Agent Simulation
The paper introduces CEO-Bench, a multi-agent benchmark for evaluating the strategic decision-making of large language models (LLMs) in executive resource reallocation. LLM agents in the CEO role must weigh conflicting advice from role-conditioned C-suite advisors, and they are assessed on dimensions such as role integration and plan validity. The five tested models diverge significantly in strategic calibration and show limited ability to synthesize conflicting information, a capability that is critical for AI-assisted executive systems.
llm, ceo, benchmark, decision-making