Agents · arXiv cs.CL — 8 d ago

MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft

The paper introduces MineExplorer, a benchmark for evaluating the open-world exploration capabilities of multimodal large language models (MLLMs) in Minecraft. It organizes tasks in a ReAct-style formulation and incorporates a multi-agent synthesis workflow that improves task reliability over single-agent approaches. The findings indicate that advanced MLLMs perform well on simpler tasks but struggle with complex, multi-hop ones, highlighting the limits of current models' exploration abilities and the need for improved coordination in dynamic environments.

mllm · exploration · minecraft