Agents
Orchestra-o1: Omnimodal Agent Orchestration
Orchestra-o1 is a proposed omnimodal agent orchestration framework for coordinating agents across text, image, audio, and video. Its unified orchestration mechanism combines modality-aware task decomposition, online sub-agent specialization, and parallel sub-task execution, and it is reported to achieve a 10.3% accuracy improvement over the second-best method on the OmniGAIA benchmark. The Orchestra-o1-8B model is trained with decision-aligned group relative policy optimization (DA-GRPO) and is reported to outperform existing open-source omnimodal agents, which makes it relevant to practitioners building complex multimodal AI systems.
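To make the decompose-then-run-in-parallel idea concrete, here is a minimal Python sketch. It is not Orchestra-o1's implementation: the names `SubTask`, `decompose`, `run_sub_agent`, and `orchestrate` are hypothetical, the decomposition rule (one sub-task per modality present in the input) is an assumption made for illustration, and the sub-agent is a stub where a real system would call a model.

```python
import asyncio
from dataclasses import dataclass

# Modalities named in the summary; the ordering here is an arbitrary choice.
MODALITIES = ("text", "image", "audio", "video")


@dataclass
class SubTask:
    modality: str
    payload: str


def decompose(task: dict[str, str]) -> list[SubTask]:
    """Hypothetical modality-aware decomposition: one sub-task per known modality."""
    return [SubTask(m, task[m]) for m in MODALITIES if m in task]


async def run_sub_agent(sub: SubTask) -> str:
    """Stub for a specialized sub-agent; a real one would call a model here."""
    await asyncio.sleep(0)  # yield to the event loop, standing in for I/O-bound work
    return f"[{sub.modality}] processed: {sub.payload}"


async def orchestrate(task: dict[str, str]) -> list[str]:
    """Decompose the task, then run all sub-agents concurrently."""
    subtasks = decompose(task)
    # asyncio.gather schedules the coroutines concurrently and keeps input order.
    return await asyncio.gather(*(run_sub_agent(s) for s in subtasks))


if __name__ == "__main__":
    example = {"text": "summarize the clip", "audio": "speech.wav", "video": "clip.mp4"}
    for line in asyncio.run(orchestrate(example)):
        print(line)
```

The sketch uses `asyncio` because sub-agent calls are typically I/O-bound (waiting on a model endpoint); CPU-bound sub-tasks would more likely need a process pool instead.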
multi-agent, orchestration, llm