ai-digest.dev
Agents · arXiv cs.AI · 12 d ago

Divide, Deliberate, Decide: A Multi-Agent Framework for Fine-Grained Egocentric Action Recognition

The paper presents "Divide, Deliberate, Decide", a multi-agent framework for fine-grained egocentric action recognition that runs fully locally and in a zero-shot manner. A Vision-Language Model (VLM) orchestrator segments the video and proposes candidate labels. A set of diverse VLM specialists then deliberates over those candidates, and a Borda count produces the final ranking. By drawing on the differing priors of the specialist models, the method improves zero-shot performance without any fine-tuning, which makes it relevant to practitioners who need action recognition that separates fine-grained, visually similar actions.
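
To make the final aggregation step concrete, here is a minimal sketch of a standard Borda count in Python. It is not the paper's implementation: the function name `borda_count`, the three example specialists, the candidate labels, and the alphabetical tie-break are all assumptions chosen for illustration. Each specialist is assumed to return the same candidates ordered best-first, and a label in position `i` of an `n`-item ranking earns `n - 1 - i` points.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate best-first rankings into one list of (label, score) pairs.

    Hypothetical helper for illustration; the paper's exact scoring
    and tie-breaking rules are not given in this summary.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            # Top choice earns n - 1 points, last choice earns 0.
            scores[label] += n - 1 - position
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up rankings from three hypothetical VLM specialists.
specialist_rankings = [
    ["cut onion", "slice onion", "peel onion"],
    ["slice onion", "cut onion", "peel onion"],
    ["cut onion", "peel onion", "slice onion"],
]

result = borda_count(specialist_rankings)
print(result)
```

With these made-up rankings, "cut onion" collects 2 + 1 + 2 = 5 points, "slice onion" 1 + 2 + 0 = 3, and "peel onion" 0 + 0 + 1 = 1. The label that is consistently ranked near the top wins even though one specialist preferred a different first choice.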

action recognition · multi-agent · relevance 0.00 · engagement 0.00