ai-digest.dev
last updated 4 h ago
Multimodal · arXiv cs.AI · 7 d ago

CineOrchestra: Unified Entity-Centric Conditioning for Cinematic Video Generation

CineOrchestra is a unified video diffusion model for cinematic video generation that integrates multi-subject personalization, temporal control, multi-shot synthesis, and camera control in a single framework. It uses entity-centric conditioning primitives and two parameter-free, coordinated rotary embeddings: a temporal RoPE that keeps attention consistent across varying durations, and a 2D entity-temporal cross-attention RoPE that routes attention to specific entities. On newly introduced benchmarks, CineOrchestra outperforms six specialized models in dense caption following and shot-transition timing, which makes it relevant to practitioners who want finer control over video generation.
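
For readers unfamiliar with rotary embeddings, here is a minimal sketch of standard RoPE in plain Python. It is not CineOrchestra's implementation. It only illustrates why such embeddings are parameter-free (the rotation angles are fixed functions of position) and why attention scores end up depending on relative rather than absolute position. The `rope_2d` helper is a hypothetical illustration of a two-axis variant that splits channels between an entity index and a time coordinate; the summary does not say how the paper constructs its 2D embedding.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive channel pairs of `vec` by angles proportional to `position`.

    Standard rotary embedding: pair j uses frequency base ** (-2j / dim).
    Nothing here is learned, so the embedding adds no parameters.
    """
    dim = len(vec)
    if dim % 2:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):  # i is the even channel index, i.e. 2j
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        out += [vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c]
    return out


def rope_2d(vec, entity_index, time, base=10000.0):
    """Hypothetical 2D variant (an assumption, not the paper's design):
    the first half of the channels encodes the entity index,
    the second half encodes the time coordinate."""
    half = len(vec) // 2
    return rope(vec[:half], entity_index, base) + rope(vec[half:], time, base)


def dot(a, b):
    """Plain dot product, standing in for an unnormalized attention score."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [0.3, -1.2, 0.8, 0.5, -0.7, 1.1, 0.2, -0.4]
    k = [1.0, 0.4, -0.6, 0.9, 0.1, -0.3, 0.7, 0.5]
    # Same offset (4 steps) at two different absolute positions.
    print(dot(rope(q, 3), rope(k, 7)))
    print(dot(rope(q, 13), rope(k, 17)))
```

The two printed scores agree up to floating-point error, because rotating the query and key by position-dependent angles leaves their dot product a function of the position difference only.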

video generation · cinematic · relevance 0.00 · engagement 0.00