ai-digest.dev
Multimodal · arXiv cs.AI · 2 d ago

BiWM: Advancing Open-Source Interactive Video World Models with Bidirectional Autoregression

BiWM is presented as the first full-stack framework for interactive video world models built on bidirectional autoregression. It cuts the training pipeline from four stages to two, a change credited with improving both generation quality and inference speed. The framework supports models ranging from Wan2.1-1.3B to LTX-2.3-22B and includes camera-control fine-tuning, pluggable history compression, and an optional 4-bit NVFP4 pipeline for training and inference. For practitioners, the stated benefit is better controllability and fidelity in video generation than existing causal models such as minWM provide.

video · autoregressive · models