ai-digest.dev
Research · arXiv cs.AI · 21 h ago

Can Image Models Imagine Time? ImageTime: A Novel Benchmark for Probing Visual World Modeling Through Spatiotemporal Consistency

The paper introduces ImageTime, a benchmark that evaluates whether image generation models can maintain spatiotemporal consistency across multiple visual states. Models must generate a sequence of four keyframes (initial state, action onset, transition, and final state), a harder task than single-image generation. The benchmark is intended as a diagnostic of the temporal capabilities of current models: it uses GPT-5.5 for structured scoring and exposes strengths and weaknesses in visual world modeling, a capability that matters for applications such as storyboarding and video previsualization.

image-models · spatiotemporal-consistency · benchmark
relevance 0.00 · engagement 0.00