Multimodal · arXiv cs.AI · 12 d ago

Detail++: Training-Free Detail Enhancer for Text-to-Image Diffusion Models

Detail++ is a training-free framework for text-to-image (T2I) diffusion models. Its Progressive Detail Injection (PDI) strategy decomposes a complex prompt into a sequence of simplified sub-prompts and generates the image in stages. Self-attention is used to establish the global composition, while cross-attention, together with a Centroid Alignment Loss, improves attribute consistency and reduces binding noise. The authors report that Detail++ outperforms existing methods on T2I-CompBench and on a new style composition benchmark, which makes it relevant to practitioners working with complex multi-object prompts.
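The summary names a Centroid Alignment Loss but does not give its formula. As a rough illustration only, the sketch below assumes the loss penalizes the distance between the attention-weighted centroid of an attribute token's cross-attention map and that of its subject token. The function names, the squared-distance form, and the toy 3x3 maps are all assumptions made for this example, not the paper's definition.

```python
from typing import List, Tuple

Map2D = List[List[float]]  # attention weights on an H x W grid (toy stand-in)


def centroid(attn: Map2D) -> Tuple[float, float]:
    """Attention-weighted mean (row, col) position of a 2D map."""
    total = sum(sum(row) for row in attn)
    if total <= 0:
        raise ValueError("attention map must have positive mass")
    r = sum(i * w for i, row in enumerate(attn) for w in row) / total
    c = sum(j * w for row in attn for j, w in enumerate(row)) / total
    return r, c


def centroid_alignment_penalty(attr_attn: Map2D, subj_attn: Map2D) -> float:
    """Squared distance between the two centroids.

    Hypothetical form chosen for illustration; the paper's loss may differ.
    """
    ar, ac = centroid(attr_attn)
    sr, sc = centroid(subj_attn)
    return (ar - sr) ** 2 + (ac - sc) ** 2


# Toy maps: the subject token attends to the center cell, while the
# attribute token has drifted to the top-left corner.
subject = [[0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0]]
drifted = [[1.0, 0.0, 0.0],
           [0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]

aligned_penalty = centroid_alignment_penalty(subject, subject)
drifted_penalty = centroid_alignment_penalty(drifted, subject)
```

Under this assumed form, the penalty is zero when the attribute's attention sits on its subject and grows as it drifts toward another region, which is the kind of signal that could discourage an attribute from binding to the wrong object.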

text-to-image · diffusion models
