ai-digest.dev
Research · arXiv cs.AI · 15 d ago

Rethinking Cross-Layer Information Routing in Diffusion Transformers

The paper introduces Diffusion-Adaptive Routing (DAR), an approach to cross-layer information routing in Diffusion Transformers (DiTs) that replaces conventional residual connections with a learnable, timestep-adaptive aggregation. The authors' empirical analysis identifies problems with conventional residual addition. On ImageNet 256x256, DAR improves the FID of the SiT-XL/2 model by 2.11 while requiring 8.75 times fewer training iterations. Beyond faster training, the method preserves high-frequency detail during fine-tuning, which suggests that cross-layer routing is a promising target for optimization in diffusion model architectures.
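
The summary does not give DAR's actual formulation, so the following is only a minimal sketch of what "learnable, timestep-adaptive aggregation" could look like in general, not the paper's method. It assumes the routing weights are a softmax over earlier layer outputs, computed from simple timestep features. Every name here (`timestep_features`, `routing_weights`, `aggregate`, `gate_w`, `gate_b`) is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def timestep_features(t):
    """Two smooth features of a timestep t in [0, 1] (an illustrative choice)."""
    return [math.sin(0.5 * math.pi * t), math.cos(0.5 * math.pi * t)]

def routing_weights(t, gate_w, gate_b):
    """One weight per earlier layer output, conditioned on the timestep.

    gate_w (one row of two values per layer) and gate_b stand in for
    learnable parameters; this is an assumption, not the paper's design.
    """
    feats = timestep_features(t)
    scores = [sum(w * f for w, f in zip(row, feats)) + b
              for row, b in zip(gate_w, gate_b)]
    return softmax(scores)

def aggregate(layer_outputs, weights):
    """Weighted sum of earlier layer outputs, replacing plain residual addition."""
    dim = len(layer_outputs[0])
    return [sum(w * h[d] for w, h in zip(weights, layer_outputs))
            for d in range(dim)]

# Toy example: three earlier layer outputs, each a 2-dimensional vector.
outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
gate_w = [[2.0, 0.0], [0.0, 0.0], [0.0, 2.0]]
gate_b = [0.0, 0.0, 0.0]

early = aggregate(outputs, routing_weights(0.0, gate_w, gate_b))
late = aggregate(outputs, routing_weights(1.0, gate_w, gate_b))
```

With these toy parameters the mix shifts from the third layer's output at `t = 0.0` toward the first layer's output at `t = 1.0`, whereas a plain residual stream would add all outputs with fixed, equal weight at every timestep.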

transformers · information-routing