Research · arXiv cs.AI · 15 d ago

DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models

The paper introduces DPRM (Doob-transform Process Reward Model), a plug-in token-ordering module for diffusion language models. It changes how tokens are ordered without altering the host model's architecture or denoising objective. DPRM replaces confidence-driven ordering with process-reward-guided ordering, and the authors report improved performance across nine host models in domains including language reasoning and multimodal tasks. For practitioners, the appeal is an ordering policy that plugs into an existing model and may improve its performance and efficiency.
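To make the contrast between the two ordering styles concrete, here is a minimal Python sketch. It is not the paper's algorithm: the function names, the `MASK` placeholder, and the toy reward are all hypothetical, and the summary does not say how DPRM computes its process reward or applies the Doob h-transform. The sketch only shows the general idea that the next position to decode can be chosen by a swappable scoring rule while the host model's probabilities stay untouched.

```python
from typing import Callable, List, Optional, Sequence

MASK = None  # hypothetical marker for a position that is still undecoded

Tokens = Sequence[Optional[int]]
Probs = Sequence[Sequence[float]]  # probs[pos][token_id], as produced by a host model
Scorer = Callable[[Tokens, Probs, int], float]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def confidence_score(tokens: Tokens, probs: Probs, pos: int) -> float:
    """Confidence-driven rule: score a position by its top token probability."""
    return max(probs[pos])


def make_reward_score(reward_fn: Callable[[List[Optional[int]]], float]) -> Scorer:
    """Reward-guided rule (illustrative): score a position by the reward of the
    partial sequence obtained after committing that position's top token."""
    def score(tokens: Tokens, probs: Probs, pos: int) -> float:
        candidate = list(tokens)
        candidate[pos] = argmax(probs[pos])
        return reward_fn(candidate)
    return score


def next_position(tokens: Tokens, probs: Probs, scorer: Scorer) -> int:
    """Pick the still-masked position with the highest score."""
    masked = [i for i, t in enumerate(tokens) if t is MASK]
    return max(masked, key=lambda i: scorer(tokens, probs, i))


def toy_reward(candidate: List[Optional[int]]) -> float:
    """Toy stand-in for a learned process reward: length of the decoded prefix."""
    for i, t in enumerate(candidate):
        if t is MASK:
            return float(i)
    return float(len(candidate))


probs = [[0.6, 0.4], [0.1, 0.9], [0.9, 0.1]]
tokens = [MASK, 1, MASK]

by_confidence = next_position(tokens, probs, confidence_score)
by_reward = next_position(tokens, probs, make_reward_score(toy_reward))
print(by_confidence, by_reward)
```

In this toy setup the confidence rule picks the position where the model is most certain, while the reward rule picks the position that most improves the reward of the partial sequence. Only the scorer passed to `next_position` changes, which mirrors the plug-in framing in the summary.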

diffusion-models · token-ordering
