Multimodal
FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining
FreeStyle is a framework for style-content dual-reference image generation that mines community LoRAs to build large-scale triplet datasets for training. It trains with a two-stage curriculum and adds mechanisms to prevent style leakage, including an attention-level enrichment constraint and a frequency-aware RoPE modulation strategy. The work also introduces a benchmark that evaluates style similarity, content preservation, and leakage rejection, and it reports a better balance among these three criteria.
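To make the idea of a training triplet concrete, here is a minimal sketch of how style-reference, content-reference, and target images might be grouped. The summary does not describe the actual mining pipeline, so everything here is an assumption made for illustration: the `Triplet` fields, the `build_triplets` helper, and the pairing rule (same style LoRA, different prompt for the style reference).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Hypothetical record layout; not taken from the FreeStyle paper."""
    style_ref: str    # image that shows the desired style
    content_ref: str  # image that shows the desired content
    target: str       # image that combines that style with that content


def build_triplets(stylized, plain):
    """Pair images into (style_ref, content_ref, target) triplets.

    stylized: dict mapping (style_id, prompt_id) -> image path, rendered
              with a style LoRA (assumed input format).
    plain:    dict mapping prompt_id -> image path, rendered without a
              style LoRA (assumed input format).
    """
    triplets = []
    for (style_id, prompt_id), target in stylized.items():
        content_ref = plain.get(prompt_id)
        if content_ref is None:
            continue  # no unstyled counterpart for this prompt
        for (other_style, other_prompt), style_ref in stylized.items():
            # Assumed rule: the style reference shares the style but shows
            # different content, so it cannot reveal the target's content.
            if other_style == style_id and other_prompt != prompt_id:
                triplets.append(Triplet(style_ref, content_ref, target))
    return triplets


stylized = {("ink", "cat"): "ink_cat.png", ("ink", "dog"): "ink_dog.png"}
plain = {"cat": "cat.png", "dog": "dog.png"}
triplets = build_triplets(stylized, plain)
```

In this toy run, each stylized image becomes a target once, paired with the unstyled image of the same prompt and a same-style image of the other prompt.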
style-content, dual-reference, generation, lora