ai-digest.dev
last updated 2 h ago
Training · arXiv cs.AI · 8 d ago

Post-Hoc Merging is Not Enough: Many-Shot Model Merging with Loss-Gap Balancing

The paper introduces METIS, an iterative many-shot model merging method that improves multi-task performance by countering the information erasure that task interference causes during merging. Unlike traditional post-hoc merging, METIS uses task-wise loss-gap weighting and consensus-based masking when combining task-specialized models. According to the paper, this improves performance on the worst-performing tasks in particular, which makes the method relevant to practitioners building robust multi-task large language models.

model merging · multi-task · llm
relevance 0.00 · engagement 0.00