Training
Post-Hoc Merging is Not Enough: Many-Shot Model Merging with Loss-Gap Balancing
The paper introduces METIS, an iterative many-shot model merging method that improves multi-task performance by countering the information erasure that task interference causes during merging. Unlike traditional post-hoc merging, METIS weights each task by its loss gap and applies consensus-based masking when combining task-specialized models. The approach is reported to improve performance on the worst-performing tasks in particular, which makes it relevant to practitioners building robust multi-task large language models.
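To make the two ingredients concrete, here is a minimal sketch of one merging step on toy parameter vectors. It is not the paper's implementation: the summary gives no formulas, so the softmax over loss gaps, the sign-agreement rule, and every name here (`loss_gap_weights`, `consensus_mask`, `merge_step`, `temperature`, `min_agree`) are assumptions chosen for illustration.

```python
import math


def loss_gap_weights(merged_losses, expert_losses, temperature=1.0):
    """Assumed weighting: softmax over per-task loss gaps.

    The gap is how much worse the merged model is than the task expert,
    so tasks that the merge hurts most receive the largest weight.
    """
    gaps = [max(m - e, 0.0) / temperature
            for m, e in zip(merged_losses, expert_losses)]
    peak = max(gaps)  # subtract the max for numerical stability
    exps = [math.exp(g - peak) for g in gaps]
    total = sum(exps)
    return [x / total for x in exps]


def consensus_mask(task_vectors, min_agree=0.5):
    """Assumed masking: keep a coordinate only if strictly more than
    `min_agree` of the tasks share the same update sign."""
    n_tasks = len(task_vectors)
    mask = []
    for column in zip(*task_vectors):
        pos = sum(1 for v in column if v > 0)
        neg = sum(1 for v in column if v < 0)
        mask.append(1.0 if max(pos, neg) / n_tasks > min_agree else 0.0)
    return mask


def merge_step(base, task_vectors, weights, mask):
    """Add the masked, weighted sum of task vectors to the base parameters."""
    merged = []
    for i, b in enumerate(base):
        update = sum(w * tv[i] for w, tv in zip(weights, task_vectors))
        merged.append(b + mask[i] * update)
    return merged


# Toy example: two tasks, three parameters.
base = [0.0, 0.0, 0.0]
task_vectors = [[1.0, -1.0, 0.5],   # task A's update (expert minus base)
                [1.0, 1.0, 0.5]]    # task B's update
weights = loss_gap_weights(merged_losses=[2.0, 1.0], expert_losses=[1.0, 1.0])
mask = consensus_mask(task_vectors)
merged = merge_step(base, task_vectors, weights, mask)
```

In this toy case task A has the larger loss gap and therefore the larger weight, and the second coordinate is masked out because the two tasks disagree on its sign. An iterative method would re-evaluate the losses on the merged result and repeat the step, which is what separates this kind of scheme from a single post-hoc merge.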
model merging, multi-task, llm