arxiv:2510.21223

Model Merging with Functional Dual Anchors

Published on Oct 24
· Submitted by Kexuan Shi on Oct 27

Abstract

Functional Dual Anchors (FDAs) enhance model merging by aligning gradients with task vectors in the input-representation space, offering robustness and flexibility compared to parameter-space methods.

AI-generated summary

Model merging is an efficient post-training strategy for integrating knowledge from multiple finetuned checkpoints of a shared foundation model. Existing methods operate in the parameter space, combining task vectors to mitigate conflicts, but remain constrained by parameter inconsistencies. We propose Functional Dual Anchors (FDAs), a framework that instead models the input-representation space. FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model. This perspective bridges joint multi-task training and post-hoc merging, offering both robustness and flexibility. We further introduce a principled initialization scheme and show that FDAs are complementary to parameter-space model merging. Comprehensive experiments demonstrate the effectiveness of FDAs in model merging.
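One way to write the alignment idea down is sketched here. The notation is assumed for illustration and is not taken from this page: \theta_0 is the pretrained checkpoint, \theta_i the checkpoint finetuned on task i, f the model, and \ell a distance between outputs. The paper's exact objective may differ.

```latex
\tau_i = \theta_i - \theta_0, \qquad
g_i(x) = -\nabla_{\theta}\, \ell\big(f(x;\theta),\, f(x;\theta_i)\big)\Big|_{\theta=\theta_0}, \qquad
x_i^{\star} \in \arg\max_{x}\; \frac{\langle g_i(x),\, \tau_i \rangle}{\lVert g_i(x) \rVert\, \lVert \tau_i \rVert}
```

Under this reading, \tau_i is the task vector, g_i(x) is the update direction that a synthetic input x induces at the pretrained model, and an anchor x_i^{\star} is an input whose induced direction points along \tau_i.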

Community

Comment by the paper author and submitter:

This work offers a new perspective on model merging and a new way of using the knowledge stored in existing checkpoints.

Model merging is an appealing post-training strategy for integrating knowledge from existing checkpoints of a shared foundation model. Existing methods operate in the parameter space (i.e., on task vectors) and are therefore limited by the complexity of that space.

We propose Functional Dual Anchors (FDAs), a framework that instead models the knowledge in the input-representation space. Specifically, FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model. Then, we use the FDAs to adapt the pretrained model. FDAs provide an alternative perspective on model merging by extending input-space modeling to this setting and bridging joint multi-task training and post-hoc merging.
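The two steps described above (find synthetic inputs whose induced gradients align with the task vector, then use them to adapt the pretrained model) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation. It assumes a scalar linear model `f(w, x) = w . x`, a squared-error distance between outputs, and plain gradient steps. All names (`fit_anchor`, `adapt`, `induced_update`) and all hyperparameters are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def induced_update(w0, w_ft, x):
    # Negative gradient, taken at w = w0, of 0.5 * (w.x - w_ft.x)**2 w.r.t. w.
    residual = dot(w0, x) - dot(w_ft, x)
    return [-residual * xi for xi in x]

def fit_anchor(w0, w_ft, x_init, steps=500, lr=0.1):
    # Assumption: for this linear model the induced update equals (tau.x) * x,
    # so its cosine with tau is |cos(x, tau)|. We ascend that closed form in x.
    tau = [f - p for f, p in zip(w_ft, w0)]
    norm_tau = math.sqrt(dot(tau, tau))
    x = list(x_init)
    for _ in range(steps):
        norm_x = math.sqrt(dot(x, x))
        c = dot(tau, x) / (norm_x * norm_tau)
        sign = 1.0 if c >= 0 else -1.0
        grad = [sign * (t / (norm_x * norm_tau) - c * xi / norm_x ** 2)
                for t, xi in zip(tau, x)]
        x = [xi + lr * g for xi, g in zip(x, grad)]
    return x

def adapt(w0, anchors, targets, steps=200, lr=0.1):
    # Gradient descent from the pretrained weights on
    # sum_i 0.5 * (w.x_i - target_i)**2, where target_i is the output of
    # the i-th finetuned model on its own anchor.
    w = list(w0)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, t in zip(anchors, targets):
            r = dot(w, x) - t
            grad = [g + r * xi for g, xi in zip(grad, x)]
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

# Toy checkpoints (made-up numbers): one pretrained model, two finetuned ones.
w0 = [0.5, -0.2, 0.1]
w_a = [1.5, 0.3, 0.1]    # task vector [1.0, 0.5, 0.0]
w_b = [0.5, -0.7, 1.1]   # task vector [0.0, -0.5, 1.0]

x_a = fit_anchor(w0, w_a, [1.0, 1.0, 1.0])
x_b = fit_anchor(w0, w_b, [1.0, 1.0, 1.0])
merged = adapt(w0, [x_a, x_b], [dot(w_a, x_a), dot(w_b, x_b)])
```

In this toy setup each fitted anchor ends up nearly parallel to its task vector, and `merged` reproduces each finetuned model's output on that model's own anchor. The two task vectors here are not orthogonal, so `merged` generally differs from simply adding both task vectors to `w0`. Nothing in this sketch says how the approach behaves on real networks.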

💬 We welcome discussions, feedback, and collaborations on this direction!
