Abstract
Functional Dual Anchors (FDAs) enhance model merging by aligning gradients with task vectors in the input-representation space, offering robustness and flexibility compared to parameter-space methods.
Model merging is an efficient post-training strategy for integrating knowledge from multiple finetuned checkpoints of a shared foundation model. Existing methods operate in the parameter space, combining task vectors to mitigate conflicts, but remain constrained by parameter inconsistencies. We propose Functional Dual Anchors (FDAs), a framework that instead models the input-representation space. FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model. This perspective bridges joint multi-task training and post-hoc merging, offering both robustness and flexibility. We further introduce a principled initialization scheme and show that FDAs are complementary to parameter-space model merging. Comprehensive experiments demonstrate the effectiveness of FDAs in model merging.
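A rough formalization of the core idea (the exact objective in the paper may differ): let $\theta_0$ be the pretrained parameters and $\theta_i$ a finetuned checkpoint, so the task vector is $\tau_i = \theta_i - \theta_0$. An FDA for task $i$ is a set of synthetic inputs $\mathcal{X}_i$ chosen so that the gradient they induce at $\theta_0$ under some surrogate loss $\mathcal{L}$ (whose precise form is an assumption here) points along $\tau_i$:

$$
\mathcal{X}_i \;=\; \arg\max_{\mathcal{X}} \; \cos\!\big(-\nabla_{\theta}\,\mathcal{L}(\theta_0;\mathcal{X}),\ \tau_i\big),
$$

so that descending on $\mathcal{X}_i$ from $\theta_0$ reproduces the task-specific functional shift in the pretrained model.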
Community
This work presents a novel perspective on model merging as well as a new way of utilizing knowledge from existing checkpoints.
Model merging is an appealing post-training strategy for integrating knowledge from existing finetuned checkpoints of a shared foundation model. Existing methods operate in the parameter space (i.e., on task vectors) and therefore inherit the complexity and inconsistencies of that space.
We propose Functional Dual Anchors (FDAs), a framework that instead models this knowledge in the input-representation space. Specifically, FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model; the pretrained model is then adapted on these FDAs. FDAs offer an alternative perspective on model merging, extending input-space modeling to this setting and bridging joint multi-task training and post-hoc merging.
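A minimal PyTorch-style sketch of this two-step pipeline, assuming checkpoints that share one architecture; the surrogate losses, the `fit_anchor`/`adapt_with_anchors` helpers, and the hyperparameters are illustrative assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

# Illustrative sketch only: the surrogate losses and hyperparameters below are
# assumptions, not the paper's exact objective.

def fit_anchor(pretrained, finetuned, input_shape, steps=300, lr=1e-2):
    """Optimize one synthetic input whose induced gradient at the pretrained
    weights points along the task vector (finetuned - pretrained)."""
    params = list(pretrained.parameters())
    tau = torch.cat([(pf - p0).detach().flatten()
                     for pf, p0 in zip(finetuned.parameters(), params)])
    x = torch.randn(1, *input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Assumed functional discrepancy: gap between the pretrained and
        # finetuned models' outputs on the synthetic input.
        disc = F.mse_loss(pretrained(x), finetuned(x).detach())
        grads = torch.autograd.grad(disc, params, create_graph=True)
        g = torch.cat([gr.flatten() for gr in grads])
        # A descent step on `disc` moves parameters along -g, so maximize the
        # cosine similarity between -g and the task vector.
        loss = -F.cosine_similarity(-g, tau, dim=0)
        loss.backward()
        opt.step()
    return x.detach()

def adapt_with_anchors(pretrained, finetuned_models, anchors, steps=100, lr=1e-3):
    """Adapt the pretrained model jointly on all anchors, standing in for
    multi-task training on synthetic inputs."""
    opt = torch.optim.SGD(pretrained.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = sum(F.mse_loss(pretrained(x), ft(x).detach())
                   for ft, x in zip(finetuned_models, anchors))
        loss.backward()
        opt.step()
    return pretrained
```

In this reading, the anchors act as a compact synthetic proxy for each task's data, which is what lets post-hoc merging resemble joint multi-task training.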
💬 We welcome discussions, feedback, and collaborations on this direction!