Models and datasets from the paper "WPO: Enhancing RLHF with Weighted Preference Optimization".

Wenxuan Zhou (wzhouad)

Models (8)

wzhouad/Llama3-Instruct-8B-WPO-HB-v2 • Text Generation • 8B • 5
wzhouad/Llama3-Instruct-8B-WPO-HB • Text Generation • 8B • 1
wzhouad/zephyr-7B-WPO-HB • Text Generation • 7B
wzhouad/gemma-2-9b-it-WPO-HB • Text Generation • 9B • 7 • 34
wzhouad/gemma-2-9b-it-WPO-FP • Text Generation • 9B
wzhouad/zephyr-7B-WPO-FP • Text Generation • 7B
wzhouad/Llama3-Instruct-8B-WPO-FP • Text Generation • 8B
wzhouad/prix-lm • Text Generation • 1