CodeGoat24's Collections

UnifiedReward 1.0 Qwen2.5VL Models

updated 10 days ago

  • Unified Reward Model for Multimodal Understanding and Generation

    Paper • 2503.05236 • Published Mar 7 • 123

  • Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

    Paper • 2505.03318 • Published May 6 • 93

  • CodeGoat24/UnifiedReward-Think-qwen-7b

    8B • Updated Aug 29 • 354 • 3

  • CodeGoat24/UnifiedReward-qwen-32b

    33B • Updated Aug 29 • 20 • 1

  • CodeGoat24/UnifiedReward-qwen-7b

    8B • Updated Aug 29 • 4.45k • 6

  • CodeGoat24/UnifiedReward-qwen-3b

    4B • Updated Aug 29 • 18 • 1