GRAM-RR - a wangclnlp Collection

GRAM-RR

Updated Sep 7

Self-Training Generative Foundation Reward Models for Reward Reasoning

GRAM-R^2: Self-Training Generative Foundation Reward Models for Reward Reasoning

Paper • 2509.02492 • Published Sep 2 • 1

wangclnlp/GRAM-RR-LLaMA-3.1-8B-RewardModel

Text Generation • 8B • Updated Sep 4 • 4

wangclnlp/GRAM-RR-LLaMA-3.2-3B-RewardModel

Text Generation • 3B • Updated Sep 4 • 2

wangclnlp/GRAM-RR-TrainingData

Updated Sep 4 • 60