Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models
Paper • 2511.12464 • Published