arxiv:2510.05342
hyung gyu rho
sirano1004
AI & ML interests
None yet
Recent Activity
authored a paper about 1 month ago
Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization
upvoted a paper about 1 month ago
A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling
upvoted a paper about 1 month ago
Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization
Organizations
None yet