TMLR-Group-HF's Collections

Co-rewarding

Co-rewarding is a novel self-supervised RL framework that improves training stability by seeking complementary supervision from other views.