CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning Paper • 2509.20712 • Published Sep 25 • 19
RLEP: Reinforcement Learning with Experience Replay for LLM Reasoning Paper • 2507.07451 • Published Jul 10 • 5
Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval Paper • 2505.19650 • Published May 26 • 5
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model Paper • 2504.10068 • Published Apr 14 • 30