QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs • Paper • 2510.11696 • Published 13 days ago • 165
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe • Paper • 2509.18154 • Published Sep 16 • 48
Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities • Paper • 2505.15692 • Published May 21 • 14