MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe Paper • 2509.18154 • Published Sep 16 • 49
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 54
MiniCPM-o & MiniCPM-V Collection Multimodal models with leading performance. • 28 items • Updated Sep 1 • 56
RLPR: Extrapolating RLVR to General Domains without Verifiers Paper • 2506.18254 • Published Jun 23 • 31
RLPR Collection Extrapolating RLVR to General Domains without Verifiers • 6 items • Updated Aug 7 • 4
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 130
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness Paper • 2405.17220 • Published May 27, 2024 • 1
SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding Paper • 2308.10529 • Published Aug 21, 2023 • 1
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback Paper • 2312.00849 • Published Dec 1, 2023 • 12