wangrui

varuy322

AI & ML interests

None yet

Recent Activity

liked a dataset 11 days ago

google/deepsearchqa

Organizations

None yet

upvoted a collection 6 days ago

Nemotron-Pre-Training-Datasets

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 5 days ago • 82

upvoted a collection 11 days ago

Multimodal Implementations

Comprehensive Demo of Multimodal VLMs on the Hub • 23 items • Updated 8 days ago • 10

upvoted a paper 17 days ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26 • 110

upvoted 2 collections 17 days ago

Multimodal Dataset

86 items • Updated 5 days ago • 7

MiniCPM4

MiniCPM4: Ultra-Efficient LLMs on End Devices • 29 items • Updated Sep 8 • 82

upvoted a paper 27 days ago

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27 • 81

upvoted a collection 27 days ago

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 5 days ago • 46

upvoted 2 collections 28 days ago

Synthetic Data and Self-Improvement

113 items • Updated Sep 26 • 9

Reasoning, Thinking, RL and Test-Time Scaling

261 items • Updated Nov 22 • 14

upvoted a collection about 1 month ago

Papers

650 items • Updated 8 days ago • 15

upvoted a paper about 1 month ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 143

upvoted an article about 2 months ago

The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix

Article • Nov 3 • 53

upvoted a paper 2 months ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14 • 118

upvoted a collection 2 months ago

Ferret

A framework for training LLM agents via RL with advanced search capability: https://github.com/Tree-Shu-Zhao/ferret • 7 items • Updated Oct 21 • 1

upvoted 2 papers 3 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16 • 51

ARE: Scaling Up Agent Environments and Evaluations

Paper • 2509.17158 • Published Sep 21 • 35

upvoted a collection 3 months ago

ZeroSearch_Policy_Google_V2

6 items • Updated Sep 7 • 5

upvoted a paper 4 months ago

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28 • 89

upvoted an article 4 months ago

nanoVLM: The simplest repository to train your VLM in pure PyTorch

Article • May 21 • 245

upvoted a paper 4 months ago

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2 • 124