
DeyangKong

AI & ML interests

Natural Language Processing

Recent Activity

upvoted a paper 9 days ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published 10 days ago • 60

upvoted a paper 14 days ago

DEER: Draft with Diffusion, Verify with Autoregressive Models

Paper • 2512.15176 • Published 16 days ago • 41

liked 2 models about 1 month ago

sentence-transformers/all-MiniLM-L6-v2

OpenMOSS-Team/DiRL-8B-Instruct

Text Generation • 8B • Updated 2 days ago • 48 • 10

liked a dataset about 2 months ago

microsoft/rStar-Coder

Viewer • Updated Jul 20, 2025 • 1.86M • 4.86k • 220

liked a model 2 months ago

Buchilaguo/ATF-8B

8B • Updated Oct 21, 2025 • 54 • 1

liked a model 4 months ago

meituan-longcat/LongCat-Flash-Chat

Text Generation • 562B • Updated Sep 24, 2025 • 21.8k • 514

upvoted 2 papers 7 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

Skywork Open Reasoner 1 Technical Report

Paper • 2505.22312 • Published May 28, 2025 • 54

authored a paper 7 months ago

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Paper • 2505.17652 • Published May 23, 2025 • 6

upvoted a paper 7 months ago

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Paper • 2505.17652 • Published May 23, 2025 • 6

commented a paper 7 months ago

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

Paper • 2505.17652 • Published May 23, 2025 • 6

upvoted a paper 9 months ago

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Paper • 2504.05520 • Published Apr 7, 2025 • 11

liked a dataset 9 months ago

lime-nlp/DeepScaleR_Difficulty

Viewer • Updated Apr 10, 2025 • 5.06M • 173 • 9

liked a model 9 months ago

agentica-org/DeepCoder-14B-Preview

Text Generation • 15B • Updated May 11, 2025 • 1.78k • 681

upvoted a paper 9 months ago

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Paper • 2503.18892 • Published Mar 24, 2025 • 31

upvoted 3 papers 10 months ago

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published Mar 20, 2025 • 77

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 144

SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity

Paper • 2503.01506 • Published Mar 3, 2025 • 10

commented a paper 10 months ago

SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity

Paper • 2503.01506 • Published Mar 3, 2025 • 10
