xiang huang

xianghuang

AI & ML interests

None yet

Recent Activity

published a dataset 7 days ago

xianghuang/Sysbank

upvoted a paper about 2 months ago

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

upvoted a paper 2 months ago

Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Organizations

None yet

upvoted a paper about 2 months ago

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

Paper • 2505.23923 • Published May 29 • 8

upvoted 19 papers 2 months ago

Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Paper • 2505.17813 • Published May 23 • 57

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

Paper • 2505.11896 • Published May 17 • 58

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published Apr 30 • 59

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20 • 62

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published May 7 • 65

SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Paper • 2505.19641 • Published May 26 • 67

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 80

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published May 12 • 82

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 90

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Paper • 2505.15277 • Published May 21 • 104

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 131

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 186

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 316

ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs

Paper • 2506.15211 • Published Jun 18 • 37

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

Paper • 2506.18945 • Published Jun 23 • 40
