yueliu1999

https://yueliu1999.github.io/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted 2 papers about 1 month ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28 • 170

MAPO: Mixed Advantage Policy Optimization

Paper • 2509.18849 • Published Sep 23 • 26

upvoted a paper 2 months ago

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19 • 118

upvoted a paper 3 months ago

Pixels, Patterns, but No Poetry: To See The World like Humans

Paper • 2507.16863 • Published Jul 21 • 68

upvoted a paper 4 months ago

Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning

Paper • 2502.11962 • Published Feb 17 • 38

upvoted 8 papers 5 months ago

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Paper • 2506.08989 • Published Jun 10 • 14

SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Paper • 2506.02096 • Published Jun 2 • 52

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Paper • 2505.22651 • Published May 28 • 50

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published May 28 • 29

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27 • 26

MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

Paper • 2505.19955 • Published May 26 • 12

Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published May 26 • 23

Backdoor Cleaning without External Guidance in MLLM Fine-tuning

Paper • 2505.16916 • Published May 22 • 16

upvoted a paper 6 months ago

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

Paper • 2505.11049 • Published May 16 • 60

upvoted a collection 6 months ago

GuardReasoner-VL

A reasoning-based VLM guard model • 6 items • Updated May 28 • 2

upvoted a paper 6 months ago

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15 • 120

upvoted 4 papers 7 months ago

FlowReasoner: Reinforcing Query-Level Meta-Agents

Paper • 2504.15257 • Published Apr 21 • 47

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

Paper • 2504.13055 • Published Apr 17 • 19

Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute

Paper • 2503.23803 • Published Mar 31 • 8

JudgeLRM: Large Reasoning Models as a Judge

Paper • 2504.00050 • Published Mar 31 • 62