Jeff

JiayuJeff

AI & ML interests

None yet

Recent Activity

authored a paper 16 days ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

upvoted a paper 17 days ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

commented on a paper 17 days ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

Organizations

None yet

authored a paper 16 days ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

Paper • 2510.24505 • Published 29 days ago • 3

upvoted a paper 17 days ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

Paper • 2510.24505 • Published 29 days ago • 3

commented on a paper 17 days ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

Paper • 2510.24505 • Published 29 days ago • 3

commented on a paper 20 days ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published 22 days ago • 20

authored 2 papers 21 days ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published 22 days ago • 20

Mathematical Proof as a Litmus Test: Revealing Failure Modes of Advanced Large Reasoning Models

Paper • 2506.17114 • Published Jun 20

upvoted a paper 21 days ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published 22 days ago • 20

upvoted a paper 2 months ago

UserRL: Training Interactive User-Centric Agent via Reinforcement Learning

Paper • 2509.19736 • Published Sep 24 • 11

liked a model 2 months ago

emrecanacikgoz/Qwen2.5-7B-Instruct-ToolRL-grpo-cold

8B • Updated Apr 22 • 18 • 3

upvoted a paper 4 months ago

UserBench: An Interactive Gym Environment for User-Centric Agents

Paper • 2507.22034 • Published Jul 29 • 29

New activity in osunlp/Mind2Web 4 months ago

Dataset Viewer issue: TooBigContentError

#7 opened 7 months ago by coung21

authored a paper 4 months ago

Diversity-Enhanced Reasoning for Subjective Questions

Paper • 2507.20187 • Published Jul 27 • 25

upvoted a paper 4 months ago

Diversity-Enhanced Reasoning for Subjective Questions

Paper • 2507.20187 • Published Jul 27 • 25
