Zhengyang Tang

tangzhy

Recent Activity

authored a paper about 7 hours ago

GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?

Paper • 2606.17861 • Published 2 days ago • 36

upvoted a paper about 15 hours ago

GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?

Paper • 2606.17861 • Published 2 days ago • 36

authored a paper 1 day ago

PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions

Paper • 2606.14832 • Published 6 days ago • 10

upvoted a paper 1 day ago

PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions

Paper • 2606.14832 • Published 6 days ago • 10

upvoted a paper 17 days ago

PhoneWorld: Scaling Phone-Use Agent Environments

Paper • 2605.29486 • Published 21 days ago • 11

authored a paper 19 days ago

PhoneWorld: Scaling Phone-Use Agent Environments

Paper • 2605.29486 • Published 21 days ago • 11

submitted a paper to Daily Papers 20 days ago

PhoneWorld: Scaling Phone-Use Agent Environments

Paper • 2605.29486 • Published 21 days ago • 11

authored a paper about 1 month ago

Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents

Paper • 2605.07630 • Published May 8 • 1

submitted a paper to Daily Papers about 1 month ago

Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents

Paper • 2605.07630 • Published May 8 • 1

authored a paper about 1 month ago

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

Paper • 2604.28139 • Published Apr 30 • 42

upvoted a paper about 2 months ago

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

Paper • 2604.28139 • Published Apr 30 • 42

authored a paper about 2 months ago

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

Paper • 2604.16029 • Published Apr 17 • 23

upvoted a paper about 2 months ago

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

Paper • 2604.16029 • Published Apr 17 • 23

upvoted a paper 2 months ago

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Paper • 2604.10866 • Published Apr 13 • 67

authored a paper 2 months ago

Do Phone-Use Agents Respect Your Privacy?

Paper • 2604.00986 • Published Apr 1 • 9

upvoted a paper 3 months ago

Do Phone-Use Agents Respect Your Privacy?

Paper • 2604.00986 • Published Apr 1 • 9

submitted a paper to Daily Papers 3 months ago

Do Phone-Use Agents Respect Your Privacy?

Paper • 2604.00986 • Published Apr 1 • 9

upvoted a paper 4 months ago

OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Paper • 2602.05400 • Published Feb 5 • 356

authored 2 papers 4 months ago

Teaching Language Models to Reason with Tools

Paper • 2510.20342 • Published Oct 23, 2025

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published Feb 2 • 273
