Haoze Wu
WaitHZ
AI & ML interests
Modular DL, Complex Reasoning
Recent Activity
authored a paper about 9 hours ago
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
upvoted a paper about 21 hours ago
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
liked a dataset 2 days ago
hkust-nlp/Toolathlon-Trajectories