Wenhao Huang's picture

1 24

Wenhao Huang

StephenHuang

·

AI & ML interests

None yet

Organizations

upvoted 3 papers 3 months ago

FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction

Paper • 2508.11987 • Published Aug 16 • 70

WideSearch: Benchmarking Agentic Broad Info-Seeking

Paper • 2508.07999 • Published Aug 11 • 109

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Paper • 2507.23726 • Published Jul 31 • 113

upvoted 2 papers 4 months ago

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published Jul 8 • 43

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 92

upvoted 5 papers 8 months ago

FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

Paper • 2503.13265 • Published Mar 17 • 15

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published Mar 11 • 70

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

Paper • 2502.19361 • Published Feb 26 • 28

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models

Paper • 2502.16614 • Published Feb 23 • 27

Audio-FLAN: A Preliminary Release

Paper • 2502.16584 • Published Feb 23 • 36

upvoted a paper 9 months ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published Feb 20 • 104

upvoted 9 papers about 1 year ago

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

Paper • 2410.20424 • Published Oct 27, 2024 • 40

LLaSM: Large Language and Speech Model

Paper • 2308.15930 • Published Aug 30, 2023 • 34

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

Paper • 2401.11944 • Published Jan 22, 2024 • 27

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25, 2024 • 60

Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 65

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

Paper • 2405.19327 • Published May 29, 2024 • 48

ING-VP: MLLMs cannot Play Easy Vision-based Games Yet

Paper • 2410.06555 • Published Oct 9, 2024 • 8

Aria: An Open Multimodal Native Mixture-of-Experts Model

Paper • 2410.05993 • Published Oct 8, 2024 • 111

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26, 2024 • 53