little-jack's Collections
agent
planning
IFT
RLHF
sft
pre-train
some benchmark
Updated Oct 28, 2025
cais/mmlu • Viewer • Updated Mar 8, 2024 • 231k • 552k • 748
TIGER-Lab/MMLU-Pro • Benchmark • Updated 24 days ago • 12.1k • 167k • 476
cais/hle • Benchmark • Updated Jan 20 • 2.5k • 43.6k • 805
m-a-p/SuperGPQA • Viewer • Updated Apr 30, 2025 • 26.5k • 29.3k • 88
lmarena-ai/arena-hard-auto • Updated May 1, 2025 • 1.24k • 7
MT Bench 📊 • Running • Agents • 202 • Compare AI model responses side-by-side