leaderboards - a MoritzLaurer Collection

leaderboards

updated Apr 2

Running

4.65k

LMArena Leaderboard

🏆

Display LMArena Leaderboard
Running on CPU Upgrade

13.6k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade

6.59k

MTEB Leaderboard

🥇

Embedding Leaderboard
Running on CPU Upgrade

1.11k

Open ASR Leaderboard

🏆

Display and request speech recognition model benchmarks
Running

569

LLM-Perf Leaderboard

🏆

Explore hardware performance for LLMs
Running

1.45k

Big Code Models Leaderboard

📈

Submit code models for evaluation and view leaderboard
Runtime error

78

Human & GPT-4 Evaluation of LLMs Leaderboard

👩
Running

449

Can Ai Code Results

🏆

Can AI Code? An LLM leaderboard including quantized models.
Runtime error

144

Hallucinations Leaderboard

🔥

View and submit LLM evaluations
Runtime error

105

Enterprise Scenarios Leaderboard

🥇
Running on CPU Upgrade

94

LLM Safety Leaderboard

🥇

Explore and submit LLM benchmarks
Running

557

Vision Arena (Testing VLMs side-by-side)

🖼

Display image analysis results
Running

70

CyberSecEvalTest

📈

Evaluate LLMs' cybersecurity risks and capabilities
Running

403

LLM Performance Leaderboard

🐨

Display LLM performance leaderboards
Runtime error

73

AIR-Bench Leaderboard

🥇

Explore and compare QA and long doc benchmarks
Running on CPU Upgrade

920

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection
Running

402

Reward Bench Leaderboard

📐

Display and analyze reward model evaluation results
Running

225

BigCodeBench Leaderboard

🥇

Explore and analyze code completion benchmarks
Runtime error

10

MJ Bench Leaderboard

🥇

Display and filter multimodal model leaderboard results
Running

115

MTEB Arena

⚔

Display MTEB Arena interface
Runtime error

151

Open LLM Progress Tracker

🔬

Visualize Open vs. Proprietary LLM Progress
Running

108

Judge Arena

💻

Vote on AI responses to rank models
Running

444

TTS Spaces Arena

🤗

Blind vote on HF TTS models!
Running

139

smolagents LLM leaderboard

🏆

A leaderboard for LLMs powering smolagents