LMArena Leaderboard
Display LMArena Leaderboard
Track, rank and evaluate open LLMs and chatbots
Embedding Leaderboard
Display and request speech recognition model benchmarks
Explore hardware performance for LLMs
Submit code models for evaluation and view leaderboard
Can AI Code? An LLM leaderboard including quantized models.
View and submit LLM evaluations
Explore and submit LLM benchmarks
Display image analysis results
Evaluate LLMs' cybersecurity risks and capabilities
Display LLM performance leaderboards
Explore and compare QA and long doc benchmarks
VLMEvalKit Evaluation Results Collection
Display and analyze reward model evaluation results
Explore and analyze code completion benchmarks
Display and filter multimodal model leaderboard results
Display MTEB Arena interface
Visualize Open vs. Proprietary LLM Progress
Vote on AI responses to rank models
Blind vote on HF TTS models!
A leaderboard for LLMs powering smolagents