Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs • Paper • 2510.13795 • Published 13 days ago • 49
Vibe Checker: Aligning Code Evaluation with Human Preference • Paper • 2510.07315 • Published 20 days ago • 30
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction • Paper • 2508.11987 • Published Aug 16 • 69