Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents Paper • 2509.09265 • Published Sep 11 • 45
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks Paper • 2503.09572 • Published Mar 12 • 2
Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let's Take TravelPlanner as an Example Paper • 2408.06318 • Published Aug 12, 2024 • 1
OdysseyBench: Evaluating LLM Agents on Long-Horizon Complex Office Application Workflows Paper • 2508.09124 • Published Aug 12 • 3
Speech Evals Collection Synthesized speech evals generated by MistralAI from popular text evaluation datasets to evaluate spoken-language reasoning capabilities of Audio LLMs • 3 items • Updated Jul 18 • 9
Open ASR Leaderboard 🏆 Display and request speech recognition model benchmarks • Running on CPU Upgrade • 1.12k
Article 5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub By fdaudens and 1 other • Jul 15 • 24
Article Introducing ColQwen-Omni: Retrieve in every modality By manu and 4 others • Jul 17 • 75
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9 • 699
Article Qwen2-VL-OCR-2B-Instruct and VisionOCR-3B-061125 for precise recognition of [messy] handwriting. By prithivMLmods • Jun 17 • 11