DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning • Paper • 2511.22570 • Published 6 days ago • 48
BhashaBench V1: A Comprehensive Benchmark for the Quadrant of Indic Domains • Paper • 2510.25409 • Published Oct 29 • 3
ColorAgent: Building A Robust, Personalized, and Interactive OS Agent • Paper • 2510.19386 • Published Oct 22 • 8
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation • Paper • 2510.09116 • Published Oct 10 • 95
LongCodeZip: Compress Long Context for Code Language Models • Paper • 2510.00446 • Published Oct 1 • 107
MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors • Paper • 2409.15273 • Published Sep 23, 2024 • 13