What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT Paper • 2509.19284 • Published Sep 23 • 22
Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet Paper • 2509.06861 • Published Sep 8 • 8
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 137
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights Paper • 2509.22944 • Published Sep 26 • 78
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • 3.52k
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published Oct 10 • 48