Reasoning with Sampling: Your Base Model is Smarter Than You Think • Paper • 2510.14901 • Published 13 days ago • 41
AION-1: Omnimodal Foundation Model for Astronomical Sciences • Paper • 2510.17960 • Published 9 days ago • 27
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling • Paper • 2510.15346 • Published 13 days ago • 32
Robust Layerwise Scaling Rules by Proper Weight Decay Tuning • Paper • 2510.15262 • Published 13 days ago • 4
Large Language Models Do NOT Really Know What They Don't Know • Paper • 2510.09033 • Published 20 days ago • 16
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling • Paper • 2510.11602 • Published 16 days ago • 14
StatEval: A Comprehensive Benchmark for Large Language Models in Statistics • Paper • 2510.09517 • Published 19 days ago • 6
Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval and Synthesis for SLMs • Paper • 2509.24107 • Published Sep 28 • 76