Mixing Mechanisms: How Language Models Retrieve Bound Entities In-Context Paper • 2510.06182 • Published Oct 7 • 8
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning Paper • 2510.04081 • Published Oct 5 • 22
Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval and Synthesis for SLMs Paper • 2509.24107 • Published Sep 28 • 77
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning Paper • 2510.06217 • Published Oct 7 • 63
Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization Paper • 2510.05342 • Published Oct 6 • 5