User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale Paper • 2601.08225 • Published 5 days ago • 46
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale Paper • 2601.08225 • Published 5 days ago • 46
Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training Paper • 2509.25758 • Published Sep 30, 2025 • 22
Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training Paper • 2509.25758 • Published Sep 30, 2025 • 22 • 2
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs Paper • 2506.14429 • Published Jun 17, 2025 • 44
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13, 2025 • 29
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster Paper • 2503.09662 • Published Mar 12, 2025 • 33
CompAct: Compressing Retrieved Documents Actively for Question Answering Paper • 2407.09014 • Published Jul 12, 2024 • 1
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval Paper • 2503.04644 • Published Mar 6, 2025 • 21
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published Feb 20, 2025 • 26