MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published 19 days ago • 156
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published 19 days ago • 156
MiroThinker-v1.0 Collection Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling • 7 items • Updated 16 days ago • 40
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2 • 83
Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance Paper • 2502.12459 • Published Feb 18 • 2
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design Paper • 2506.04734 • Published Jun 5 • 20
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design Paper • 2506.04734 • Published Jun 5 • 20
view reply How is packing implemented in your code? Have you tried using a 4D attention mask to avoid the overlap between samples that you mentioned?