Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30 • 12
HUME: Measuring the Human-Model Performance Gap in Text Embedding Task Paper • 2510.10062 • Published Oct 11 • 8
FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents Paper • 2510.03204 • Published Oct 3 • 6
LineRetriever: Planning-Aware Observation Reduction for Web Agents Paper • 2507.00210 • Published Jun 30 • 6
MIEB: The Benchmark That Stress-Tests Image-Text Embeddings Like Never Before Article • Apr 24 • 16
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Article • Feb 7 • 262
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 252
Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents Paper • 2406.04028 • Published Jun 6, 2024 • 2
Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7, 2024 • 50
GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer Paper • 2311.08526 • Published Nov 14, 2023 • 12