Article Activation Steering: A New Frontier in AI Control - But Does It Scale? Feb 2, 2025 • 4
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7, 2025 • 266
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models Paper • 2505.00551 • Published May 1, 2025 • 36
LLMs for Engineering: Teaching Models to Design High Powered Rockets Paper • 2504.19394 • Published Apr 27, 2025 • 13
AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization Paper • 2504.21659 • Published Apr 30, 2025 • 14
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers Paper • 2504.20752 • Published Apr 29, 2025 • 92
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents Paper • 2501.11858 • Published Jan 21, 2025 • 7
RL + Transformer = A General-Purpose Problem Solver Paper • 2501.14176 • Published Jan 24, 2025 • 28
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9, 2025 • 95
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published Jan 7, 2025 • 27
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models Paper • 2412.01822 • Published Dec 2, 2024 • 15
DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving Paper • 2411.15139 • Published Nov 22, 2024 • 15
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129