MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19 • 53
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification Paper • 2509.15591 • Published Sep 19 • 45
When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance Paper • 2509.22193 • Published Sep 26 • 37
PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning Paper • 2509.19894 • Published Sep 24 • 32
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization Paper • 2509.13313 • Published Sep 16 • 77
Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18 • 83
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 240