Attention Is All You Need for KV Cache in Diffusion LLMs Paper • 2510.14973 • Published Oct 16, 2025 • 40
∇NABLA: Neighborhood Adaptive Block-Level Attention Paper • 2507.13546 • Published Jul 17, 2025 • 124
Transition Matching: Scalable and Flexible Generative Modeling Paper • 2506.23589 • Published Jun 30, 2025 • 1
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation Paper • 2506.20639 • Published Jun 25, 2025 • 30
Pre-trained Large Language Models Learn Hidden Markov Models In-context Paper • 2506.07298 • Published Jun 8, 2025 • 26
Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models Paper • 2506.06751 • Published Jun 7, 2025 • 71
Seedance 1.0: Exploring the Boundaries of Video Generation Models Paper • 2506.09113 • Published Jun 10, 2025 • 105
DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion Paper • 2506.14202 • Published Jun 17, 2025 • 2
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression Paper • 2506.09482 • Published Jun 11, 2025 • 45
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17, 2025 • 45
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models Paper • 2505.23656 • Published May 29, 2025 • 25