Attention Is All You Need for KV Cache in Diffusion LLMs • Paper • 2510.14973 • Published 11 days ago • 36
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities • Paper • 2507.06261 • Published Jul 7 • 63 • 4
Transition Matching: Scalable and Flexible Generative Modeling • Paper • 2506.23589 • Published Jun 30 • 1
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation • Paper • 2506.20639 • Published Jun 25 • 31
Pre-trained Large Language Models Learn Hidden Markov Models In-context • Paper • 2506.07298 • Published Jun 8 • 26
Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models • Paper • 2506.06751 • Published Jun 7 • 71
Seedance 1.0: Exploring the Boundaries of Video Generation Models • Paper • 2506.09113 • Published Jun 10 • 102
DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion • Paper • 2506.14202 • Published Jun 17 • 2
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression • Paper • 2506.09482 • Published Jun 11 • 45