Flash Sparse Attention: An Alternative Efficient Implementation of Native Sparse Attention Kernel Paper • 2508.18224 • Published Aug 25 • 1
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published 14 days ago • 66
moonshotai/Kimi-Linear-48B-A3B-Instruct Text Generation • 49B • Updated 1 day ago • 332k • 487
inclusionAI/LLaDA2.0-flash-preview Text Generation • 103B • Updated about 22 hours ago • 1.03k • 67
gghfez/GLM-4.6-control-vectors Text Generation • 466k • Updated about 1 month ago • 1.53k • 6