TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 9 days ago • 107
Gemma 4 Uncensored Collection Abliterated Gemma 4 models with refusal behavior removed. Biprojection + EGA for MoE. Cross-validated against 686 prompts from 4 datasets. • 8 items • Updated 10 days ago • 21
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 26 days ago • 337
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published 20 days ago • 52
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space Paper • 2603.12648 • Published Mar 13 • 14
Qwen3.5 Unredacted MAX Collection Continual “abliteration” models – experimental. • 8 items • Updated 28 days ago • 4
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published Feb 13 • 44
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published Feb 13 • 58
Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis Paper • 2602.03139 • Published Feb 3 • 44
view article Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 Feb 4 • 89
FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space Paper • 2602.02092 • Published Feb 2 • 18