Rosswill

Kutches

AI & ML interests

Recent Activity

liked a model about 16 hours ago

jayn7/WAN2.2-I2V_A14B-DISTILL-LIGHTX2V-4STEP-GGUF

liked a model about 17 hours ago

baidu/ERNIE-Image-Turbo

liked a model about 17 hours ago

baidu/ERNIE-Image

Organizations

None yet

upvoted a paper 8 days ago

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published 9 days ago • 107

upvoted a collection 9 days ago

Gemma 4 Uncensored

Collection

Abliterated Gemma 4 models with refusal behavior removed. Biprojection + EGA for MoE. Cross-validated against 686 prompts from 4 datasets. • 8 items • Updated 10 days ago • 21

upvoted 2 papers 13 days ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 26 days ago • 337

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Paper • 2603.25730 • Published 20 days ago • 52

upvoted a paper 29 days ago

Attention Residuals

Paper • 2603.15031 • Published 30 days ago • 180

upvoted 2 papers 30 days ago

From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space

Paper • 2603.12648 • Published Mar 13 • 14

Can Vision-Language Models Solve the Shell Game?

Paper • 2603.08436 • Published Mar 9 • 39

upvoted a collection about 1 month ago

Qwen3.5 Unredacted MAX

Collection

Continual “abliteration” models – experimental. • 8 items • Updated 28 days ago • 4

upvoted 2 papers about 1 month ago

Utonia: Toward One Encoder for All Point Clouds

Paper • 2603.03283 • Published Mar 3 • 185

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published Feb 26 • 152

upvoted 3 papers about 2 months ago

Unified Latents (UL): How to train your latents

Paper • 2602.17270 • Published Feb 19 • 60

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning

Paper • 2602.13515 • Published Feb 13 • 44

SLA2: Sparse-Linear Attention with Learnable Routing and QAT

Paper • 2602.12675 • Published Feb 13 • 58

upvoted an article about 2 months ago

Article

Qwen3.5: Nobody Agrees on Attention Anymore

Feb 17

upvoted 2 papers 2 months ago

GENIUS: Generative Fluid Intelligence Evaluation Suite

Paper • 2602.11144 • Published Feb 11 • 55

Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis

Paper • 2602.03139 • Published Feb 3 • 44

upvoted an article 2 months ago

Article

Community Evals: Because we're done trusting black-box leaderboards over the community

Feb 4

upvoted a paper 2 months ago

FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space

Paper • 2602.02092 • Published Feb 2 • 18

upvoted a collection 3 months ago

Qwen3-TTS

Collection

7 items • Updated Jan 22 • 345

upvoted a paper 3 months ago

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 194
