Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models Paper • 2503.16257 • Published Mar 20, 2025 • 28
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache Paper • 2402.02750 • Published Feb 5, 2024 • 5
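As context for the KIVI entry above, here is a minimal NumPy sketch of tuning-free asymmetric 2-bit quantization applied to a cache tensor. The grouping axes (per-channel for keys, per-token for values) and the helper names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def quantize_2bit_asymmetric(x, axis):
    # Asymmetric 2-bit quantization along `axis`: each group is mapped
    # to integers in [0, 3] using a per-group zero point (the group min)
    # and a per-group scale (the group range divided by 3).
    mn = x.min(axis=axis, keepdims=True)
    mx = x.max(axis=axis, keepdims=True)
    scale = (mx - mn) / 3.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero
    q = np.clip(np.round((x - mn) / scale), 0, 3).astype(np.uint8)
    return q, scale, mn

def dequantize_2bit(q, scale, mn):
    # Reconstruct an approximation of the original tensor.
    return q.astype(np.float32) * scale + mn

# Toy key cache: (num_tokens, num_channels). KIVI-style per-channel
# grouping quantizes along the token axis (axis=0); per-token grouping
# for values would use axis=1. Both are assumptions for illustration.
keys = np.random.randn(16, 8).astype(np.float32)
q, scale, mn = quantize_2bit_asymmetric(keys, axis=0)
approx = dequantize_2bit(q, scale, mn)
```

With round-to-nearest, the reconstruction error of each element is bounded by half of its group's scale, which is why the quality loss stays controlled even at 2 bits.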
Token Warping Helps MLLMs Look from Nearby Viewpoints Paper • 2604.02870 • Published 10 days ago • 33
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10, 2024 • 28
Nemotron Speech Collection Open, state-of-the-art, production-ready enterprise speech models from the NVIDIA Speech research team for ASR, TTS, Speaker Diarization and S2S • 11 items • Updated 6 days ago • 46
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular • Dec 18, 2025 • 124
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 29 items • Updated Aug 14, 2025 • 32
VideoChat-R1 Collection VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning • 4 items • Updated Sep 28, 2025 • 9
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 15 items • Updated Mar 12 • 218
DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models Paper • 2504.02882 • Published Apr 2, 2025 • 7
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 207