"Not all quantized models perform well"; serving frameworks: Ollama on an NVIDIA GPU, llama.cpp on CPU with AVX & AMX
v1k
xbruce22
Recent Activity
liked a model 1 day ago: Qwen/Qwen3-VL-8B-Thinking
liked a model 2 days ago: moonshotai/Kimi-Linear-48B-A3B-Base
liked a model 3 days ago: Qwen/Qwen3-VL-8B-Instruct