This is a MXFP4 quant of Qwen3-Next-80B-A3B-Thinking
Welcome to the bleeding edge.
I must point out that this is an experimental release.
Say it after me: EXPERIMENTAL.
This has been made possible thanks to the excellent work done by pwilkin and others.
He maintains a development branch of llama.cpp with Qwen3-Next support.
It has not yet been released officially, and things are moving quite fast.
For the time being (as of 2025-10-24), I took the source code from his fork and compiled it in order to generate the GGUFs: https://github.com/pwilkin/llama.cpp/tree/qwen3_next
This GGUF will only run with that build.
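For those building it themselves, a minimal sketch of fetching and compiling the fork looks roughly like the following. This assumes git, CMake, a C++ toolchain, and (for the Vulkan backend) the Vulkan SDK are installed; the branch name comes from the URL above, while the exact CMake flags and output paths are assumptions that may differ on your system or as the branch evolves.

```bash
# Clone only the qwen3_next branch of pwilkin's fork (branch name from the link above)
git clone --branch qwen3_next --depth 1 https://github.com/pwilkin/llama.cpp
cd llama.cpp

# Configure with the Vulkan backend enabled, matching the provided Windows binary.
# Drop -DGGML_VULKAN=ON for a CPU-only build.
cmake -B build -DGGML_VULKAN=ON

# Build in Release mode, using all available cores
cmake --build build --config Release -j

# Run the quant with the resulting binary (GGUF filename is illustrative)
./build/bin/llama-cli -m Qwen3-Next-80B-A3B-Thinking-MXFP4_MOE.gguf -p "Hello"
```

The `--depth 1` shallow clone just speeds up the download; a full clone works the same way.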
If you cannot compile it yourself, I have made a Windows version with Vulkan support you can find here:
llama-qwen3-next-5edfe78-bin-win-vulkan-x64.zip
Note that this binary may trigger false positives from your AV. It contains NO virus; I compiled it on my Windows 11 PC, which I scan regularly for malware.
If you don't trust binaries from strangers, you can compile it yourself to be sure.
https://www.virustotal.com/gui/file/35a134a8977488ff6b82ce3f2b5df20da742ec212859a5e0c30813c55519f4f0
When support for Qwen3-Next officially lands in mainline llama.cpp, I will check whether these files need an updated quantization, and re-upload if needed.