This is a MXFP4 quant of Qwen3-Next-80B-A3B-Thinking

Welcome to the bleeding edge.

I must point out that this is an experimental release.

Say it after me, EXPERIMENTAL.

This has been made possible because of the excellent work done by pwilkin and others.

He has a development branch of llama.cpp for Qwen3-Next.

It has not yet been released officially, and things are moving quite fast.

For the time being (as of 2025-10-24), I got the source code from his fork and compiled it in order to generate the GGUFs, from here: https://github.com/pwilkin/llama.cpp/tree/qwen3_next
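If you want to reproduce that build, a rough sketch (assuming a standard CMake setup; the exact flags may differ on your system) looks like this:

```shell
# Sketch, untested here: fetch pwilkin's qwen3_next branch and build llama.cpp.
git clone --branch qwen3_next https://github.com/pwilkin/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON   # drop the Vulkan flag for a CPU-only build
cmake --build build --config Release
```

The resulting binaries end up under `build/bin/`.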

This GGUF will run only with that build.
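Once built, running the model is the usual llama.cpp invocation; the filename below is a placeholder, so point `-m` at whichever GGUF file you actually downloaded:

```shell
# Hypothetical filename; adjust the path to your downloaded GGUF.
./build/bin/llama-cli -m Qwen3-Next-80B-A3B-Thinking-MXFP4_MOE.gguf -p "Hello" -n 64
```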

If you cannot compile it yourself, I have made a Windows version with Vulkan support you can find here:

llama-qwen3-next-5edfe78-bin-win-vulkan-x64.zip

I should state that this may trigger false positives from your AV. It has NO virus; I compiled it on my Windows 11 PC, which I check regularly for viruses.

If you don't trust strangers giving out binaries, you can compile it yourself to be sure.

https://www.virustotal.com/gui/file/35a134a8977488ff6b82ce3f2b5df20da742ec212859a5e0c30813c55519f4f0

When support for Qwen3-Next officially lands in mainline llama.cpp, I will check whether these files need a new quantization, and update them if needed.
