This model was converted to GGUF format from [`Qwen/QwQ-32B`](https://huggingface.co/Qwen/QwQ-32B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/Qwen/QwQ-32B) for more details on the model.
---

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, achieves significantly enhanced performance on downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, capable of competitive performance against state-of-the-art reasoning models such as DeepSeek-R1 and o1-mini.

This repo contains the QwQ 32B model, which has the following features:

- Type: Causal Language Models
- Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
- Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- Number of Parameters: 32.5B
- Number of Parameters (Non-Embedding): 31.0B
- Number of Layers: 64
- Number of Attention Heads (GQA): 40 for Q and 8 for KV
- Context Length: Full 131,072 tokens

---
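As a back-of-envelope check on the numbers above, the following sketch estimates the fp16 KV-cache footprint at the full context length. The head dimension of 128 is an assumption (typical for Qwen2-family 32B models) and is not stated in the card:

```python
# Rough KV-cache size estimate from the figures in the model card.
# head_dim = 128 is an ASSUMPTION; the card does not state it.
n_layers = 64        # Number of Layers
n_kv_heads = 8       # KV heads under GQA (40 Q heads share 8 KV heads)
head_dim = 128       # assumed per-head dimension
bytes_per_elem = 2   # fp16/bf16 cache

def kv_cache_bytes(n_tokens: int) -> int:
    # K and V each store n_layers * n_kv_heads * head_dim values per token
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens

full_ctx = 131_072
print(kv_cache_bytes(full_ctx) / 2**30)  # GiB at the full 131,072-token context
```

Under these assumptions the full-context cache alone is around 32 GiB in fp16, which is why quantized KV caches or shorter contexts are common when running a model of this size locally.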
## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux):
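A minimal sketch of the install and a sample CLI invocation follows. The repo id and quantized filename below are placeholders, not the actual names in this repository; substitute the values shown on this repo's file listing:

```shell
# Install llama.cpp via Homebrew (macOS and Linux)
brew install llama.cpp

# Run the CLI directly against a GGUF file hosted on the Hugging Face Hub.
# <user>/<repo> and <file>.gguf are PLACEHOLDERS for this repo's actual
# id and quantization filename.
llama-cli --hf-repo <user>/<repo> --hf-file <file>.gguf \
  -p "The meaning to life and the universe is"
```

`--hf-repo`/`--hf-file` make llama.cpp download and cache the model on first use, so no manual download step is needed.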