This model was converted to GGUF format from [`Qwen/QwQ-32B`](https://huggingface.co/Qwen/QwQ-32B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/Qwen/QwQ-32B) for more details on the model.
---

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, achieves significantly enhanced performance on downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, capable of competitive performance against state-of-the-art reasoning models such as DeepSeek-R1 and o1-mini.

This repo contains the QwQ 32B model, which has the following features:

- Type: Causal Language Models
- Training Stage: Pretraining & Post-training (Supervised Finetuning and Reinforcement Learning)
- Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- Number of Parameters: 32.5B
- Number of Parameters (Non-Embedding): 31.0B
- Number of Layers: 64
- Number of Attention Heads (GQA): 40 for Q and 8 for KV
- Context Length: Full 131,072 tokens

---
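As a back-of-envelope check on the numbers above, the following sketch estimates the fp16 KV-cache footprint at the full context length. The head dimension of 128 is an assumption (typical for Qwen2-family 32B models) and is not stated in the card:

```python
# Rough KV-cache size estimate from the figures in the model card.
# head_dim = 128 is an ASSUMPTION; the card does not state it.
n_layers = 64        # Number of Layers
n_kv_heads = 8       # KV heads under GQA (40 Q heads share 8 KV heads)
head_dim = 128       # assumed per-head dimension
bytes_per_elem = 2   # fp16/bf16 cache

def kv_cache_bytes(n_tokens: int) -> int:
    # K and V each store n_layers * n_kv_heads * head_dim values per token
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens

full_ctx = 131_072
print(kv_cache_bytes(full_ctx) / 2**30)  # GiB at the full 131,072-token context
```

Under these assumptions the full-context cache alone is around 32 GiB in fp16, which is why quantized KV caches or shorter contexts are common when running a model of this size locally.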
## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux):
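A minimal sketch of the install and a sample CLI invocation follows. The repo id and quantized filename below are placeholders, not the actual names in this repository; substitute the values shown on this repo's file listing:

```shell
# Install llama.cpp via Homebrew (macOS and Linux)
brew install llama.cpp

# Run the CLI directly against a GGUF file hosted on the Hugging Face Hub.
# <user>/<repo> and <file>.gguf are PLACEHOLDERS for this repo's actual
# id and quantization filename.
llama-cli --hf-repo <user>/<repo> --hf-file <file>.gguf \
  -p "The meaning to life and the universe is"
```

`--hf-repo`/`--hf-file` make llama.cpp download and cache the model on first use, so no manual download step is needed.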