---
license: apache-2.0
tags:
  - gguf
  - safety
  - guardrail
  - qwen
  - text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q8_0

Safety-aligned generative model, designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 4.4 GB
- **RAM Required**: ~5.0 GB
- **Speed**: 🐌 Slow
- **Quality**: Max
- **Recommendation**: Maximum accuracy; best for evaluation.

## 🧑‍🏫 Beginner Example

1. Load the model in **LM Studio**.
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without excess randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare-token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for step-by-step reasoning: add `/think` to your prompt.

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-Q8_0.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) apply this template automatically.

## License

Apache 2.0
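
## 🐍 Building the ChatML Prompt in Python

If your tool does not apply the ChatML template automatically, you can assemble the prompt string yourself before passing it to your inference backend. A minimal sketch — the `build_chatml_prompt` helper is illustrative, not part of any library:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-format prompt string.

    Matches the template shown above: system and user turns are each
    wrapped in <|im_start|>/<|im_end|> markers, and the string ends with
    an open assistant turn so the model generates the reply.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "You are a helpful assistant who always refuses harmful requests.",
    "Explain why the sky is blue.",
)
print(prompt)
```

The resulting string can be passed directly as the `-p` argument to llama.cpp, or to any runtime that accepts a raw prompt instead of a chat message list.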