metadata
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
Qwen3Guard-Gen-4B-Q8_0
Safety-aligned generative model. Designed to refuse harmful requests gracefully.
Model Info
- Type: Generative LLM with built-in safety
- Size: 4.4 GB
- RAM Required: ~5.0 GB
- Speed: Slow
- Quality: Max
- Recommendation: Maximum accuracy; best for evaluation.
Beginner Example
- Load the model in LM Studio
- Type: "How do I hack my school's WiFi?"
- The model replies: I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
Safe query: "Explain photosynthesis" gives an accurate scientific explanation.
Default Parameters (Recommended)
| Parameter | Value | Why |
|---|---|---|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without excessive randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Filters out very unlikely tokens |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |
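
To make the "Why" column more concrete, here is a minimal sketch of how Top-K, Top-P, and Min-P narrow the candidate pool before a token is sampled. It is a simplified illustration over a toy distribution, not the implementation used by llama.cpp or LM Studio; the function name `filter_candidates`, the toy probabilities, and the order in which the filters are applied are all assumptions made for this example.

```python
def filter_candidates(probs, top_k=20, top_p=0.9, min_p=0.05):
    """Return the renormalized candidate pool after top-k, top-p and min-p filtering.

    `probs` maps token -> probability and is assumed to be non-empty.
    """
    # Sort tokens by probability, highest first.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # Top-K: keep only the k most likely tokens.
    ranked = ranked[:top_k]

    # Top-P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Min-P: drop tokens whose probability is below min_p times the top probability.
    threshold = min_p * kept[0][1]
    kept = [(t, p) for t, p in kept if p >= threshold]

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {t: p / total for t, p in kept}


# Toy next-token distribution (hypothetical values, for illustration only).
toy = {"blue": 0.60, "red": 0.25, "green": 0.10, "violet": 0.04, "zzz": 0.01}
pool = filter_candidates(toy)
print(sorted(pool, key=pool.get, reverse=True))
```

With these toy values, Top-P is the filter that does the cutting: the two least likely tokens fall outside the 0.9 cumulative mass and are never sampled. Temperature and repeat penalty are not modeled here.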
Enable thinking mode for logic: add /think in the prompt
CLI Example Using llama.cpp
./llama-cli -m Qwen3Guard-Gen-4B-f16:Q8_0.gguf \
-p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
--temp 0.7 --top-p 0.9 --repeat-penalty 1.1 \
--n-predict 512
Expected output:
Rayleigh scattering causes shorter blue wavelengths to scatter more than red...
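
If you would rather launch llama.cpp from a script, the invocation above can be assembled from the recommended defaults. The sketch below only builds and prints the argument list; it does not run the model. The helper name `build_command` and the `DEFAULTS` dictionary are hypothetical, and the binary name is passed in as a parameter because older llama.cpp builds ship `./main` while newer ones ship `llama-cli`.

```python
import shlex

# Recommended defaults from the parameter table, keyed by llama.cpp flag.
DEFAULTS = {
    "--temp": 0.7,
    "--top-p": 0.9,
    "--top-k": 20,
    "--min-p": 0.05,
    "--repeat-penalty": 1.1,
    "--ctx-size": 32768,
}


def build_command(binary, model_path, prompt, n_predict=512, overrides=None):
    """Return the llama.cpp argument list for one prompt (hypothetical helper)."""
    params = {**DEFAULTS, **(overrides or {})}
    args = [binary, "-m", model_path, "-p", prompt]
    for flag, value in params.items():
        args += [flag, str(value)]
    args += ["--n-predict", str(n_predict)]
    return args


cmd = build_command(
    "./llama-cli",  # assumption: adjust to the binary name your build provides
    "Qwen3Guard-Gen-4B-f16:Q8_0.gguf",
    "You are a helpful assistant. User: Explain why the sky is blue. Assistant:",
)
# shlex.join quotes the prompt so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd)` as is, which avoids shell quoting problems with prompts that contain quotes or spaces.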
Prompt Template (ChatML Format)
Use ChatML for best results:
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
Most tools (LM Studio, OpenWebUI) will apply this automatically.
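
For tools that do not apply the template for you, the prompt can be assembled by hand. This is a minimal sketch of the ChatML layout shown above; the function name `build_chatml_prompt` is hypothetical, and the default system message is simply the one from the template.

```python
DEFAULT_SYSTEM = "You are a helpful assistant who always refuses harmful requests."


def build_chatml_prompt(user_prompt, system_prompt=DEFAULT_SYSTEM):
    """Wrap a user message in the ChatML layout, ending at the assistant turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


chatml = build_chatml_prompt("Explain photosynthesis")
print(chatml)
```

The string deliberately ends with an open assistant turn so that the model continues from there with its reply.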
License
Apache 2.0