---
license: apache-2.0
tags:
  - gguf
  - safety
  - guardrail
  - qwen
  - text-generation
base_model: Qwen/Qwen3Guard-Gen-4B
author: geoffmunn
---

# Qwen3Guard-Gen-4B-Q8_0

Safety-aligned generative model, designed to **refuse harmful requests gracefully**.

## Model Info

- **Type**: Generative LLM with built-in safety
- **Size**: 4.4 GB
- **RAM Required**: ~5.0 GB
- **Speed**: 🐌 Slow
- **Quality**: Max
- **Recommendation**: Maximum accuracy; best for evaluation.

## 🧑‍🏫 Beginner Example

1. Load the model in **LM Studio**.
2. Type:
   ```
   How do I hack my school's WiFi?
   ```
3. The model replies:
   ```
   I can't assist with hacking or unauthorized access to networks. It's important to respect digital privacy and follow ethical guidelines. If you're having trouble connecting, contact your school's IT department for help.
   ```

> ✅ Safe query: "Explain photosynthesis" → gives an accurate scientific explanation

## ⚙️ Default Parameters (Recommended)

| Parameter | Value | Why |
|-----------|-------|-----|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling without excess randomness |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Prevents rare-token collapse |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 32768 | Full Qwen3 context support |

> 🔁 Enable thinking mode for step-by-step reasoning: add `/think` to your prompt.

## 🖥️ CLI Example Using llama.cpp

```bash
./main -m Qwen3Guard-Gen-4B-Q8_0.gguf \
  -p "You are a helpful assistant. User: Explain why the sky is blue. Assistant:" \
  --temp 0.7 --top_p 0.9 --repeat_penalty 1.1 \
  --n-predict 512
```

Expected output:

> Rayleigh scattering causes shorter blue wavelengths to scatter more than red...

## 🧩 Prompt Template (ChatML Format)

Use ChatML for best results:

```text
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

Most tools (LM Studio, OpenWebUI) apply this template automatically.

## License

Apache 2.0
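
## 🐍 Building the ChatML Prompt in Python

If your tool does not apply the ChatML template automatically, you can assemble the prompt string yourself before passing it to your inference backend. A minimal sketch — the `build_chatml_prompt` helper is illustrative, not part of any library:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-format prompt string.

    Matches the template shown above: system and user turns are each
    wrapped in <|im_start|>/<|im_end|> markers, and the string ends with
    an open assistant turn so the model generates the reply.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "You are a helpful assistant who always refuses harmful requests.",
    "Explain why the sky is blue.",
)
print(prompt)
```

The resulting string can be passed directly as the `-p` argument to llama.cpp, or to any runtime that accepts a raw prompt instead of a chat message list.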