---
tags:
- gguf
- llama.cpp
license: apache-2.0
datasets:
- Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507
language:
- en
- zh
base_model:
- Qwen/Qwen3-4B-Instruct-2507
---

# GPT-5-Distill-Qwen3-4B-Instruct-2507

## 1. Model Overview

**Model Type**: Instruction-tuned conversational LLM. Supports both LoRA adapters and fully fine-tuned weights for inference.

- **Base Model**: `Qwen/Qwen3-4B-Instruct-2507`
- **Parameters**: 4B
- **Training Method**:
  - Supervised Fine-Tuning (SFT) on ShareGPT data
  - Knowledge distillation from LMSYS GPT-5 responses
- **Supported Languages**: Chinese, English, and mixed-language inputs/outputs
- **Max Context Length**: up to **32K tokens** (`max_seq_length = 32768`)

This model is trained on ShareGPT-Qwen3 instruction datasets and distilled toward the conversational style and quality of GPT-5. It aims to deliver high-quality, natural-sounding dialogue at low computational cost, making it well suited to lightweight applications that still need responsive, capable chat. Illustrative usage and training sketches are collected in Section 6.

---

## 2. Intended Use Cases

### ✅ Recommended
- Casual chat in Chinese and English
- General knowledge explanations and reasoning guidance
- Code suggestions and simple debugging tips
- Writing assistance: editing, summarizing, rewriting
- Role-playing conversations (with well-designed prompts)

### ⚠️ Not Suitable For
- High-risk decision-making:
  - Medical diagnosis, mental health support
  - Legal advice, financial investment recommendations
- Real-time factual tasks (e.g., news, stock updates)
- Authoritative judgment on sensitive topics

> **Note**: Outputs are for reference only and should not be used as the sole basis for critical decisions.

---

## 3. Training Data & Distillation Process

### Key Datasets

#### (1) ds1: ShareGPT-Qwen3 Instruction Dataset
- Source: `Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507`
- Purpose:
  - Provides diverse instruction-response pairs
  - Supports multi-turn dialogues and context awareness
- Processing:
  - Cleaned for quality and relevance
  - Standardized into `instruction`, `input`, `output` format (see the formatting sketch in Section 6)

#### (2) ds2: LMSYS GPT-5 Teacher Response Data
- Source: `ytz20/LMSYS-Chat-GPT-5-Chat-Response`
- Filtering:
  - Only samples with `flaw == "normal"` were kept (see the filtering sketch in Section 6)
  - Hallucinated and inconsistent responses were removed
- Purpose:
  - Serves as the distillation target for conversational quality
  - Enhances clarity, coherence, and fluency

### Training Flow
1. Prepare a unified chat-formatted dataset
2. Fine-tune the base Qwen3-4B-Instruct-2507 via SFT
3. Conduct knowledge distillation using GPT-5's filtered ("normal") responses as teacher outputs
4. Balance style imitation with semantic fidelity to ensure robustness

> ⚖️ **Note**: This work is based on publicly available, non-sensitive datasets and uses them responsibly under fair-use principles.

---

## 4. Key Features Summary

| Feature | Description |
|---------|-------------|
| **Lightweight** | ~4B-parameter model: fast inference, low resource usage |
| **Distillation-Style Responses** | Mimics GPT-5's conversational fluency and helpfulness |
| **Highly Conversational** | Well suited to chatbot-style interactions with rich dialogue flow |
| **Multilingual Ready** | Seamless support for Chinese and English |

---

## 5. Acknowledgements

We thank:
- The LMSYS team for sharing GPT-5 response data
- Jackrong for the ShareGPT-Qwen3 dataset
- The Qwen team for releasing `Qwen3-4B-Instruct-2507`

This project is an open research effort aimed at making high-quality conversational AI accessible with smaller models.

---
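## 6. Usage & Training Sketches

The snippets below are illustrative sketches, not released code from this project. Any repo ID, file name, column name, or hyperparameter not stated elsewhere on this card is an assumption and should be replaced with the real value.

### Inference with 🤗 Transformers

A minimal chat-inference sketch. The model ID `Jackrong/GPT-5-Distill-Qwen3-4B-Instruct-2507` is inferred from the card title and may differ from the actual repository ID. The commented PEFT lines show how a LoRA-adapter release could be loaded instead of fully merged weights (the adapter ID is hypothetical).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo ID (taken from the card title); replace with the actual one.
model_id = "Jackrong/GPT-5-Distill-Qwen3-4B-Instruct-2507"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# To run a LoRA-adapter variant instead of merged weights (hypothetical ID):
# from peft import PeftModel
# model = PeftModel.from_pretrained(model, "Jackrong/GPT-5-Distill-Qwen3-4B-lora")

messages = [{"role": "user", "content": "用中文简单解释一下知识蒸馏。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```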
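### Inference with a GGUF Build (llama.cpp)

The card's tags mention `gguf` and `llama.cpp`. A sketch using the `llama-cpp-python` bindings, assuming a quantized GGUF file has been published; the file name below is hypothetical.

```python
from llama_cpp import Llama

# Hypothetical GGUF file name; use whichever quantization is actually published.
llm = Llama(
    model_path="GPT-5-Distill-Qwen3-4B-Instruct-2507-Q4_K_M.gguf",
    n_ctx=32768,  # the card's stated maximum context length
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me two tips for writing clear emails."}],
    max_tokens=256,
)
print(resp["choices"][0]["message"]["content"])
```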
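### Filtering the Teacher Responses (ds2)

Section 3 states that only samples with `flaw == "normal"` were kept from `ytz20/LMSYS-Chat-GPT-5-Chat-Response`. A sketch of that filter with 🤗 `datasets`; the split name and exact column layout are assumptions, so check the dataset card.

```python
from datasets import load_dataset

# Split name "train" is an assumption; verify against the dataset card.
ds2 = load_dataset("ytz20/LMSYS-Chat-GPT-5-Chat-Response", split="train")

# Keep only the teacher responses the card describes as usable ("normal").
ds2_clean = ds2.filter(lambda ex: ex["flaw"] == "normal")
print(f"kept {len(ds2_clean)} / {len(ds2)} samples")
```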
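### Standardized Record Format (ds1)

Section 3 says ds1 was standardized into `instruction`, `input`, `output` fields and later merged into a unified chat-formatted dataset. A sketch of that conversion; only the three field names come from the card, everything else is illustrative.

```python
def to_chat(example: dict) -> dict:
    """Convert one instruction/input/output record into chat messages."""
    user_turn = example["instruction"]
    if example.get("input"):  # optional extra context for the instruction
        user_turn += "\n\n" + example["input"]
    return {
        "messages": [
            {"role": "user", "content": user_turn},
            {"role": "assistant", "content": example["output"]},
        ]
    }

# Example record in the card's standardized format (contents invented):
record = {
    "instruction": "Summarize the following paragraph.",
    "input": "Knowledge distillation transfers behavior from a large model...",
    "output": "It trains a small model to mimic a larger one.",
}
print(to_chat(record))
```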
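### SFT / Sequence-Level Distillation Sketch

Steps 2 and 3 of the training flow amount to supervised fine-tuning on the unified dataset, with GPT-5's filtered responses serving as hard-label teacher targets (sequence-level distillation). A minimal TRL sketch under those assumptions: the data path and all hyperparameters are placeholders, `max_seq_length` follows the card's stated value, and argument names vary across TRL versions.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Unified chat-formatted data (ds1 + filtered ds2); the path is hypothetical.
train_ds = load_dataset("json", data_files="unified_chat_data.jsonl", split="train")

config = SFTConfig(
    output_dir="gpt5-distill-qwen3-4b",
    max_seq_length=32768,  # matches the card; newer TRL versions call this `max_length`
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,    # placeholder hyperparameters throughout
    learning_rate=2e-5,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-4B-Instruct-2507",  # base model named on this card
    args=config,
    train_dataset=train_ds,               # expects a "messages" column
)
trainer.train()
```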