---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
- peft
- lora
- value-stream-mapping
- manufacturing
- qwen2.5
library_name: peft
---

# VSM-LLM-3B-Fast

LoRA adapter for Qwen2.5-3B-Instruct, fine-tuned for Value Stream Mapping (VSM) generation.

## Model Details

- **Base Model**: Qwen/Qwen2.5-3B-Instruct
- **Training Method**: LoRA (r=8, alpha=16)
- **Task**: Generate structured JSON for VSM plots
- **Training Data**: 20 synthetic VSM examples
- **Trainable Parameters**: 14.9M (0.48% of base model)

## Usage

### With Transformers + PEFT

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "nishit1945/VSM-LLM-3B-Fast")

# Build the chat prompt and generate
messages = [
    {"role": "system", "content": "You are a VSM expert."},
    {"role": "user", "content": "Generate VSM JSON..."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(
    inputs, max_new_tokens=400, do_sample=True, temperature=0.1
)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

### HuggingFace Inference API

```bash
curl https://router.huggingface.co/hf-inference/models/nishit1945/VSM-LLM-3B-Fast \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "VSM prompt...", "parameters": {"max_new_tokens": 400, "temperature": 0.1}}'
```

## Training Details

- Epochs: 3
- Batch Size: 1 (gradient accumulation: 16)
- Learning Rate: 2e-4
- Optimizer: paged_adamw_8bit
- Quantization: 4-bit (training only)

A sketch reconstructing this configuration is given in the appendix below.

## Performance

- **Inference Time**: 2-4 seconds on CPU
- **Output Format**: Structured JSON with processes, flows, metrics, and coordinates

## License

Apache 2.0
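
## Appendix: Training Configuration Sketch

The card lists hyperparameters but not the full training script. Below is a minimal QLoRA sketch that matches those numbers, assuming `trl`'s `SFTTrainer`; the target modules, LoRA dropout, and the dataset file `vsm_train.jsonl` are illustrative assumptions, not taken from this repository.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# 4-bit quantization, used during training only (the published adapter is not quantized)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA settings from the card: r=8, alpha=16
# (dropout and target_modules are assumptions)
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,                                        # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

# Hyperparameters from the Training Details section
training_args = SFTConfig(
    output_dir="vsm-llm-3b-fast",
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
)

# Hypothetical dataset file with chat-formatted VSM examples
dataset = load_dataset("json", data_files="vsm_train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```

With a per-device batch size of 1 and 16 gradient-accumulation steps, the effective batch size is 16, which matches the schedule described above.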