Model Card: lily211/moe-llm-127m
Model Details
Model Description
lily211/moe-llm-127m is a small-scale Mixture of Experts (MoE) language model trained for experimentation and research purposes.
It is based on a decoder-only transformer architecture with sparsely activated experts: a router selects only a subset of expert feed-forward blocks per token, which keeps per-token compute low while still achieving competitive performance for its size.
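To make the routing idea concrete, here is a minimal, illustrative sketch of a sparsely activated MoE feed-forward layer in PyTorch. The hidden sizes, expert count, and top-k value are placeholder assumptions for the sketch and do not reflect the actual configuration of this model.

```python
# Toy sketch of a top-k routed MoE feed-forward layer (illustrative only;
# sizes and expert count are assumptions, not this model's real config).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=256, d_ff=512, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an ordinary two-layer feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (batch, seq, d_model)
        logits = self.router(x)                              # (batch, seq, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)   # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)                 # renormalise over chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest are skipped,
        # which is what keeps compute low relative to the total parameter count.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e               # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(2, 16, 256)).shape)  # torch.Size([2, 16, 256])
```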
- Developed by: lily211
- Model type: Mixture-of-Experts Language Model
- Parameters: ~127M total (shared layers + experts; only a subset of experts is active per token)
- Language(s): English (primary)
- License: MIT (default; please confirm if different)
- Finetuned from: Custom architecture (inspired by GPT-2/decoder-only transformers)
Model Sources
- Repository: https://huggingface.co/lily211/moe-llm-127m (Hugging Face Model Hub)
Uses
Direct Use
- Text generation (English)
- Research in efficient architectures (MoE scaling)
- Educational experiments in training & inference with MoE models
Downstream Use
- Fine-tuning on domain-specific datasets (e.g., instruction-tuning, Q&A, dialogue); a minimal sketch follows after this list
- Distillation into smaller dense models
- Experimentation with MoE routing strategies
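As a starting point for the fine-tuning use case above, here is a hedged sketch using the Hugging Face `Trainer` API for causal language modeling. The dataset file name, sequence length, and hyperparameters are placeholder assumptions; adapt them to your domain and hardware.

```python
# Minimal causal-LM fine-tuning sketch (dataset path and hyperparameters are placeholders).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "lily211/moe-llm-127m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# GPT-2-style tokenizers often have no pad token; reuse EOS so batching works.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Placeholder corpus; swap in your own domain-specific text file or dataset.
dataset = load_dataset("text", data_files={"train": "my_domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM objective

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="moe-llm-127m-finetuned",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```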
Out-of-Scope Use
- Production-ready deployment in safety-critical applications
- Factual knowledge retrieval or reasoning-intensive tasks
- Use in sensitive domains without additional fine-tuning, evaluation, and safety checks
Bias, Risks, and Limitations
- Trained primarily on open web-style text; may reflect biases and stereotypes.
- Limited knowledge scope compared to larger LLMs.
- May generate hallucinations, incoherent responses, or unsafe content.
Recommendations
- Always have a human review outputs before downstream use.
- Do not rely on this model for factual accuracy without verification.
- For safety-sensitive domains, prefer larger, audited LLMs.
How to Get Started with the Model
Example usage with the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lily211/moe-llm-127m"

# Load the tokenizer and model weights from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a prompt and generate a continuation
inputs = tokenizer("The bluebells are blooming", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
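For more varied outputs, standard generation arguments can be passed to `generate()`, for example `model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8, top_p=0.95)`.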