# 🧘 Zen Training Space

**Unified Training Platform for All Zen Models**

Train any Zen model with any dataset combination from HuggingFace. Everything runs directly from HF datasets - no local storage needed!
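Streaming is what makes the no-local-storage claim work: the `datasets` library can iterate over a Hub dataset without downloading it first. A minimal sketch, using the public xLAM dataset as an example (any of the datasets listed below works the same way):

```python
from datasets import load_dataset

# streaming=True yields samples straight from the Hub; nothing is
# materialized on local disk
ds = load_dataset(
    "Salesforce/xlam-function-calling-60k",
    split="train",
    streaming=True,
)

for sample in ds.take(3):  # peek at the first few records
    print(sample)
```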
## 🎯 Features

### Supported Models

**Language Models:**
- `zen-nano` (0.6B) - Edge deployment
- `zen-eco` (4B) - Balanced performance
- `zen-omni` (7B) - Multi-task
- `zen-coder` (14B) - Code generation
- `zen-next` (32B) - Frontier performance
**Vision-Language Models:**

- `zen-vl-4b` - Efficient VL with function calling
- `zen-vl-8b` - Enhanced VL capabilities
- `zen-vl-30b` - Maximum VL performance
### Supported Datasets

**Agent Training (ADP):**
- AgentTuning OS/KG/DB (~15k samples)
- Synatra (99k agent trajectories)
- Code Feedback (66k samples)
- Go Browse (27k web interactions)
**Function Calling:**
- xLAM 60k (Salesforce high-quality function calling)
**Instruction Tuning:**
- Alpaca (52k instruction samples)
## 🚀 How to Use

1. **Select Model**: Choose from the language or vision-language models listed above
2. **Select Datasets**: Check multiple datasets to combine them
3. **Configure Training**: Set epochs, batch size, learning rate, and max samples
4. **Set Output Repo**: Specify the HuggingFace repo the trained model is pushed to
5. **Start Training**: Click the button and monitor the logs (a programmatic sketch of the same flow follows)
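For reference, the button-driven flow maps onto a standard `trl` supervised fine-tune. A minimal sketch under assumptions: the model ID, output repo, and the Alpaca mirror used here are illustrative placeholders, not the Space's exact internals:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model_id = "zenlm/zen-eco"  # placeholder: any supported Zen model
train_ds = load_dataset("yahma/alpaca-cleaned", split="train[:1000]")  # small test slice

config = SFTConfig(
    output_dir="zen-eco-sft",
    per_device_train_batch_size=2,
    learning_rate=2e-5,
    num_train_epochs=1,
    push_to_hub=True,                          # step 4: the output repo
    hub_model_id="your-username/zen-eco-sft",  # placeholder repo
)

trainer = SFTTrainer(
    model=model_id,  # trl loads the model from its Hub ID
    args=config,
    train_dataset=train_ds,
)
trainer.train()  # step 5: progress appears in the logs
```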
## ⚙️ Training Configuration

### Recommended Settings
| Model Size | GPU | Batch Size | Max Samples | Est. Time | Est. Cost |
|------------|-------------|------------|----------------|-------------|-----------|
| 4B | A10G (24GB) | 1-2 | 10,000-30,000 | 4-8 hours | ~$3-5 |
| 8B | A100 (40GB) | 2-4 | 30,000-50,000 | 8-12 hours | ~$15-20 |
| 32B | A100 (80GB) | 1-2 | 50,000-100,000 | 20-30 hours | ~$50-80 |
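With per-device batches this small, gradient accumulation is the usual way to keep the effective batch size reasonable without extra memory. A sketch using the underlying `transformers` knobs (these mirror, rather than replicate, the Space's UI settings):

```python
from transformers import TrainingArguments

# effective batch = per_device_train_batch_size * gradient_accumulation_steps
#                 = 2 * 8 = 16 samples per optimizer step
args = TrainingArguments(
    output_dir="zen-eco-agent",
    per_device_train_batch_size=2,  # 4B model on a 24GB A10G
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=2,
    bf16=True,  # mixed precision roughly halves activation memory vs fp32
)
```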
## 📊 Dataset Combinations

**For Agent Training:** ADP Synatra (80%) + xLAM (20%) = strong agent skills + high-quality function calling

**For Code Models:** Code Feedback (70%) + Alpaca (30%) = code expertise + general instruction following

**For VL Models:** ADP (all configs) + xLAM = complete vision-language agent training
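Weighted recipes like the 80/20 agent mix map directly onto `datasets.interleave_datasets`. A sketch, assuming streaming datasets; the Synatra repo ID is a placeholder, and columns must be aligned to a shared schema before interleaving:

```python
from datasets import load_dataset, interleave_datasets

# placeholder repo IDs: substitute the actual Hub paths for your datasets
synatra = load_dataset("your-org/synatra", split="train", streaming=True)
xlam = load_dataset("Salesforce/xlam-function-calling-60k", split="train", streaming=True)

# draw ~80% of samples from Synatra and ~20% from xLAM, reproducibly
mixed = interleave_datasets(
    [synatra, xlam],
    probabilities=[0.8, 0.2],
    seed=42,
)
```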
## 📋 Requirements

- HuggingFace Pro account (for GPU access)
- Write access to the output repository
- `HF_TOKEN` secret set in the Space settings
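Spaces expose secrets as environment variables, so the token is picked up like this (a standard pattern, sketched here for completeness):

```python
import os
from huggingface_hub import login

# authenticate all Hub calls (dataset reads, model pushes) with the
# HF_TOKEN secret configured in the Space settings
login(token=os.environ["HF_TOKEN"])
```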
## 💡 Tips

- **Start Small**: Test with 1,000 samples first
- **Mix Datasets**: Combine complementary datasets for best results
- **Monitor Logs**: Watch for OOM errors and lower the batch size if they appear
- **Save Often**: Lower `save_steps` for longer training runs (see the sketch below)
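The first and last tips reduce to a couple of lines; a self-contained sketch (dataset and directory names are placeholders):

```python
from datasets import load_dataset
from transformers import TrainingArguments

# Start Small: cap a streaming dataset at 1,000 samples for a dry run
ds = load_dataset("Salesforce/xlam-function-calling-60k", split="train", streaming=True)
smoke_test = ds.take(1000)

# Save Often: checkpoint every 250 optimizer steps on long runs
args = TrainingArguments(output_dir="zen-run", save_strategy="steps", save_steps=250)
```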
## 🔗 Resources
- Website: https://zenlm.org
- GitHub: https://github.com/zenlm
- Models: https://huggingface.co/zenlm
- Datasets:
## 📄 License
Apache 2.0
## 📚 Citations

```bibtex
@software{zen-training-2025,
  title={Zen Training: Unified Training Platform for Zen Models},
  author={Zen AI Team},
  year={2025},
  url={https://huggingface.co/spaces/zenlm/zen-training}
}

@article{adp2025,
  title={Agent Data Protocol},
  author={NeuLab},
  journal={arXiv preprint arXiv:2510.24702},
  year={2025}
}

@dataset{xlam2024,
  title={xLAM Function Calling Dataset},
  author={Salesforce Research},
  year={2024}
}
```