---
license: apache-2.0
base_model: Qwen3-32B-Instruct
tags:
- transformers
- zen
- text-generation
- thinking-mode
- zoo-gym
- hanzo-ai
language:
- en
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: Zen-Next
  results:
  - task:
      type: text-generation
    dataset:
      name: MMLU
      type: MMLU
    metrics:
    - type: accuracy
      value: 0.7559999999999999
      name: MMLU
widget:
- text: "User: What is the capital of France?\n\nAssistant:"
---

# Zen-Next (80B)

Part of the [Zen AI Model Family](https://huggingface.co/zenlm)

## Model Description

- **Parameters**: 80B
- **Base Model**: Qwen3-32B
- **Specialization**: Complex reasoning & extended context
- **Training**: Flagship training with constitutional AI
- **Context**: 32K-128K tokens
- **Thinking**: Up to 1,000,000 tokens

## Files in This Repository

This repository contains ALL formats and quantizations:

### SafeTensors (Original)
- `model.safetensors` - Full precision weights
- `config.json` - Model configuration
- `tokenizer.json` - Fast tokenizer

### GGUF Quantized
- `zen-next-80b-instruct-Q4_K_M.gguf` - 4-bit (recommended)
- `zen-next-80b-instruct-Q5_K_M.gguf` - 5-bit (balanced)
- `zen-next-80b-instruct-Q8_0.gguf` - 8-bit (high quality)

### MLX (Apple Silicon)
- `mlx-4bit/` - 4-bit quantized for M-series
- `mlx-8bit/` - 8-bit quantized for M-series

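As a rough guide for choosing among the files listed above, the weight footprint can be estimated as parameter count times bits per weight. The sketch below assumes the nominal bit widths from the file names (4, 5, 8) and 16 bits for the full-precision weights (an assumption; this card does not state the dtype). Real GGUF K-quants store extra scale data and inference also needs memory for the KV cache, so actual requirements are somewhat higher. The helper name `weight_footprint_gb` is hypothetical.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 80e9  # 80B parameters, as stated in the model description

# Nominal bit widths taken from the file names; real files are somewhat larger.
# The 16-bit row is an assumption about the full-precision dtype.
for label, bits in [("Q4_K_M", 4), ("Q5_K_M", 5), ("Q8_0", 8), ("16-bit (assumed)", 16)]:
    print(f"{label}: ~{weight_footprint_gb(N_PARAMS, bits):.0f} GB")
```

Treat these figures as lower bounds when sizing hardware.
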
## Performance

| Benchmark | Score | Rank |
|-----------|-------|------|
| MMLU | 75.6% | Top 10% |
| GSM8K | 82.1% | Top 15% |
| HumanEval | 61.7% | Top 20% |

## Quick Start

### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("zenlm/zen-next-80b-instruct")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-next-80b-instruct")

# With thinking mode
messages = [{"role": "user", "content": "Your question here"}]
# tokenize=False returns the prompt as a string rather than token IDs;
# add_generation_prompt=True appends the assistant turn marker.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
```

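With thinking mode enabled, the decoded output typically contains a reasoning segment followed by the final answer. The sketch below separates the two, assuming the model wraps its reasoning in `<think>...</think>` tags as Qwen3-family chat templates do. This card does not confirm the tag format, and `split_thinking` is a hypothetical helper, not part of `transformers`.

```python
def split_thinking(decoded: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split decoded output into (reasoning, answer).

    Assumes the reasoning ends at the first `close_tag`. If the tag is
    absent, the whole text is treated as the answer.
    """
    head, sep, tail = decoded.partition(close_tag)
    if not sep:
        return "", decoded.strip()
    # Drop the opening tag (assumed to be "<think>") if present.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Illustrative string only; not real model output.
sample = "<think>Paris is the capital of France.</think>\n\nThe capital is Paris."
reasoning, answer = split_thinking(sample)
print(reasoning)
print(answer)
```

In practice, `decoded` would come from `tokenizer.decode(...)` applied to the newly generated tokens.
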
### GGUF with llama.cpp
```bash
# Recent llama.cpp builds name the CLI binary llama-cli; older builds call it main.
./llama-cli -m zen-next-80b-instruct-Q4_K_M.gguf -p "Your prompt" -n 512
```

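To run with a larger context window or to offload layers to a GPU, the command line can be assembled programmatically. This is a minimal sketch assuming a recent llama.cpp build where the CLI binary is `llama-cli` (older builds call it `main`). `-c` (context size) and `-ngl` (GPU layers) are standard llama.cpp options, while `build_llama_cmd` and its default values are hypothetical.

```python
import shlex


def build_llama_cmd(model_path, prompt, n_predict=512, ctx_size=32768,
                    gpu_layers=0, binary="./llama-cli"):
    """Return the argv list for a llama.cpp CLI run (does not execute it)."""
    cmd = [binary, "-m", model_path, "-p", prompt,
           "-n", str(n_predict), "-c", str(ctx_size)]
    if gpu_layers:
        # Only add the offload flag when GPU layers are requested.
        cmd += ["-ngl", str(gpu_layers)]
    return cmd


cmd = build_llama_cmd("zen-next-80b-instruct-Q4_K_M.gguf", "Your prompt")
# Print a shell-quoted form; pass `cmd` to subprocess.run(cmd, check=True) to execute.
print(shlex.join(cmd))
```

Building the argument list rather than a single string avoids shell-quoting problems when the prompt contains spaces or quotes.
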
### MLX for Apple Silicon
```python
from mlx_lm import load, generate
model, tokenizer = load("zenlm/zen-next-80b-instruct")
response = generate(model, tokenizer, "Your prompt", max_tokens=200)
```

## Unique Training Background

Flagship training with constitutional AI.

This model was specifically optimized for complex reasoning and extended context, with careful attention to:
- Inference efficiency
- Memory footprint
- Quality preservation
- Thinking capabilities

---

Part of the Zen Family • [Collection](https://huggingface.co/collections/zenlm/zen) • [GitHub](https://github.com/zenlm/zen)