	---
	license: apache-2.0
	base_model: Qwen/Qwen3-32B
	tags:
	- transformers
	- zen
	- text-generation
	- thinking-mode
	- zoo-gym
	- hanzo-ai
	language:
	- en
	pipeline_tag: text-generation
	library_name: transformers
	model-index:
	- name: Zen-Next
	  results:
	  - task:
	      type: text-generation
	    dataset:
	      name: MMLU
	      type: MMLU
	    metrics:
	    - type: accuracy
	      value: 0.756
	      name: MMLU
	widget:
	- text: "User: What is the capital of France?\n\nAssistant:"
	---

	# Zen-Next (80B)

	Part of the [Zen AI Model Family](https://huggingface.co/zenlm)

	## Model Description

	- Parameters: 80B
	- Base Model: Qwen3-32B
	- Specialization: Complex reasoning and extended context
	- Training: Flagship training with constitutional AI
	- Context: 32K-128K tokens
	- Thinking: Up to 1,000,000 tokens

	## Files in This Repository

	This repository contains the following formats and quantizations:

	### 🔷 SafeTensors (Original)
	- `model.safetensors` - Full precision weights
	- `config.json` - Model configuration
	- `tokenizer.json` - Fast tokenizer

	### 🟢 GGUF Quantized
	- `zen-next-80b-instruct-Q4_K_M.gguf` - 4-bit (recommended)
	- `zen-next-80b-instruct-Q5_K_M.gguf` - 5-bit (balanced)
	- `zen-next-80b-instruct-Q8_0.gguf` - 8-bit (high quality)

	### 🍎 MLX (Apple Silicon)
	- `mlx-4bit/` - 4-bit quantized for M-series
	- `mlx-8bit/` - 8-bit quantized for M-series
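
As a rough guide to choosing between these files, weight storage scales with parameter count times bits per weight. The sketch below uses the 80B figure from this card and the nominal bit widths from the file names; the 16-bit row is an assumption about the SafeTensors weights, and the helper name `weight_memory_gb` is illustrative. Real GGUF K-quants store extra scale data, and inference also needs memory for the KV cache and activations, so treat the results as lower bounds.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 80e9  # parameter count stated in this model card

# Nominal bit widths; the 16-bit entry is an assumption, not stated in the card
for label, bits in [("4-bit", 4), ("5-bit", 5), ("8-bit", 8), ("16-bit", 16)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

Under these assumptions the 4-bit files need roughly a quarter of the memory of 16-bit weights, which is why the 4-bit variants are the practical choice on most single machines.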

	## Performance

	| Benchmark | Score | Rank |
	|-----------|-------|------|
	| MMLU | 75.6% | Top 10% |
	| GSM8K | 82.1% | Top 15% |
	| HumanEval | 61.7% | Top 20% |

	## Quick Start

	### Transformers
	```python
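	from transformers import AutoModelForCausalLM, AutoTokenizer

	model = AutoModelForCausalLM.from_pretrained("zenlm/zen-next-80b-instruct")
	tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-next-80b-instruct")

	# With thinking mode: enable_thinking is forwarded to the chat template
	messages = [{"role": "user", "content": "Your question here"}]
	text = tokenizer.apply_chat_template(
	    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
	)

	# Tokenize the rendered prompt, generate, and decode only the new tokens
	inputs = tokenizer(text, return_tensors="pt").to(model.device)
	output_ids = model.generate(**inputs, max_new_tokens=512)
	print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))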
	```
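
With thinking mode enabled, Qwen3-style models typically wrap their reasoning in `<think>` and `</think>` tags ahead of the final answer. This card does not confirm that Zen-Next follows that convention, so the sketch below is an assumption, and `split_thinking` is a hypothetical helper name. It separates the reasoning from the answer in a decoded string.

```python
def split_thinking(generated: str) -> tuple[str, str]:
    """Return (reasoning, answer) from text that may contain a <think> block."""
    open_tag, close_tag = "<think>", "</think>"  # assumed Qwen3-style tags
    head, sep, tail = generated.partition(close_tag)
    if not sep:
        # No closing tag found: treat the whole text as the answer
        return "", generated.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_thinking("<think>2 + 2 = 4</think>\nThe answer is 4.")
print(answer)
```

Text without a closing tag is returned unchanged as the answer, so the helper is safe to call on output produced with thinking disabled.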

	### GGUF with llama.cpp
	```bash
	# Recent llama.cpp builds name the CLI binary llama-cli; older builds use ./main
	./llama-cli -m zen-next-80b-instruct-Q4_K_M.gguf -p "Your prompt" -n 512
	```
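
The llama.cpp command expects the GGUF file to be on disk already. One way to fetch a single quantization instead of cloning the whole repository is `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch: the repository id and file-name pattern are taken from this card, while `gguf_filename` and `download_gguf` are illustrative helper names.

```python
REPO_ID = "zenlm/zen-next-80b-instruct"  # repository id used in the Quick Start


def gguf_filename(quant: str) -> str:
    """Build a GGUF file name following the pattern listed in this card."""
    return f"zen-next-80b-instruct-{quant}.gguf"


def download_gguf(quant: str = "Q4_K_M") -> str:
    """Download one GGUF file and return its local cache path."""
    # Imported lazily so the name helper works without huggingface_hub installed
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))


if __name__ == "__main__":
    print(gguf_filename("Q4_K_M"))
    # path = download_gguf("Q4_K_M")  # uncomment to start the (large) download
```

The returned path can be passed directly to the `-m` option of llama.cpp.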

	### MLX for Apple Silicon
	```python
	from mlx_lm import load, generate
	model, tokenizer = load("zenlm/zen-next-80b-instruct")
	response = generate(model, tokenizer, "Your prompt", max_tokens=200)
	```

	## Unique Training Background

	Zen-Next received flagship training with constitutional AI and was optimized for complex reasoning and extended context, with careful attention to:
	- Inference efficiency
	- Memory footprint
	- Quality preservation
	- Thinking capabilities

	---

	Part of the Zen Family • [Collection](https://huggingface.co/collections/zenlm/zen) • [GitHub](https://github.com/zenlm/zen)