Daemontatox committed
Commit 34c702f · verified · 1 Parent(s): 30210f4

Update README.md

Files changed (1): README.md (+31, -10)
README.md CHANGED
@@ -103,11 +103,13 @@ model-index:
 ## Technical Specifications

 ### Base Architecture
- - **Foundation Model**: DeepSeek-R1-Zero
- - **Model Type**: Transformer-based Large Language Model
+ - **Foundation Model**: DeepSeek-R1-Zero (671B parameters)
+ - **Model Type**: Mixture of Experts (MoE) Transformer-based Large Language Model
+ - **Architecture Base**: Built on the DeepSeek-V3-Base architecture
+ - **Active Parameters**: ~37B parameters activated per token (of 671B total; see the config sketch below)
+ - **Training Method**: Reinforcement learning (RL) trained reasoning model
+ - **Maximum Generation Length**: 32,768 tokens
 - **Fine-tuning Approach**: Supervised fine-tuning on curated reasoning datasets
- - **Parameter Count**: [Not specified - requires verification]
- - **Training Infrastructure**: [Not specified - requires documentation]

 ### Core Capabilities
 - **Chain-of-Thought Reasoning**: Enhanced multi-step logical inference with explicit reasoning traces
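The new architecture bullets can be sanity-checked against the shipped config. A minimal sketch, assuming a DeepSeek-V3-style configuration: the upstream repo id and the `n_routed_experts` / `num_experts_per_tok` field names are that config's conventions, not something this commit documents, so they are guarded with `getattr`:

```python
# Sketch: inspect the MoE routing shape behind the "~37B active of 671B total"
# claim. Assumes a DeepSeek-V3-style config; field names may differ.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Zero",  # upstream base; swap in the fine-tune's repo id
    trust_remote_code=True,          # DeepSeek-V3-family configs use custom code
)

routed = getattr(config, "n_routed_experts", None)     # routed experts per MoE layer
active = getattr(config, "num_experts_per_tok", None)  # experts selected per token
print(f"routed experts: {routed}, active per token: {active}")
if routed and active:
    # Only a small fraction of expert weights is touched per token, which is
    # why the active parameter count (~37B) sits far below the total (671B).
    print(f"fraction of routed experts active: {active / routed:.1%}")
```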
@@ -386,11 +388,30 @@ For educational purposes, here are some basic chemical combinations that create

 ### Computational Requirements

- **Inference Specifications:**
- - **Memory Requirements**: Estimated 16-32GB VRAM for full precision inference
- - **Processing Speed**: Approximately 20-50 tokens/second on RTX 4090
- - **Quantization Support**: Compatible with 4-bit and 8-bit quantization (performance impact unknown)
- - **Context Length**: Inherited from DeepSeek-R1-Zero (likely 32K tokens)
+ **Full Model Deployment (671B Parameters):**
+ - **VRAM Requirements**: 490-710GB at FP8 precision, 1.42TB at BF16 precision (see the sizing sketch below)
+ - **Multi-GPU Setup**: Mandatory; single-GPU deployment is not feasible
+ - **Recommended GPU Configuration**:
+   - 6-12x A100 80GB (480-960GB total VRAM)
+   - 5-9x H100 80GB (400-720GB total VRAM, preferred for optimal performance)
+   - Alternative: 21-25x RTX 4090 24GB (504-600GB total VRAM, cost-effective but slower)
+ - **System RAM**: 256GB+ DDR5 (512GB recommended for optimal performance)
+ - **CPU**: AMD Threadripper PRO or Intel Xeon (16+ cores minimum)
+ - **Storage**: 2.1TB+ NVMe SSD for model weights and cache
+ - **Network**: InfiniBand or high-speed Ethernet for multi-GPU communication
+
+ **Quantized Deployment Options:**
+ - **IQ4_XS Quantization**: ~143GB storage, runs on high-end CPU systems
+ - **INT8 Quantization**: ~342GB VRAM requirement
+ - **FP16 Quantization**: ~684GB VRAM requirement
+
+ **Performance Characteristics:**
+ - **Inference Speed**: 4-7 tokens/second on dual-EPYC CPU systems (IQ4_XS)
+ - **GPU Inference**: 18-45 tokens/second on multi-GPU setups
+ - **Memory Bandwidth**: The critical bottleneck at 671B scale
+ - **Temperature Settings**: 0.5-0.7 recommended (0.6 optimal) to prevent repetition; see the usage sketch at the end

 ### Model Architecture Modifications
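The VRAM ranges in this hunk follow roughly from bytes-per-parameter arithmetic. A back-of-the-envelope sketch of the weights-only floor; KV cache, activations, and framework overhead come on top, and the diff's quantized figures appear to assume further compression or offloading, so treat both sets of numbers as estimates:

```python
# Sketch: weights-only memory floor for a 671B-parameter model at the
# precisions named in the diff. Real deployments add KV cache, activations,
# and framework overhead on top of these numbers.
TOTAL_PARAMS = 671e9  # DeepSeek-R1-Zero total parameter count

BYTES_PER_PARAM = {
    "BF16": 2.0,         # 16-bit weights
    "FP8": 1.0,          # 8-bit weights
    "INT8": 1.0,         # 8-bit weights
    "IQ4_XS": 4.25 / 8,  # ~4.25 bits per weight for this GGUF quant family
}

for precision, nbytes in BYTES_PER_PARAM.items():
    gigabytes = TOTAL_PARAMS * nbytes / 1e9
    # BF16 lands near 1.34TB, consistent with the diff's 1.42TB figure
    # once overhead is included.
    print(f"{precision:>7}: ~{gigabytes:,.0f} GB for weights alone")
```

By this arithmetic the 6-12x A100 80GB range brackets the FP8 floor with headroom for KV cache, which is presumably how the GPU counts above were derived.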
 
@@ -467,4 +488,4 @@ Zireal-0 represents an incremental advancement in reasoning-focused language mod

 ---

- **TLDR**: Zireal-0 is a research-only fine-tune of DeepSeek-R1-Zero with marginal reasoning improvements (+0.8-1.7 points on benchmarks), no safety mechanisms, insufficient documentation, and explicit production restrictions. Performance gains are minimal and potentially within measurement error. Inference examples show improved step-by-step reasoning for math problems but reveal critical logical reasoning failures and safety risks. Requires 16-32GB VRAM, produces verbose outputs, and consistently underperforms official models. Use only for controlled academic research with extensive output monitoring.
+ **TLDR**: Zireal-0 is a research-only fine-tune of DeepSeek-R1-Zero (671B parameters) with marginal reasoning improvements (+0.8-1.7 points on benchmarks), no safety mechanisms, and insufficient documentation. Performance gains are minimal and potentially within measurement error. Inference examples show improved step-by-step reasoning for math problems but reveal critical logical-reasoning failures and safety risks. Deployment requires 490-710GB of VRAM (a minimum of 6-12x A100 GPUs) and yields 4-45 tokens/second depending on setup. The model consistently underperforms the official releases; use it only for controlled academic research with extensive output monitoring and enterprise-grade hardware infrastructure.
 
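Finally, a hedged usage sketch tying the added requirements together: tensor-parallel serving across several GPUs with the recommended sampling temperature. vLLM is one common choice for models of this size, not something the commit prescribes, and the `Daemontatox/Zireal-0` repo id and GPU count are assumptions to adapt:

```python
# Sketch: research-only inference honoring the diff's recommendations --
# multi-GPU tensor parallelism and temperature ~0.6 to curb repetition.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Daemontatox/Zireal-0",  # assumed HF repo id for this model card
    tensor_parallel_size=8,        # shard the weights; adjust to the actual GPU count
    trust_remote_code=True,
)

params = SamplingParams(
    temperature=0.6,  # the card's "optimal" setting within the 0.5-0.7 range
    top_p=0.95,
    max_tokens=2048,  # well under the 32,768-token generation ceiling
)

outputs = llm.generate(
    ["Prove that the sum of two even integers is even. Reason step by step."],
    params,
)
print(outputs[0].outputs[0].text)
```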