Update README.md
README.md
@@ -145,7 +145,7 @@ The following factors can influence MAI-DS-R1's behavior and performance:
 
 - **Harm Mitigation**: MAI-DS-R1 outperforms both R1-1776 and the original R1 model in minimizing harmful content.
 ### Model Architecture and Objective
-- **Model Name**: MAI-DS-R1
+- **Model Name**: MAI-DS-R1
 - **Architecture**: Based on DeepSeek-R1, a transformer-based autoregressive language model utilizing multi-head self-attention and Mixture-of-Experts (MoE) for scalable and efficient inference.
 - **Objective**: Post-trained to reduce CCP-aligned restrictions and enhance harm protection, while preserving the original model’s strong chain-of-thought reasoning and general-purpose language understanding capabilities.
 - **Pre-trained Model Base**: DeepSeek-R1 (671B)
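For context on the model card above, here is a minimal usage sketch under stated assumptions: it presumes the checkpoint is published on the Hugging Face Hub under the id `microsoft/MAI-DS-R1` (an assumption, not confirmed by this commit) and that the standard `transformers` chat API applies. At 671B parameters the MoE model realistically requires a multi-GPU serving stack (e.g. vLLM or SGLang), so treat this as illustrative rather than a supported recipe.

```python
# Minimal sketch, not from the commit above. Assumptions: the checkpoint id
# "microsoft/MAI-DS-R1" exists on the Hugging Face Hub, and the repo follows
# the usual DeepSeek-R1-style layout (custom modeling code, chat template).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/MAI-DS-R1"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # keep the checkpoint's native precision
    device_map="auto",       # shard the MoE weights across available devices
    trust_remote_code=True,  # DeepSeek-R1-style repos ship custom model code
)

# Build a chat prompt with the model's own template and generate a reply.
messages = [{"role": "user", "content": "Briefly explain Mixture-of-Experts."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```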