Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ language:
 
 **Aramis-2B-BitNet** *(2.41B params / Context Length: Maximum sequence length of 4096 tokens)*
 A compact, agent-oriented small language model focused on contextual reasoning, language understanding and multi-turn instruction following.
-Built with an iterative post-training recipe: bilingual DPO (FR+EN) + model merging of FR-centric and EN-centric variants.
+Built with an iterative post-training recipe: bilingual DPO (FR+EN) + model merging of FR-centric and EN-centric variants.
 Runs natively as BitNet 1.58-bit (ternary) and is available in GGUF 1.58-bit, lossless to the BF16 checkpoint.
 
 **Why BitNet (and why this model)**
@@ -69,7 +69,7 @@ All scores are reported in comparison with the original [microsoft/bitnet-b1.58-
 | mmlu 5 shot acc | 52.96 | **53.39** |
 | commonsense_qa 10 shot acc | **71.17** | 70.76 |
 
-**ARC-Challenge:** 51.62
+**ARC-Challenge (zero-shot):** 51.62 — first-ever ≥50 reported for a 2B-class model (>1.5B, <2.5B) *based on publicly available results*.
 
 | Model | arc_challenge (0 shot) |
 |----------------------------------------------------|------------------------|