Update README.md
README.md (CHANGED)
@@ -16,6 +16,8 @@ This is a merge of pre-trained language models created using [mergekit](https://

# First benchmarks

+**Interpretation:** Significant gains on language understanding & pragmatic reasoning (ARC-C/E, Wino, BoolQ, HellaSwag, TriviaQA) with stability on other skills. Math/code are not the optimization target; GSM8K stays essentially stable relative to the BitNet 1.58-bit baseline.
+
| Benchmark (metric)   | microsoft/bitnet-b1.58-2B-4T-bf16 | bitnet-dpo-merged-modelstock7 |
|----------------------|-----------------------------------|-------------------------------|
| arc_challenge 0 shot | 47.95                             | **51.62**                     |
@@ -41,6 +43,7 @@ This is a merge of pre-trained language models created using [mergekit](https://
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B      | 34.9      |
| openbmb/MiniCPM-2B-dpo-bf16                    | 44.28     |
| microsoft/bitnet-b1.58-2B-4T-bf16 (base model) | 47.95     |
+| microsoft/bitnet-b1.58-2B-4T                  | 49.91     |
| jpacifico/bitnet-dpo-merged-modelstock7        | **51.62** |
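The card does not state which evaluation tool produced these scores. As a minimal reproduction sketch, assuming the zero-shot `arc_challenge` task was run with EleutherAI's lm-evaluation-harness (an assumption, not confirmed by the commit), the ARC-Challenge number could be checked roughly like this:

```python
# Hedged reproduction sketch: assumes EleutherAI's lm-evaluation-harness
# (lm_eval >= 0.4). The actual evaluation setup behind the table above is
# not stated on the card, so treat this as illustrative only.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=jpacifico/bitnet-dpo-merged-modelstock7,dtype=bfloat16",
    tasks=["arc_challenge"],  # benchmark named in the table above
    num_fewshot=0,            # "0 shot", as in the table
)
print(results["results"]["arc_challenge"])
```

The harness reports both `acc` and `acc_norm` for `arc_challenge`; the card does not say which of the two corresponds to the 47.95 / 51.62 figures.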