jpacifico committed on
Commit 3d51135 · verified · 1 Parent(s): f06bcef

Update README.md

Files changed (1): README.md +3 -0

README.md CHANGED
@@ -16,6 +16,8 @@ This is a merge of pre-trained language models created using [mergekit](https://
 
 # First benchmarks
 
+**Interpretation:** Significant gains on language understanding & pragmatic reasoning (ARC-C/E, Wino, BoolQ, HellaSwag, TriviaQA) with stability on other skills. Math/code are not the optimization target; GSM8K stays essentially stable relative to the BitNet 1.58-bit baseline.
+
 | Benchmark (metric) | microsoft/bitnet-b1.58-2B-4T-bf16 | bitnet-dpo-merged-modelstock7 |
 |------------------------------------|-----------------------------------|--------------------------------|
 | arc_challenge 0 shot | 47.95 | **51.62** |
@@ -41,6 +43,7 @@ This is a merge of pre-trained language models created using [mergekit](https://
 | deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 34,9 |
 | openbmb/MiniCPM-2B-dpo-bf16 | 44,28 |
 | microsoft/bitnet-b1.58-2B-4T-bf16 (base model) | 47,95 |
+| microsoft/bitnet-b1.58-2B-4T | 49,91 |
 | jpacifico/bitnet-dpo-merged-modelstock7 | **51,62** |
 
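For context on how such a merge is produced: the hunk header notes the model is a merge of pre-trained language models created with mergekit, and the repository name (`bitnet-dpo-merged-modelstock7`) suggests the `model_stock` merge method. A minimal mergekit config of that shape might look like the sketch below; the `model_stock` choice is inferred from the name, and the listed DPO checkpoints are hypothetical placeholders, since this diff does not name the actual merged models.

```yaml
# Hypothetical mergekit config sketch. merge_method "model_stock" is assumed
# from the "modelstock" repo name; the checkpoint names under "models" are
# placeholders, not the actual merged models.
merge_method: model_stock
base_model: microsoft/bitnet-b1.58-2B-4T-bf16
models:
  - model: your-org/bitnet-dpo-checkpoint-1  # placeholder
  - model: your-org/bitnet-dpo-checkpoint-2  # placeholder
dtype: bfloat16
```

With mergekit installed, a config like this is typically run with `mergekit-yaml config.yaml ./output-dir`.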