jpacifico committed on
Commit 2ad0240 · verified · 1 Parent(s): 535f749

Update README.md

  1. README.md +18 -0
README.md CHANGED
@@ -16,6 +16,24 @@ This is a merge of pre-trained language models created using [mergekit](https://
 
  # First benchmarks
 
+ | Benchmark (metric)         | microsoft/bitnet-b1.58-2B-4T-bf16 | bitnet-dpo-merged-modelstock7 |
+ |----------------------------|-----------------------------------|-------------------------------|
+ | arc_challenge 0 shot       | 47.95                             | **51.62**                     |
+ | arc_easy 0 shot            | 73.44                             | **75.25**                     |
+ | hellaswag 0 shot           | 68.27                             | **68.52**                     |
+ | openbookqa 0 shot          | **41.6**                          | 41.4                          |
+ | boolq 0 shot               | **79.39**                         | 79.33                         |
+ | piqa 0 shot                | **77.86**                         | 77.53                         |
+ | winogrande 0 shot          | 70.64                             | **72.06**                     |
+ | ifeval 0 shot              | 41.85                             | **44.12**                     |
+ | triviaqa 0 shot            | 11.95                             | **15.06**                     |
+ | triviaqa 5 shot EM         | **33.51**                         | **33.51**                     |
+ | truthfulqa_mc2 10 shot     | 45.89                             | **46.52**                     |
+ | gsm8k 4 shot EM            | **62.4**                          | 59.67                         |
+ | mmlu 5 shot acc            | 52.96                             | **53.39**                     |
+ | commonsense_qa 10 shot acc | **71.17**                         | 70.76                         |
+
+
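As a quick sanity check on the comparison, the per-benchmark difference between the two models can be recomputed from the scores in the added table. This is a minimal sketch, not part of the commit: it assumes every metric is higher-is-better, and the names `scores` and `compare` are hypothetical helpers introduced only for illustration.

```python
# Scores copied from the added table: (baseline, merged) per benchmark.
# baseline = microsoft/bitnet-b1.58-2B-4T-bf16, merged = bitnet-dpo-merged-modelstock7
scores = {
    "arc_challenge 0 shot": (47.95, 51.62),
    "arc_easy 0 shot": (73.44, 75.25),
    "hellaswag 0 shot": (68.27, 68.52),
    "openbookqa 0 shot": (41.6, 41.4),
    "boolq 0 shot": (79.39, 79.33),
    "piqa 0 shot": (77.86, 77.53),
    "winogrande 0 shot": (70.64, 72.06),
    "ifeval 0 shot": (41.85, 44.12),
    "triviaqa 0 shot": (11.95, 15.06),
    "triviaqa 5 shot EM": (33.51, 33.51),
    "truthfulqa_mc2 10 shot": (45.89, 46.52),
    "gsm8k 4 shot EM": (62.4, 59.67),
    "mmlu 5 shot acc": (52.96, 53.39),
    "commonsense_qa 10 shot acc": (71.17, 70.76),
}


def compare(table):
    """Return {benchmark: (delta, winner)} with delta = merged - baseline.

    Assumption: a higher score is better for every benchmark listed.
    """
    result = {}
    for name, (base, merged) in table.items():
        delta = round(merged - base, 2)
        if delta > 0:
            winner = "merged"
        elif delta < 0:
            winner = "baseline"
        else:
            winner = "tie"
        result[name] = (delta, winner)
    return result


summary = compare(scores)
for name, (delta, winner) in summary.items():
    print(f"{name:28s} {delta:+6.2f}  {winner}")

# Tally how many rows each side wins under the higher-is-better assumption.
merged_wins = sum(1 for _, w in summary.values() if w == "merged")
baseline_wins = sum(1 for _, w in summary.values() if w == "baseline")
ties = sum(1 for _, w in summary.values() if w == "tie")
print(f"merged: {merged_wins}, baseline: {baseline_wins}, ties: {ties}")
```

The signed delta column makes it easy to see which rows are near-ties (for example boolq and openbookqa) and which show a larger gap (for example arc_challenge and gsm8k).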
  | Model | arc_challenge (0 shot) |
  |----------------------------------------------------|------------------------|
  | Qwen/Qwen3-1.7B | 43 |