Update README.md
README.md
CHANGED
@@ -16,6 +16,24 @@ This is a merge of pre-trained language models created using [mergekit](https://

 # First benchmarks

+| Benchmark (metric)                 | microsoft/bitnet-b1.58-2B-4T-bf16 | bitnet-dpo-merged-modelstock7  |
+|------------------------------------|-----------------------------------|--------------------------------|
+| arc_challenge 0 shot               | 47.95                             | **51.62**                      |
+| arc_easy 0 shot                    | 73.44                             | **75.25**                      |
+| hellaswag 0 shot                   | 68.27                             | **68.52**                      |
+| openbookqa 0 shot                  | **41.6**                          | 41.4                           |
+| boolq 0 shot                       | **79.39**                         | 79.33                          |
+| piqa 0 shot                        | **77.86**                         | 77.53                          |
+| winogrande 0 shot                  | 70.64                             | **72.06**                      |
+| ifeval 0 shot                      | 41.85                             | **44.12**                      |
+| triviaqa 0 shot                    | 11.95                             | **15.06**                      |
+| triviaqa 5 shot EM                 | 33.51                             | 33.51                          |
+| truthfulqa_mc2 10 shot             | 45.89                             | **46.52**                      |
+| gsm8k 4 shot EM                    | **62.4**                          | 59.67                          |
+| mmlu 5 shot acc                    | 52.96                             | **53.39**                      |
+| commonsense_qa 10 shot acc         | **71.17**                         | 70.76                          |
+
+
 | Model                                              | arc_challenge (0 shot) |
 |----------------------------------------------------|------------------------|
 | Qwen/Qwen3-1.7B                                    | 43                     |