laineyyy committed on
Commit a615f9d · verified · 1 Parent(s): 462fbc6

Update README.md

  1. README.md +19 -10
README.md CHANGED
@@ -85,20 +85,29 @@ Poro 2 8B was trained on a balanced 165B token dataset designed to maintain Engl
 Poro 2 8B shows substantial improvements in Finnish capabilities over Llama 3.1 8B, while maintaining English performance:
 
 ### Finnish Performance
-- ARC Challenge (Finnish): 48.90 (vs 38.82 baseline)
-- HellaSwag (Finnish): 50.49 (vs 30.97 baseline)
-- MMLU (Finnish): 56.25 (vs 49.64 baseline)
-- TruthfulQA (Finnish): 49.78 (vs 45.54 baseline)
+| | Poro 2 8B | Llama 3.1 8B |
+|-----------------|------------------|----------------|
+| ARC Challenge | **48.90** | 38.82 |
+| HellaSwag | **50.49** | 30.97 |
+| MMLU | **56.25** | 49.64 |
+| TruthfulQA | **49.78** | 45.54 |
+
 
 ### English Performance
-- ARC Challenge: 60.75 (vs 57.94 baseline)
-- HellaSwag: 80.55 (vs 80.05 baseline)
-- MMLU: 63.48 (vs 65.08 baseline)
-- TruthfulQA: 48.06 (vs 54.02 baseline)
+| | Poro 2 8B | Llama 3.1 8B |
+|-----------------|--------|----------------|
+| ARC Challenge | **60.75** | 57.94 |
+| HellaSwag | **80.55** | 80.05 |
+| MMLU | 63.48 | **65.08** |
+| TruthfulQA | 48.06 | **54.02** |
+
 
 ### Translation Performance
-- FLORES-200 EN→FI BLEU: 36.48 (vs 23.92 baseline)
-- FLORES-200 FI→EN BLEU: 40.71 (vs 37.42 baseline)
+| | Poro 2 8B | Llama 3.1 8B |
+|--------------------|--------|----------------|
+| EN→FI BLEU | **36.48** | 23.92 |
+| FI→EN BLEU | **40.71** | 37.42 |
+
 
 **Overall**: ~10 percentage point average improvement in Finnish benchmarks with only ~1 percentage point decrease in English performance.
 
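The **Overall** line can be sanity-checked against the per-benchmark scores in the diff. The sketch below is illustrative only: it assumes the summary is a plain unweighted mean over the four benchmarks listed for each language, which the README does not state, and the names `finnish`, `english`, and `mean_delta` are made up for this example.

```python
# Scores copied from the README tables as (Poro 2 8B, Llama 3.1 8B) pairs.
finnish = {
    "ARC Challenge": (48.90, 38.82),
    "HellaSwag": (50.49, 30.97),
    "MMLU": (56.25, 49.64),
    "TruthfulQA": (49.78, 45.54),
}
english = {
    "ARC Challenge": (60.75, 57.94),
    "HellaSwag": (80.55, 80.05),
    "MMLU": (63.48, 65.08),
    "TruthfulQA": (48.06, 54.02),
}


def mean_delta(scores):
    """Unweighted mean of (Poro 2 8B - Llama 3.1 8B), in percentage points.

    Assumption: equal weight per benchmark; the README does not say how
    its "Overall" averages were computed.
    """
    deltas = [poro - llama for poro, llama in scores.values()]
    return sum(deltas) / len(deltas)


finnish_delta = mean_delta(finnish)
english_delta = mean_delta(english)
print(f"Finnish mean change: {finnish_delta:+.2f} pp")
print(f"English mean change: {english_delta:+.2f} pp")
```

Under that assumption the means come out at roughly +10.1 points for Finnish and -1.1 points for English, in line with the "~10" and "~1" figures quoted in the summary line.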