Poro 2 8B shows substantial improvements in Finnish capabilities over Llama 3.1 8B, while maintaining English performance:

### Finnish Performance

| Benchmark     | Poro 2 8B | Llama 3.1 8B |
|---------------|-----------|--------------|
| ARC Challenge | **48.90** | 38.82        |
| HellaSwag     | **50.49** | 30.97        |
| MMLU          | **56.25** | 49.64        |
| TruthfulQA    | **49.78** | 45.54        |

### English Performance

| Benchmark     | Poro 2 8B | Llama 3.1 8B |
|---------------|-----------|--------------|
| ARC Challenge | **60.75** | 57.94        |
| HellaSwag     | **80.55** | 80.05        |
| MMLU          | 63.48     | **65.08**    |
| TruthfulQA    | 48.06     | **54.02**    |

### Translation Performance

| Direction  | Poro 2 8B | Llama 3.1 8B |
|------------|-----------|--------------|
| EN→FI BLEU | **36.48** | 23.92        |
| FI→EN BLEU | **40.71** | 37.42        |

**Overall**: ~10 percentage point average improvement in Finnish benchmarks with only ~1 percentage point decrease in English performance.
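The headline averages follow directly from the tables above; a minimal sketch of the arithmetic:

```python
# Check the summary claim against the per-benchmark scores:
# ~10 pp average Finnish gain, ~1 pp average English drop.
finnish = {"ARC Challenge": (48.90, 38.82), "HellaSwag": (50.49, 30.97),
           "MMLU": (56.25, 49.64), "TruthfulQA": (49.78, 45.54)}
english = {"ARC Challenge": (60.75, 57.94), "HellaSwag": (80.55, 80.05),
           "MMLU": (63.48, 65.08), "TruthfulQA": (48.06, 54.02)}

def avg_delta(scores):
    # Mean of (Poro 2 8B - Llama 3.1 8B), in percentage points.
    return sum(poro - llama for poro, llama in scores.values()) / len(scores)

print(f"Finnish: {avg_delta(finnish):+.2f} pp")  # +10.11 pp
print(f"English: {avg_delta(english):+.2f} pp")  # -1.06 pp
```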