Commit 5b9270d by Shengkun (verified) · 1 parent: 5d66b4d

Update README.md

  1. README.md +29 -2
README.md CHANGED
@@ -28,9 +28,36 @@ model = AutoModelForCausalLM.from_pretrained("Shengkun/DarwinLM-2.7B-Pruned", tr
  | Method | Param. | SciQ | PIQA | WG | ArcE | ArcC | HS | LogiQA | BoolQ | Avg |
  |----------------------------|--------|------|------|------|------|------|------|--------|-------|------|
  | **Dense** | 6.7B | 93.7 | 78.1 | 69.3 | 76.4 | 53.0 | 78.6 | 30.7 | 77.7 | 69.2 |
- | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
+ | **Uniform** | 3.4B | 44.1 | 57.1 | 53.3 | 33.5 | 32.2 | 27.3 | 25.0 | 49.0 | 40.1 |
+ | **ZipLM** | 4.0B | 87.4 | 64.4 | 58.3 | 53.2 | 33.6 | 50.1 | 25.5 | 63.6 | 54.5 |
+ | **ShearedLLama** | 2.7B | 84.5 | 66.4 | 53.4 | 49.8 | 28.4 | 47.6 | 27.6 | 50.9 | 51.0 |
+ | *DarwinLM (one-shot)* | 2.7B | 85.6 | 70.8 | 55.8 | 63.3 | 38.1 | 53.2 | 28.5 | 62.7 | 57.2 |
+ | **ShearedLLama (50B)** | 2.7B | 90.8 | 75.8 | 64.2 | 67.0 | 41.2 | 70.8 | 28.2 | 63.0 | 62.6 |
+ | **ShearedLLama (10B†)** | 2.7B | 92.0 | 73.6 | 63.1 | 69.8 | 42.0 | 64.4 | 29.0 | 62.1 | 61.9 |
+ | *DarwinLM (10B)* | 2.6B | 90.8 | 72.2 | 65.1 | 68.5 | 45.0 | 67.2 | 28.5 | 64.6 | 62.8 |
+
+ **4.6B**
+
+ | Model | Method | Param. | SciQ | PIQA | WG | ArcE | ArcC | HS | LogiQA | BoolQ | MMLU | Avg |
+ |-----------------|------------------------|--------|------|------|------|------|------|------|--------|-------|------|------|
+ | **Llama-3.1-8B** | **Dense** | 8B | 96.3 | 81.2 | 74.3 | 81.4 | 58.2 | 81.7 | 31.1 | 84.0 | 65.2 | 72.8 |
+ | | **Uniform** | 4.5B | 29.1 | 53.6 | 51.7 | 26.0 | 23.6 | 27.1 | 25.5 | 62.1 | 25.7 | 36.1 |
+ | | **ZipLM** | 6B | 65.5 | 60.6 | 56.0 | 40.2 | 34.4 | 34.4 | 28.1 | 63.0 | 27.9 | 45.7 |
+ | | *DarwinLM (one-shot)* | 4.6B | 84.9 | 69.4 | 57.3 | 59.6 | 34.2 | 44.6 | 24.1 | 62.2 | 28.5 | 51.6 |
+ | | **OLMO (2.5T)** | 7B | 92.8 | 79.4 | 70.4 | 73.3 | 44.9 | 77.1 | 27.9 | 72.5 | 28.3 | 62.9 |
+ | | *DarwinLM (10.0B)* | 4.6B | 93.2 | 74.8 | 67.4 | 73.2 | 51.6 | 71.3 | 30.7 | 71.1 | 40.6 | 63.7 |
+
+ **8.4B**
+
+ | Model | Method | Param. | SciQ | PIQA | WG | ArcE | ArcC | HS | LogiQA | BoolQ | MMLU | Avg |
+ |---------------------------|------------------------|--------|------|------|------|------|------|------|--------|-------|------|------|
+ | **Qwen-2.5-14B-Instruct** | **Dense** | 14B | 96.8 | 81.9 | 79.1 | 85.7 | 72.8 | 85.1 | 38.5 | 87.9 | 80.0 | 78.6 |
+ | | **Uniform** | 8.6B | 78.2 | 72.7 | 57.6 | 76.1 | 45.6 | 47.0 | 28.1 | 61.6 | 45.5 | 56.9 |
+ | | **ZipLM** | 8.5B | 69.0 | 66.4 | 52.8 | 60.1 | 38.3 | 43.3 | 29.6 | 60.2 | 25.0 | 49.4 |
+ | | *DarwinLM (one-shot)* | 8.4B | 84.3 | 73.9 | 60.5 | 75.7 | 48.0 | 53.3 | 29.3 | 66.9 | 43.1 | 59.4 |
+ | | **OLMO-0424 (2.05T)** | 7B | 96.1 | 80.1 | 72.1 | 73.8 | 49.2 | 78.0 | 29.3 | 80.8 | 52.1 | 67.9 |
+ | | *DarwinLM (10.0B)* | 8.4B | 89.5 | 78.1 | 70.7 | 79.6 | 57.6 | 74.9 | 33.5 | 73.9 | 57.9 | 68.4 |
 
- **(Results for 4.6B and 8.4B)**
 
  ## Installation
 