sedrickkeh committed on
Commit e946b12 · verified · 1 Parent(s): c74d502

update table

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -26,13 +26,13 @@ The [OpenThinker2-32B](https://huggingface.co/open-thoughts/OpenThinker2-32B) mo
 This model improves upon our previous [OpenThinker-32B](https://huggingface.co/open-thoughts/OpenThinker-32B) model, which was trained on 114k examples from [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/open-thoughts-114k).
 The numbers reported in the table below are evaluated with our open-source tool [Evalchemy](https://github.com/mlfoundations/Evalchemy).
 
-| Model | Open Data? | Avg | AIME24 | AIME25 | AMC23 | MATH500 | GPQA-D | LCBv2 |
-| ---------------- | ---------- | ---- | ------ | ------ | ----- | ------- | ------ | ----- |
-| OpenThinker-32B | ✅ | 72.6 | 68.0 | 49.3 | 95.5 | 90.6 | 63.5 | 68.6 |
-| OpenThinker2-32B | ✅ | 76.1 | 76.7 | 58.7 | 94.0 | 90.8 | 64.1 | 72.5 |
-| R1-Distill-32B | ❌ | 74.9 | 74.7 | 50.0 | 96.5 | 90.0 | 65.8 | 72.3 |
-| Light-R1-32B | ✅ | 72.9 | 74.7 | 58.0 | 96.0 | 90.4 | 62.0 | 56.0 |
-| QwQ-32B | | 80.9 | 78.0 | 62.0 | 98.0 | 91.6 | 66.3 | 89.2 |
+| Model | Data | AIME24 | AIME25 | AMC23 | MATH500 | GPQA-D | LCBv2 |
+| ----------------------------------------------------------------------------------------------- | ---- | ------ | ------ | ----- | ------- | ------ | ----- |
+| [OpenThinker2-32B](https://huggingface.co/open-thoughts/OpenThinker2-32B) | ✅ | 76.7 | 58.7 | 94.0 | 90.8 | 64.1 | 72.5 |
+| [OpenThinker-32B](https://huggingface.co/open-thoughts/OpenThinker-32B) | ✅ | 68.0 | 49.3 | 95.5 | 90.6 | 63.5 | 68.6 |
+| [DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) | ❌ | 74.7 | 50.0 | 96.5 | 90.0 | 65.8 | 72.3 |
+| [Light-R1-32B](https://huggingface.co/qihoo360/Light-R1-32B) | ✅ | 74.7 | 58.0 | 96.0 | 90.4 | 62.0 | 56.0 |
+| [S1.1-32B](https://huggingface.co/simplescaling/s1.1-32B) | | 59.3 | 42.7 | 91.5 | 87.4 | 62.0 | 58.7 |
 
 
 ## Data