RabotniKuma and nielsr (HF Staff) committed on
Commit f77e2a1 · verified · 1 Parent(s): e5c9e74

Improve model card: Add pipeline tag, library name, and link to paper (#1)

- Improve model card: Add pipeline tag, library name, and link to paper (e440cec9ca1c11fe4f7856aa4aed97790256905b)

Co-authored-by: Niels Rogge <[email protected]>

Files changed (1):
1. README.md (+23 -20)
README.md CHANGED

@@ -1,15 +1,18 @@
 ---
-license: apache-2.0
 base_model:
 - Qwen/Qwen3-14B
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
 ---
+
 # Fast-Math-Qwen3-14B
-By applying SFT and GRPO on difficult math problems, we enhanced the performance of `DeepSeek-R1-Distill-Qwen-14B` and developed [`Fast-Math-R1-14B`](https://huggingface.co/RabotniKuma/Fast-Math-R1-14B),
-which achieves approx. 30% faster inference on average, while maintaining accuracy.
-
-In addition, we trained and open-sourced `Fast-Math-Qwen3-14B`, an efficiency-optimized version of `Qwen3-14B`, following the same approach.
-
-**Compared to Qwen3-14B, this model enables approx. 65% faster inference on average, with minimal loss in performance.**
+
+**Fast-Math-Qwen3-14B** is an efficiency-optimized version of `Qwen3-14B`, developed following the two-stage recipe of Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO) presented in the paper:
+
+**[A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning](https://huggingface.co/papers/2507.08267)**
+
+This model enables **approx. 65% faster inference on average, with minimal loss in performance**, compared to the base `Qwen3-14B`.
 
 Technical details can be found in [our github repository](https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/tree/master).

@@ -19,19 +22,19 @@ This model likely inherits the ability to perform inference in TIR mode from the
 ## Evaluation
 <img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='400px'>
 
 |                     |              | AIME 2024        |                    | AIME 2025        |                    |
 | ------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
 | Model               | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
 | Qwen3-14B           | 32000        | 79.3             | 13669              | 69.5             | 16481              |
 |                     | 24000        | 75.9             | 13168              | 65.6             | 15235              |
 |                     | 16000        | 64.5             | 11351              | 50.4             | 12522              |
 |                     | 12000        | 49.7             | 9746               | 36.3             | 10353              |
 |                     | 8000         | 28.4             | 7374               | 19.5             | 7485               |
 | Fast-Math-Qwen3-14B | 32000        | 77.6             | 9740               | 66.6             | 12281              |
 |                     | 24000        | 76.5             | 9634               | 65.3             | 11847              |
 |                     | 16000        | 72.6             | 8793               | 60.1             | 10195              |
 |                     | 12000        | 65.1             | 7775               | 49.4             | 8733               |
 |                     | 8000         | 50.7             | 6260               | 36.0             | 6618               |
 
 # Inference
 ## vLLM

@@ -55,9 +58,9 @@ sampling_params = SamplingParams(
 )
 messages = [
     {
         'role': 'user',
         'content': (
-            'Solve the problem, and put the answer in \boxed{{}}. '
+            'Solve the problem, and put the answer in \\boxed{{}}. '
             'Sarah is twice as old as her youngest brother. If the difference between their ages is 15 years. How old is her youngest brother?'
         )
     }
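
The evaluation table in the card reports mean output tokens for both models at each token budget, which is a rough proxy for inference cost. As an illustrative sanity check (this script and its variable names are not part of the repository, only the numbers come from the card's AIME 2024 columns), the fractional reduction in mean output tokens at each budget can be computed as follows:

```python
# Mean output tokens on AIME 2024, copied from the evaluation table.
qwen3 = {32000: 13669, 24000: 13168, 16000: 11351, 12000: 9746, 8000: 7374}
fast = {32000: 9740, 24000: 9634, 16000: 8793, 12000: 7775, 8000: 6260}

def token_reduction(budget: int) -> float:
    """Fractional reduction in mean output tokens for Fast-Math-Qwen3-14B
    relative to Qwen3-14B at the given token budget."""
    return 1 - fast[budget] / qwen3[budget]

for budget in sorted(qwen3, reverse=True):
    print(f"budget {budget}: {token_reduction(budget):.1%} fewer tokens")
```

Note that the token reduction alone (roughly 15-29% depending on budget) does not directly equal the claimed ~65% average speedup, which presumably also reflects factors such as batching and early termination in the authors' measurement setup.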