Improve model card: Add pipeline tag, library name, and link to paper
Browse filesThis PR improves the model card for `RabotniKuma/Fast-Math-Qwen3-14B` by:
- Adding the `pipeline_tag: text-generation` so the model appears in relevant searches on the Hub.
- Adding `library_name: transformers` to indicate compatibility with the Hugging Face Transformers library, enabling the "how to use" widget.
- Updating the model card content to prominently link to the associated paper: "[A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning](https://huggingface.co/papers/2507.08267)".
README.md
CHANGED
|
@@ -1,15 +1,18 @@
|
|
| 1 |
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
base_model:
|
| 4 |
- Qwen/Qwen3-14B
|
|
|
|
|
|
|
|
|
|
| 5 |
---
|
|
|
|
| 6 |
# Fast-Math-Qwen3-14B
|
| 7 |
-
By applying SFT and GRPO on difficult math problems, we enhanced the performance of `DeepSeek-R1-Distill-Qwen-14B` and developed [`Fast-Math-R1-14B`](https://huggingface.co/RabotniKuma/Fast-Math-R1-14B),
|
| 8 |
-
which achieves approx. 30% faster inference on average, while maintaining accuracy.
|
| 9 |
|
| 10 |
-
|
|
|
|
|
|
|
| 11 |
|
| 12 |
-
|
| 13 |
|
| 14 |
Technical details can be found in [our github repository](https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/tree/master).
|
| 15 |
|
|
@@ -19,19 +22,19 @@ This model likely inherits the ability to perform inference in TIR mode from the
|
|
| 19 |
## Evaluation
|
| 20 |
<img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='400px'>
|
| 21 |
|
| 22 |
-
| | | AIME 2024 | | AIME 2025 | |
|
| 23 |
-
| ------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
|
| 24 |
-
| Model | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
|
| 25 |
-
| Qwen3-14B | 32000 | 79.3 | 13669 | 69.5 | 16481 |
|
| 26 |
-
| | 24000 | 75.9 | 13168 | 65.6 | 15235 |
|
| 27 |
-
| | 16000 | 64.5 | 11351 | 50.4 | 12522 |
|
| 28 |
-
| | 12000 | 49.7 | 9746 | 36.3 | 10353 |
|
| 29 |
-
| | 8000 | 28.4 | 7374 | 19.5 | 7485 |
|
| 30 |
-
| Fast-Math-Qwen3-14B | 32000 | 77.6 | 9740 | 66.6 | 12281 |
|
| 31 |
-
| | 24000 | 76.5 | 9634 | 65.3 | 11847 |
|
| 32 |
-
| | 16000 | 72.6 | 8793 | 60.1 | 10195 |
|
| 33 |
-
| | 12000 | 65.1 | 7775 | 49.4 | 8733 |
|
| 34 |
-
| | 8000 | 50.7 | 6260 | 36 | 6618 |
|
| 35 |
|
| 36 |
# Inference
|
| 37 |
## vLLM
|
|
@@ -55,9 +58,9 @@ sampling_params = SamplingParams(
|
|
| 55 |
)
|
| 56 |
messages = [
|
| 57 |
{
|
| 58 |
-
'role': 'user',
|
| 59 |
'content': (
|
| 60 |
-
'Solve the problem, and put the answer in
|
| 61 |
'Sarah is twice as old as her youngest brother. If the difference between their ages is 15 years. How old is her youngest brother?'
|
| 62 |
)
|
| 63 |
}
|
|
|
|
| 1 |
---
|
|
|
|
| 2 |
base_model:
|
| 3 |
- Qwen/Qwen3-14B
|
| 4 |
+
license: apache-2.0
|
| 5 |
+
pipeline_tag: text-generation
|
| 6 |
+
library_name: transformers
|
| 7 |
---
|
| 8 |
+
|
| 9 |
# Fast-Math-Qwen3-14B
|
|
|
|
|
|
|
| 10 |
|
| 11 |
+
**Fast-Math-Qwen3-14B** is an efficiency-optimized version of `Qwen3-14B`, developed following the two-stage recipe of Supervised Fine-Tuning (SFT) and Reinforcement Learning from Online Inference (GRPO) presented in the paper:
|
| 12 |
+
|
| 13 |
+
**[A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning](https://huggingface.co/papers/2507.08267)**
|
| 14 |
|
| 15 |
+
This model enables **approx. 65% faster inference on average, with minimal loss in performance**, compared to the base `Qwen3-14B`.
|
| 16 |
|
| 17 |
Technical details can be found in [our github repository](https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/tree/master).
|
| 18 |
|
|
|
|
| 22 |
## Evaluation
|
| 23 |
<img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='400px'>
|
| 24 |
|
| 25 |
+
| | | AIME 2024 | | AIME 2025 | |
|
| 26 |
+
| ------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
|
| 27 |
+
| Model | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
|
| 28 |
+
| Qwen3-14B | 32000 | 79.3 | 13669 | 69.5 | 16481 |
|
| 29 |
+
| | 24000 | 75.9 | 13168 | 65.6 | 15235 |
|
| 30 |
+
| | 16000 | 64.5 | 11351 | 50.4 | 12522 |
|
| 31 |
+
| | 12000 | 49.7 | 9746 | 36.3 | 10353 |
|
| 32 |
+
| | 8000 | 28.4 | 7374 | 19.5 | 7485 |
|
| 33 |
+
| Fast-Math-Qwen3-14B | 32000 | 77.6 | 9740 | 66.6 | 12281 |
|
| 34 |
+
| | 24000 | 76.5 | 9634 | 65.3 | 11847 |
|
| 35 |
+
| | 16000 | 72.6 | 8793 | 60.1 | 10195 |
|
| 36 |
+
| | 12000 | 65.1 | 7775 | 49.4 | 8733 |
|
| 37 |
+
| | 8000 | 50.7 | 6260 | 36 | 6618 |
|
| 38 |
|
| 39 |
# Inference
|
| 40 |
## vLLM
|
|
|
|
| 58 |
)
|
| 59 |
messages = [
|
| 60 |
{
|
| 61 |
+
'role': 'user',
|
| 62 |
'content': (
|
| 63 |
+
'Solve the problem, and put the answer in \\boxed{{}}. '
|
| 64 |
'Sarah is twice as old as her youngest brother. If the difference between their ages is 15 years. How old is her youngest brother?'
|
| 65 |
)
|
| 66 |
}
|