Improve model card: Add pipeline tag, library name, and link to paper

This PR improves the model card for `RabotniKuma/Fast-Math-Qwen3-14B` by:
- Adding the `pipeline_tag: text-generation` so the model appears in relevant searches on the Hub.
- Adding `library_name: transformers` to indicate compatibility with the Hugging Face Transformers library, enabling the "how to use" widget.
- Updating the model card content to prominently link to the associated paper: "[A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning](https://huggingface.co/papers/2507.08267)".

Files changed (1) hide show

README.md +23 -20

README.md CHANGED Viewed

@@ -1,15 +1,18 @@
 ---
-license: apache-2.0
 base_model:
 - Qwen/Qwen3-14B
 ---
 # Fast-Math-Qwen3-14B
-By applying SFT and GRPO on difficult math problems, we enhanced the performance of `DeepSeek-R1-Distill-Qwen-14B` and developed [`Fast-Math-R1-14B`](https://huggingface.co/RabotniKuma/Fast-Math-R1-14B),
-which achieves approx. 30% faster inference on average, while maintaining accuracy.
-In addition, we trained and open-sourced `Fast-Math-Qwen3-14B`, an efficiency-optimized version of Qwen3-14B`, following the same approach.
-**Compared to Qwen3-14B, this model enables approx. 65% faster inference on average, with minimal loss in performance.**
 Technical details can be found in [our github repository](https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/tree/master).
@@ -19,19 +22,19 @@ This model likely inherits the ability to perform inference in TIR mode from the
 ## Evaluation
 <img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='400px'>
-|                     |              | AIME 2024        |                    | AIME 2025        |                    |
-| ------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
-| Model               | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
-| Qwen3-14B           | 32000        | 79.3             | 13669              | 69.5             | 16481              |
-|                     | 24000        | 75.9             | 13168              | 65.6             | 15235              |
-|                     | 16000        | 64.5             | 11351              | 50.4             | 12522              |
-|                     | 12000        | 49.7             | 9746               | 36.3             | 10353              |
-|                     | 8000         | 28.4             | 7374               | 19.5             | 7485               |
-| Fast-Math-Qwen3-14B | 32000        | 77.6             | 9740               | 66.6             | 12281              |
-|                     | 24000        | 76.5             | 9634               | 65.3             | 11847              |
-|                     | 16000        | 72.6             | 8793               | 60.1             | 10195              |
-|                     | 12000        | 65.1             | 7775               | 49.4             | 8733               |
-|                     | 8000         | 50.7             | 6260               | 36               | 6618               |
 # Inference
 ## vLLM
@@ -55,9 +58,9 @@ sampling_params = SamplingParams(
 )
 messages = [
     {
-        'role': 'user',
         'content': (
-            'Solve the problem, and put the answer in \boxed{{}}. '
             'Sarah is twice as old as her youngest brother. If the difference between their ages is 15 years. How old is her youngest brother?'
         )
     }

 ---
 base_model:
 - Qwen/Qwen3-14B
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
 ---
 # Fast-Math-Qwen3-14B
+**Fast-Math-Qwen3-14B** is an efficiency-optimized version of `Qwen3-14B`, developed following the two-stage recipe of Supervised Fine-Tuning (SFT) and Reinforcement Learning from Online Inference (GRPO) presented in the paper:
+**[A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning](https://huggingface.co/papers/2507.08267)**
+This model enables **approx. 65% faster inference on average, with minimal loss in performance**, compared to the base `Qwen3-14B`.
 Technical details can be found in [our github repository](https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/tree/master).
 ## Evaluation
 <img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='400px'>
+|                     |              | AIME 2024        |                    | AIME 2025        |                    |
+| ------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
+| Model               | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
+| Qwen3-14B           | 32000        | 79.3             | 13669              | 69.5             | 16481              |
+|                     | 24000        | 75.9             | 13168              | 65.6             | 15235              |
+|                     | 16000        | 64.5             | 11351              | 50.4             | 12522              |
+|                     | 12000        | 49.7             | 9746               | 36.3             | 10353              |
+|                     | 8000         | 28.4             | 7374               | 19.5             | 7485               |
+| Fast-Math-Qwen3-14B | 32000        | 77.6             | 9740               | 66.6             | 12281              |
+|                     | 24000        | 76.5             | 9634               | 65.3             | 11847              |
+|                     | 16000        | 72.6             | 8793               | 60.1             | 10195              |
+|                     | 12000        | 65.1             | 7775               | 49.4             | 8733               |
+|                     | 8000         | 50.7             | 6260               | 36               | 6618               |
 # Inference
 ## vLLM
 )
 messages = [
     {
+        'role': 'user',
         'content': (
+            'Solve the problem, and put the answer in \\boxed{{}}. '
             'Sarah is twice as old as her youngest brother. If the difference between their ages is 15 years. How old is her youngest brother?'
         )
     }