Jasaxion committed (verified) 037f885 · 1 parent: b5be52c

Update README.md

Files changed (1): README.md (+26 −40)

README.md CHANGED
@@ -9,53 +9,39 @@ tags:
  model-index:
  - name: MathSmith-Qwen3-8B-LongCoT
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # MathSmith-Qwen3-8B-LongCoT

- This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the mathsmith-hc-longcot-50k dataset.

- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 1
- - eval_batch_size: 8
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 16
- - total_eval_batch_size: 64
- - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 5.0
-
- ### Training results
-
- ### Framework versions

  - Transformers 4.52.4
  - Pytorch 2.7.0+cu126
  - Datasets 3.6.0
  - Tokenizers 0.21.1
  model-index:
  - name: MathSmith-Qwen3-8B-LongCoT
    results: []
+ datasets:
+ - Jasaxion/MathSmith-HC-Problems
+ - Jasaxion/MathSmith-HC-Solution-Generation-LongCoT-Qwen3-30B-A3B
+ language:
+ - en
  ---

+ **MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy**

+ [![Paper](https://img.shields.io/badge/arXiv-2508.05592-b31b1b.svg)](https://arxiv.org/abs/2508.05592)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](LICENSE)
+ [![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)]()
+ [![GitHub](https://img.shields.io/badge/-GitHub-181717?logo=github)](https://github.com/Jasaxion/MathSmith)

+ This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B), trained in the MathSmith-HC long-CoT setting.

+ See details at https://github.com/Jasaxion/MathSmith

+ ## Dependencies
  - Transformers 4.52.4
  - Pytorch 2.7.0+cu126
  - Datasets 3.6.0
  - Tokenizers 0.21.1
+
+ ## Citation
+
+ If you find this work useful, please cite:
+
+ ```bibtex
+ @article{zhan2025mathsmith,
+   title={MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy},
+   author={Zhan, Shaoxiong and Lai, Yanlin and Lu, Ziyu and Lin, Dahua and Yang, Ziqing and Tan, Fei},
+   journal={arXiv preprint arXiv:2508.05592},
+   year={2025}
+ }
+ ```
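The updated card stops short of a usage example. Below is a minimal inference sketch; the Hub id `Jasaxion/MathSmith-Qwen3-8B-LongCoT` and the single-turn prompt shape are assumptions (verify the exact repo id on the Hub before use), and the heavy `transformers` import is kept inside the function so the prompt helper can be used standalone:

```python
# Assumed Hub repo id for this card's checkpoint -- verify before use.
MODEL_ID = "Jasaxion/MathSmith-Qwen3-8B-LongCoT"


def build_messages(problem: str) -> list[dict]:
    """Wrap a math problem as a single-turn chat for the Qwen3 chat template."""
    return [{"role": "user", "content": problem}]


def solve(problem: str, max_new_tokens: int = 2048) -> str:
    """Generate a long chain-of-thought solution with the fine-tuned model."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Render the chat messages into the model's prompt format.
    text = tokenizer.apply_chat_template(
        build_messages(problem), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )


# Example call (downloads the checkpoint; a GPU is strongly recommended):
# print(solve("Compute the sum of the first 100 positive integers."))
```

Long-CoT models emit extended reasoning before the final answer, so a generous `max_new_tokens` budget is deliberate here.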