Jasaxion committed (verified) 037f885 · 1 parent: b5be52c

Update README.md

Files changed (1): README.md (+26 −40)

README.md CHANGED
@@ -9,53 +9,39 @@ tags:
  model-index:
  - name: MathSmith-Qwen3-8B-LongCoT
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # MathSmith-Qwen3-8B-LongCoT

- This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the mathsmith-hc-longcot-50k dataset.

- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 1
- - eval_batch_size: 8
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 16
- - total_eval_batch_size: 64
- - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 5.0
-
- ### Training results
-
- ### Framework versions

  - Transformers 4.52.4
  - Pytorch 2.7.0+cu126
  - Datasets 3.6.0
  - Tokenizers 0.21.1
  model-index:
  - name: MathSmith-Qwen3-8B-LongCoT
    results: []
+ datasets:
+ - Jasaxion/MathSmith-HC-Problems
+ - Jasaxion/MathSmith-HC-Solution-Generation-LongCoT-Qwen3-30B-A3B
+ language:
+ - en
  ---

+ **MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy**

+ [![Paper](https://img.shields.io/badge/arXiv-2508.05592-b31b1b.svg)](https://arxiv.org/abs/2508.05592)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](LICENSE)
+ [![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)]()
+ [![GitHub](https://img.shields.io/badge/-GitHub-181717?logo=github)](https://github.com/Jasaxion/MathSmith)

+ This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B), trained in the MathSmith-HC long-CoT setting.

+ See details at https://github.com/Jasaxion/MathSmith

+ ## Dependencies
  - Transformers 4.52.4
  - Pytorch 2.7.0+cu126
  - Datasets 3.6.0
  - Tokenizers 0.21.1
+
+ ## Citation
+
+ If you find this work useful, please cite:
+
+ ```bibtex
+ @article{zhan2025mathsmith,
+   title={MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy},
+   author={Zhan, Shaoxiong and Lai, Yanlin and Lu, Ziyu and Lin, Dahua and Yang, Ziqing and Tan, Fei},
+   journal={arXiv preprint arXiv:2508.05592},
+   year={2025}
+ }
+ ```
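The updated card stops short of a usage example. Below is a minimal inference sketch; the Hub id `Jasaxion/MathSmith-Qwen3-8B-LongCoT` and the single-turn prompt shape are assumptions (verify the exact repo id on the Hub before use), and the heavy `transformers` import is kept inside the function so the prompt helper can be used standalone:

```python
# Assumed Hub repo id for this card's checkpoint -- verify before use.
MODEL_ID = "Jasaxion/MathSmith-Qwen3-8B-LongCoT"


def build_messages(problem: str) -> list[dict]:
    """Wrap a math problem as a single-turn chat for the Qwen3 chat template."""
    return [{"role": "user", "content": problem}]


def solve(problem: str, max_new_tokens: int = 2048) -> str:
    """Generate a long chain-of-thought solution with the fine-tuned model."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Render the chat messages into the model's prompt format.
    text = tokenizer.apply_chat_template(
        build_messages(problem), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )


# Example call (downloads the checkpoint; a GPU is strongly recommended):
# print(solve("Compute the sum of the first 100 positive integers."))
```

Long-CoT models emit extended reasoning before the final answer, so a generous `max_new_tokens` budget is deliberate here.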