---
base_model:
- HuggingFaceTB/SmolLM2-1.7B-Instruct
datasets:
- jjzha/fs1-2708
language:
- en
library_name: transformers
license: mit
pipeline_tag: text-generation
tags:
- en
- factuality
- thinking
- reasoning
---

## Model Details

**SmolLM2-1.7B-Instruct-rt-2708** is a 1.7B-parameter language model for English text generation. It builds on [HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct) and is further fine-tuned on the [jjzha/fs1-2708](https://huggingface.co/datasets/jjzha/fs1-2708) dataset, with a focus on enhancing factual reasoning in generated text.

### Model Developers

This model was fine-tuned by the authors of the accompanying paper (see the Research Paper section below) using the Hugging Face Transformers library.

### Variations

This is a fine-tuned version of the `HuggingFaceTB/SmolLM2-1.7B-Instruct` model. No additional variants or intermediate checkpoints are currently provided.

### Input

Text only.

### Output

Text only.

### Model Architecture

The model is an auto-regressive, transformer-based language model, fine-tuned with supervised learning to improve instruction-following and reasoning capabilities in English.

### Model Dates

Fine-tuning was performed between February and April 2025. The base instruct model, SmolLM2-1.7B-Instruct, was originally released by Hugging Face (HuggingFaceTB).

### License

This model is released under the [MIT license](https://opensource.org/licenses/MIT).

### Research Paper

[Scaling Reasoning can Improve Factuality in Large Language Models](https://huggingface.co/papers/2505.11140)

## Intended Use & Limitations

### Intended Use Cases

This model is intended for English text generation tasks that require improved factual accuracy and reasoning. It is suitable for research, experimentation, and development of assistant-like chat applications.

The instruction-tuned base model, SmolLM2-1.7B-Instruct, ships with a chat template, and this fine-tuned version preserves that format; a minimal usage sketch is shown below.

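The snippet below shows one way to query the model with Hugging Face Transformers. The repository id `jjzha/SmolLM2-1.7B-Instruct-rt-2708` is an assumption based on the model name, and the generation settings are illustrative; it also assumes the checkpoint keeps the base model's chat template.

```python
# Minimal inference sketch (assumed repo id and illustrative generation settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jjzha/SmolLM2-1.7B-Instruct-rt-2708"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Which country has the longest coastline in the world?"}
]
# Reuse the chat template inherited from SmolLM2-1.7B-Instruct.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
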
### Limitations

Despite improvements, the model may still produce factually incorrect or logically inconsistent outputs. It is not recommended for high-stakes decision-making applications without human oversight. Always verify generated content before relying on it in critical scenarios.

## Hardware and Software

### Training Factors

Fine-tuning was performed using the Hugging Face Transformers library and PyTorch FSDP, on a multi-node, multi-GPU setup with AMD MI250x GPUs. A configuration sketch is given below.

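As a rough illustration only, the following is a minimal sketch of supervised fine-tuning with FSDP through the `Trainer` API. The hyperparameters, dataset preprocessing, and launch details are assumptions, not the authors' exact recipe; the actual training code is in the fs1 repository linked at the bottom of this card.

```python
# Hypothetical FSDP fine-tuning sketch; launch with torchrun/accelerate across nodes.
# All hyperparameters and the dataset column name are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id)

raw = load_dataset("jjzha/fs1-2708", split="train")

def tokenize(example):
    # The "text" column is an assumption; adapt to the dataset's actual schema.
    return tokenizer(example["text"], truncation=True, max_length=4096)

train_ds = raw.map(tokenize, remove_columns=raw.column_names)

args = TrainingArguments(
    output_dir="SmolLM2-1.7B-Instruct-rt-2708",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    learning_rate=1e-5,
    bf16=True,
    # Shard parameters, gradients, and optimizer state across all GPUs.
    fsdp="full_shard auto_wrap",
    # SmolLM2 uses a Llama-style architecture, so wrap at the decoder-layer level.
    fsdp_config={"transformer_layer_cls_to_wrap": ["LlamaDecoderLayer"]},
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```
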
### Carbon Footprint

We only have aggregate statistics across all fine-tuned models and inference runs. A cumulative 6,500 GPU hours of computation was performed on AMD MI250x GPU modules, which have a TDP of 500 W each. The experiments were run from February to April 2025, during which the average carbon efficiency in Finland was 0.085 kg CO₂eq/kWh. This means we released about 276 kg of CO₂ equivalent, as worked out below.
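
The estimate follows the usual GPU-hours × TDP × carbon-efficiency calculation; the snippet below simply reproduces the arithmetic from the numbers above.

```python
# Reproduce the CO2 estimate from the aggregate statistics above.
gpu_hours = 6_500         # total MI250x GPU hours
tdp_kw = 500 / 1000       # 500 W TDP per module, in kW
carbon_eff = 0.085        # kg CO2eq per kWh (Finland, Feb-Apr 2025)

energy_kwh = gpu_hours * tdp_kw        # 3,250 kWh
co2_kg = energy_kwh * carbon_eff       # ~276 kg CO2eq
print(f"{energy_kwh:.0f} kWh -> {co2_kg:.0f} kg CO2eq")
```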

## Training Data

### Overview

Fine-tuning was performed on the [jjzha/fs1-2708](https://huggingface.co/datasets/jjzha/fs1-2708) dataset, which focuses on enhancing reasoning and factual accuracy. It can be inspected as shown below.
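
For a quick look at the training data, the snippet below loads the dataset with the Hugging Face Datasets library; the `train` split name is an assumption, so check the dataset card for the exact splits and columns.

```python
# Inspect the fine-tuning data (split/column names may differ; see the dataset card).
from datasets import load_dataset

ds = load_dataset("jjzha/fs1-2708", split="train")
print(ds)      # number of rows and column names
print(ds[0])   # one raw example
```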

## Evaluation Results

See the [paper](https://huggingface.co/papers/2505.11140) for detailed evaluation results.

## Citation

```bibtex
@misc{zhang2025scalingreasoningimprovefactuality,
      title={Scaling Reasoning can Improve Factuality in Large Language Models},
      author={Mike Zhang and Johannes Bjerva and Russa Biswas},
      year={2025},
      eprint={2505.11140},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.11140},
}
```

Code: https://github.com/jjzha/fs1