Commit bab5a34 · verified · Parent(s): c98d7ed

Update README.md

Files changed (1): README.md (+205 −147)
README.md CHANGED
 
---
tags:
- transformers
- trl
- unsloth
license: apache-2.0
datasets:
- Omartificial-Intelligence-Space/Arabic-gsm8k
language:
- ar
- en
---

# Omartificial-Intelligence-Space/gpt-oss-math-ar

An Arabic step-by-step math solver fine-tuned from **gpt-oss-20B** using **LoRA (PEFT)** on curated Arabic GSM8K-style problems. The model is instructed to reason **in Arabic** and to explain each solution step clearly before giving the final answer.

- **Base model:** `unsloth/gpt-oss-20b-unsloth-bnb-4bit`
- **Parameter-efficient fine-tuning:** LoRA (PEFT) via Unsloth + TRL SFT
- **Primary objective:** Arabic chain-of-thought-style arithmetic and word-problem reasoning (grade-school to early middle-school range)
- **License:** Apache-2.0
- **Maintainer:** **Omer Nacar** (Omartificial-Intelligence-Space)

---

# Model summary

- **Name:** `Omartificial-Intelligence-Space/gpt-oss-math-ar`
- **Size:** 20B (LoRA adapter on top of the base model)
- **Languages:** Arabic (primary), English (instructions/support)
- **Capabilities:** step-by-step solutions to math word problems, with intermediate calculations shown in Arabic and a concise final result line
- **Intended use:** educational assistance, practice solutions, and Arabic math-reasoning research

> ⚠️ **Note on reasoning:** The model is optimized to *write out* its reasoning steps in Arabic. For sensitive use cases (exams, grading, or high-stakes evaluation), always verify outputs.

---

# Example usage (Transformers + Unsloth)

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

max_seq_length = 1024
dtype = None  # auto-detect

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Omartificial-Intelligence-Space/gpt-oss-math-ar",
    dtype=dtype,
    max_seq_length=max_seq_length,
    load_in_4bit=True,
    full_finetuning=False,
)

# System prompt kept verbatim from training (including the "explian" typo),
# since it matches the training samples shown below.
messages = [
    {"role": "system", "content": "reasoning language: Arabic\n\nYou are an Arabic AI math questions solver that solves math problems step-by-step and explian in Arabic language only."},
    {"role": "user", "content": "بطات جانيت تضع 16 بيضة في اليوم. فهي تأكل ثلاث منها على الفطور كل صباح وتخبز الكعك لأصدقائها كل يوم بأربع منها. إنها تبيع ما تبقى منها في سوق المزارعين كل يوم مقابل دولارين لكل بيضة بطة طازجة. كم تجني من الدولار كل يوم في سوق المزارعين؟"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
    reasoning_effort="low",
).to(model.device)

streamer = TextStreamer(tokenizer)
_ = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.2,
    top_p=0.9,
    streamer=streamer,
)
```

**Prompting tip:** keep the **system** message exactly as above so the model stays in Arabic and explains its work step by step.

---

# Training data

- **Primary dataset (fine-tuning):** `Omartificial-Intelligence-Space/Arabic-gsm8k-v2`
  Curated Arabic word problems with gold step-by-step solutions.

- **Evaluation set (unseen):** `Omartificial-Intelligence-Space/Arabic-gsm8k` (`main_test` split)

### Sample training examples

```
[ { "content": "reasoning language: Arabic\n\nYou are an Arabic AI math questions solver that solves math problems step-by-step and explian in Arabic language only.", "role": "system" },
  { "content": "باعت نتاليا مشابك إلى 48 من أصدقائها في شهر أبريل، ثم باعت نصف ذلك العدد من المشابك في شهر مايو. كم عدد المشابك التي باعتها نتاليا في شهري أبريل ومايو معًا؟", "role": "user" },
  { "content": "باعت ناتاليا 48 ÷ 2 = 24 مشبكًا في مايو.\nباعت ناتاليا 48 + 24 = 72 مشبكًا في أبريل ومايو مجتمعين.\n#### 72", "role": "assistant" } ]
```

```
[ { "content": "reasoning language: Arabic\n\nYou are an Arabic AI math questions solver that solves math problems step-by-step and explian in Arabic language only.", "role": "system" },
  { "content": "تقرأ جولي كتاباً مؤلفاً من 120 صفحة. بالأمس، تمكنت من قراءة 12 صفحة، واليوم قرأت ضعف عدد الصفحات التي قرأتها بالأمس. إذا أرادت قراءة نصف ما تبقى من صفحات غداً، كم صفحة عليها أن تقرأ؟", "role": "user" },
  { "content": "قرأت مايلا اليوم 12 × 2 = 24 صفحة.\nلذلك فإن إجمالي ما قرأته منذ الأمس هو 12 + 24 = 36 صفحة.\nتبقى 120 - 36 = 84 صفحة ليتم قراءتها.\nوبما أنها تريد قراءة نصف الصفحات المتبقية غدًا، فيجب عليها أن تقرأ 84/2 = 42 صفحة.\n#### 42", "role": "assistant" } ]
```
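To inspect the data yourself, here is a minimal sketch; the `train` split name and the message-list schema are assumptions inferred from the samples above, so adjust them to the dataset's actual layout:

```python
from datasets import load_dataset

# Assumed split/column names; check the dataset viewer if these differ.
ds = load_dataset("Omartificial-Intelligence-Space/Arabic-gsm8k-v2", split="train")
example = ds[0]
print(example)  # expected: system/user/assistant messages like the samples above
```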

---

# Training procedure

**Frameworks:** Unsloth, Transformers, TRL (SFT)
**Method:** Supervised fine-tuning with LoRA adapters

### LoRA & optimization (Unsloth)
```python
# Wrap the base model with LoRA adapters (PEFT).
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,                      # LoRA rank
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",  # memory-efficient checkpointing
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)
```

### SFT configuration (TRL)
```python
from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    args = SFTConfig(
        per_device_train_batch_size = 16,
        gradient_accumulation_steps = 1,
        warmup_steps = 100,
        num_train_epochs = 3,
        learning_rate = 2e-4,
        logging_steps = 100,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",
    ),
)
```

**Hardware:** Colab A100 40 GB
**Seed:** 3407
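
With the trainer configured, launching the run and saving the adapter looks like this (a minimal sketch; the local output path is illustrative, not an official artifact):

```python
# Run supervised fine-tuning.
trainer_stats = trainer.train()

# Save only the LoRA adapter weights plus the tokenizer.
model.save_pretrained("gpt-oss-math-ar-lora")      # illustrative path
tokenizer.save_pretrained("gpt-oss-math-ar-lora")
```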

---

# Inference

You can load the published adapter directly (in 4-bit) and use the chat template:

```python
from unsloth import FastLanguageModel

model2, tokenizer2 = FastLanguageModel.from_pretrained(
    model_name="Omartificial-Intelligence-Space/gpt-oss-math-ar",
    dtype=None,            # auto-detect
    max_seq_length=1024,
    load_in_4bit=True,
    full_finetuning=False,
    # token="hf_..."       # if needed for gated bases
)
```

**Recommended generation settings (starting point; combined in the sketch below):**
- `max_new_tokens`: 128–384 for typical word problems
- `temperature`: 0.1–0.5 (lower for more deterministic math)
- `top_p`: 0.8–0.95
- `repetition_penalty`: ~1.05 (optional)
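
Putting those settings together, a sketch that reuses `model2`/`tokenizer2` from the loading snippet above (the Arabic problem is the one from the prompting guide below):

```python
messages = [
    {"role": "system", "content": "reasoning language: Arabic\n\nYou are an Arabic AI math questions solver that solves math problems step-by-step and explian in Arabic language only."},
    {"role": "user", "content": "لدى متجر 75 قطعة حلوى. باع 18 قطعة في الصباح و 23 في المساء. كم تبقى؟"},
]

inputs = tokenizer2.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model2.device)

outputs = model2.generate(
    **inputs,
    max_new_tokens=384,
    do_sample=True,
    temperature=0.2,
    top_p=0.9,
    repetition_penalty=1.05,
)

# Decode only the newly generated tokens.
print(tokenizer2.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```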

---

# Prompting guide (Arabic)

- Keep the **system** instruction fixed to enforce Arabic step-by-step reasoning.
- Provide one math word problem per turn.
- Expect answers in this shape:
  - short steps showing the operations
  - a final line like: `#### <النتيجة>`

**Example:**
```
[system] reasoning language: Arabic

You are an Arabic AI math questions solver that solves math problems step-by-step and explian in Arabic language only.
[user] لدى متجر 75 قطعة حلوى. باع 18 قطعة في الصباح و 23 في المساء. كم تبقى؟
```

---

# Evaluation

- **Unseen test set:** `Omartificial-Intelligence-Space/Arabic-gsm8k` (`main_test` split)
- **Current status:** qualitative checks on arithmetic and simple word problems; formal benchmark numbers can be added once computed.
- **Suggested protocol:** exact match on the final `#### <number>` line, plus an optional step-accuracy analysis of the intermediate calculations (see the sketch below).
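
A minimal sketch of that exact-match check (the regex and helpers are illustrative, not an official evaluation script):

```python
import re

def final_answer(text: str):
    """Extract the number on the final '#### <number>' line, if present."""
    matches = re.findall(r"####\s*([-\d.,،]+)", text)
    if not matches:
        return None
    # Normalize thousands separators (Latin and Arabic commas).
    return matches[-1].replace(",", "").replace("،", "").strip()

def exact_match(prediction: str, reference: str) -> bool:
    pred, ref = final_answer(prediction), final_answer(reference)
    return pred is not None and pred == ref

# Gold format as shown in the training samples:
gold = "باعت ناتاليا 48 + 24 = 72 مشبكًا.\n#### 72"
print(exact_match("...الخطوات...\n#### 72", gold))  # True
```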

---

# Intended use & limitations

**Intended use**
- Educational demos, tutoring aids, and research on Arabic mathematical reasoning.
- Generating step-by-step worked examples for practice problems.

**Limitations**
- May hallucinate or miscompute under distribution shift or in very long contexts.
- Not a substitute for professional instruction or grading.
- Arabic is the primary language; performance in other languages is not targeted.

**Safety & responsible use**
- Verify outputs before use in assessment settings.
- Avoid using the model to complete academic work where external assistance is prohibited.

---

# Model card contact & citation

**Author/Maintainer:** **Omer Nacar** (Omartificial-Intelligence-Space)
**Model page:** https://huggingface.co/Omartificial-Intelligence-Space/gpt-oss-math-ar

**Please cite:**
```
@misc{gpt_oss_math_ar_oi_space,
  title        = {gpt-oss-math-ar: Arabic Step-by-Step Math Reasoning Adapter for gpt-oss-20B},
  author       = {Omer Nacar},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/Omartificial-Intelligence-Space/gpt-oss-math-ar}}
}
```

Also cite the base model and tooling:
- Unsloth, TRL, and Hugging Face Transformers
- Base model: `unsloth/gpt-oss-20b-unsloth-bnb-4bit`
- Datasets: `Omartificial-Intelligence-Space/Arabic-gsm8k` and `Arabic-gsm8k-v2`

---

# License

This adapter is released under **Apache-2.0**. Users must also comply with the licenses and terms of the **base model** and any datasets used.

---

# Acknowledgements

Thanks to the Unsloth, TRL, and Transformers teams for making efficient adapter training straightforward, and to the contributors of Arabic GSM8K-style datasets that enabled high-quality supervised fine-tuning.

---

# Changelog

- Initial public release of `gpt-oss-math-ar` (adapter on gpt-oss-20B) with Arabic step-by-step math reasoning and example inference code.