Commit 313fbf2 by lbourdois (verified) · 1 parent: 18db99f

Improve language tag

Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that the README announces 29 languages but explicitly lists only 13, so I was only able to add those 13.
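As a rough illustration of the metadata change described above, the following Python sketch renders the `language:` block from the 13 codes that appear in the added lines of the diff. The helper name `render_language_front_matter` is hypothetical and not part of any Hugging Face tooling; the output indentation is an assumption, since the diff view does not preserve it.

```python
# The 13 language codes added to the README front matter in this commit,
# in the order they appear in the diff.
LANGUAGE_CODES = [
    "zho", "eng", "fra", "spa", "por", "deu", "ita",
    "rus", "jpn", "kor", "vie", "tha", "ara",
]


def render_language_front_matter(codes):
    """Render a YAML `language:` list (hypothetical helper, for illustration)."""
    lines = ["language:"]
    lines.extend(f"- {code}" for code in codes)
    return "\n".join(lines)


front_matter = render_language_front_matter(LANGUAGE_CODES)
print(front_matter)
```

The printed block can be compared by eye against the `+` lines of the front matter in the diff.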

Files changed (1)
  1. README.md +529 -515
README.md CHANGED
@@ -1,516 +1,530 @@
1
- ---
2
- base_model:
3
- - Qwen/Qwen2.5-Math-72B
4
- - Qwen/Qwen2.5-72B-Instruct
5
- library_name: transformers
6
- tags:
7
- - mergekit
8
- - merge
9
- license: other
10
- ---
11
-
12
- ## Qwen2.5-142B-Doubled72B-Math-Instruct (Mergekit-Merge) by Solshine (Caleb DeLeeuw)
13
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/654527ce2a13610acc25d921/FSWNkg4h9W329CiBYIuiC.png)
14
-
15
- # merge
16
-
17
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
18
-
19
- # License
20
-
21
- Hippocratic License 3.0 + Ecocide module, + Extractive Industries module, + Copyleft
22
- [![Hippocratic License HL3-CL-ECO-EXTR](https://img.shields.io/static/v1?label=Hippocratic%20License&message=HL3-CL-ECO-EXTR&labelColor=5e2751&color=bc8c3d)](https://firstdonoharm.dev/version/3/0/cl-eco-extr.html)
23
- https://firstdonoharm.dev/version/3/0/cl-eco-extr.txt
24
-
25
-
26
- ## Merge Details
27
- ### Merge Method
28
-
29
- This model was merged using the passthrough merge method. Every layer is doubled in order, from Qwen/Qwen2.5-72B-Instruct and Qwen/Qwen2.5-Math-72B, alternating which model is adding a layer and the MLP layers + 2 output layers only taken from the instruct model, creating 142B parameters. No additional fine-tune has been done in this merged model.
30
-
31
- ### Models Merged
32
-
33
- The following models were included in the merge:
34
- * [Qwen/Qwen2.5-Math-72B](https://huggingface.co/Qwen/Qwen2.5-Math-72B)
35
- * [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
36
-
37
- ### Configuration
38
-
39
- The following YAML configuration was used to produce this model:
40
-
41
- ```yaml
42
- slices:
43
- - sources:
44
- - model: Qwen/Qwen2.5-Math-72B
45
- layer_range: [0, 1]
46
- - sources:
47
- - model: Qwen/Qwen2.5-72B-Instruct
48
- layer_range: [0, 1]
49
- - sources:
50
- - model: Qwen/Qwen2.5-Math-72B
51
- layer_range: [1, 2]
52
- - sources:
53
- - model: Qwen/Qwen2.5-72B-Instruct
54
- layer_range: [1, 2]
55
- - sources:
56
- - model: Qwen/Qwen2.5-Math-72B
57
- layer_range: [2, 3]
58
- - sources:
59
- - model: Qwen/Qwen2.5-72B-Instruct
60
- layer_range: [2, 3]
61
- - sources:
62
- - model: Qwen/Qwen2.5-Math-72B
63
- layer_range: [3, 4]
64
- - sources:
65
- - model: Qwen/Qwen2.5-72B-Instruct
66
- layer_range: [3, 4]
67
- - sources:
68
- - model: Qwen/Qwen2.5-Math-72B
69
- layer_range: [4, 5]
70
- - sources:
71
- - model: Qwen/Qwen2.5-72B-Instruct
72
- layer_range: [4, 5]
73
- - sources:
74
- - model: Qwen/Qwen2.5-Math-72B
75
- layer_range: [5, 6]
76
- - sources:
77
- - model: Qwen/Qwen2.5-72B-Instruct
78
- layer_range: [5, 6]
79
- - sources:
80
- - model: Qwen/Qwen2.5-Math-72B
81
- layer_range: [6, 7]
82
- - sources:
83
- - model: Qwen/Qwen2.5-72B-Instruct
84
- layer_range: [6, 7]
85
- - sources:
86
- - model: Qwen/Qwen2.5-Math-72B
87
- layer_range: [7, 8]
88
- - sources:
89
- - model: Qwen/Qwen2.5-72B-Instruct
90
- layer_range: [7, 8]
91
- - sources:
92
- - model: Qwen/Qwen2.5-Math-72B
93
- layer_range: [8, 9]
94
- - sources:
95
- - model: Qwen/Qwen2.5-72B-Instruct
96
- layer_range: [8, 9]
97
- - sources:
98
- - model: Qwen/Qwen2.5-Math-72B
99
- layer_range: [9, 10]
100
- - sources:
101
- - model: Qwen/Qwen2.5-72B-Instruct
102
- layer_range: [9, 10]
103
- - sources:
104
- - model: Qwen/Qwen2.5-Math-72B
105
- layer_range: [10, 11]
106
- - sources:
107
- - model: Qwen/Qwen2.5-72B-Instruct
108
- layer_range: [10, 11]
109
- - sources:
110
- - model: Qwen/Qwen2.5-Math-72B
111
- layer_range: [11, 12]
112
- - sources:
113
- - model: Qwen/Qwen2.5-72B-Instruct
114
- layer_range: [11, 12]
115
- - sources:
116
- - model: Qwen/Qwen2.5-Math-72B
117
- layer_range: [12, 13]
118
- - sources:
119
- - model: Qwen/Qwen2.5-72B-Instruct
120
- layer_range: [12, 13]
121
- - sources:
122
- - model: Qwen/Qwen2.5-Math-72B
123
- layer_range: [13, 14]
124
- - sources:
125
- - model: Qwen/Qwen2.5-72B-Instruct
126
- layer_range: [13, 14]
127
- - sources:
128
- - model: Qwen/Qwen2.5-Math-72B
129
- layer_range: [14, 15]
130
- - sources:
131
- - model: Qwen/Qwen2.5-72B-Instruct
132
- layer_range: [14, 15]
133
- - sources:
134
- - model: Qwen/Qwen2.5-Math-72B
135
- layer_range: [15, 16]
136
- - sources:
137
- - model: Qwen/Qwen2.5-72B-Instruct
138
- layer_range: [15, 16]
139
- - sources:
140
- - model: Qwen/Qwen2.5-Math-72B
141
- layer_range: [16, 17]
142
- - sources:
143
- - model: Qwen/Qwen2.5-72B-Instruct
144
- layer_range: [16, 17]
145
- - sources:
146
- - model: Qwen/Qwen2.5-Math-72B
147
- layer_range: [17, 18]
148
- - sources:
149
- - model: Qwen/Qwen2.5-72B-Instruct
150
- layer_range: [17, 18]
151
- - sources:
152
- - model: Qwen/Qwen2.5-Math-72B
153
- layer_range: [18, 19]
154
- - sources:
155
- - model: Qwen/Qwen2.5-72B-Instruct
156
- layer_range: [18, 19]
157
- - sources:
158
- - model: Qwen/Qwen2.5-Math-72B
159
- layer_range: [19, 20]
160
- - sources:
161
- - model: Qwen/Qwen2.5-72B-Instruct
162
- layer_range: [19, 20]
163
- - sources:
164
- - model: Qwen/Qwen2.5-Math-72B
165
- layer_range: [20, 21]
166
- - sources:
167
- - model: Qwen/Qwen2.5-72B-Instruct
168
- layer_range: [20, 21]
169
- - sources:
170
- - model: Qwen/Qwen2.5-Math-72B
171
- layer_range: [21, 22]
172
- - sources:
173
- - model: Qwen/Qwen2.5-72B-Instruct
174
- layer_range: [21, 22]
175
- - sources:
176
- - model: Qwen/Qwen2.5-Math-72B
177
- layer_range: [22, 23]
178
- - sources:
179
- - model: Qwen/Qwen2.5-72B-Instruct
180
- layer_range: [22, 23]
181
- - sources:
182
- - model: Qwen/Qwen2.5-Math-72B
183
- layer_range: [23, 24]
184
- - sources:
185
- - model: Qwen/Qwen2.5-72B-Instruct
186
- layer_range: [23, 24]
187
- - sources:
188
- - model: Qwen/Qwen2.5-Math-72B
189
- layer_range: [24, 25]
190
- - sources:
191
- - model: Qwen/Qwen2.5-72B-Instruct
192
- layer_range: [24, 25]
193
- - sources:
194
- - model: Qwen/Qwen2.5-Math-72B
195
- layer_range: [25, 26]
196
- - sources:
197
- - model: Qwen/Qwen2.5-72B-Instruct
198
- layer_range: [25, 26]
199
- - sources:
200
- - model: Qwen/Qwen2.5-Math-72B
201
- layer_range: [26, 27]
202
- - sources:
203
- - model: Qwen/Qwen2.5-72B-Instruct
204
- layer_range: [26, 27]
205
- - sources:
206
- - model: Qwen/Qwen2.5-Math-72B
207
- layer_range: [27, 28]
208
- - sources:
209
- - model: Qwen/Qwen2.5-72B-Instruct
210
- layer_range: [27, 28]
211
- - sources:
212
- - model: Qwen/Qwen2.5-Math-72B
213
- layer_range: [28, 29]
214
- - sources:
215
- - model: Qwen/Qwen2.5-72B-Instruct
216
- layer_range: [28, 29]
217
- - sources:
218
- - model: Qwen/Qwen2.5-Math-72B
219
- layer_range: [29, 30]
220
- - sources:
221
- - model: Qwen/Qwen2.5-72B-Instruct
222
- layer_range: [29, 30]
223
- - sources:
224
- - model: Qwen/Qwen2.5-Math-72B
225
- layer_range: [30, 31]
226
- - sources:
227
- - model: Qwen/Qwen2.5-72B-Instruct
228
- layer_range: [30, 31]
229
- - sources:
230
- - model: Qwen/Qwen2.5-Math-72B
231
- layer_range: [31, 32]
232
- - sources:
233
- - model: Qwen/Qwen2.5-72B-Instruct
234
- layer_range: [31, 32]
235
- - sources:
236
- - model: Qwen/Qwen2.5-Math-72B
237
- layer_range: [32, 33]
238
- - sources:
239
- - model: Qwen/Qwen2.5-72B-Instruct
240
- layer_range: [32, 33]
241
- - sources:
242
- - model: Qwen/Qwen2.5-Math-72B
243
- layer_range: [33, 34]
244
- - sources:
245
- - model: Qwen/Qwen2.5-72B-Instruct
246
- layer_range: [33, 34]
247
- - sources:
248
- - model: Qwen/Qwen2.5-Math-72B
249
- layer_range: [34, 35]
250
- - sources:
251
- - model: Qwen/Qwen2.5-72B-Instruct
252
- layer_range: [34, 35]
253
- - sources:
254
- - model: Qwen/Qwen2.5-Math-72B
255
- layer_range: [35, 36]
256
- - sources:
257
- - model: Qwen/Qwen2.5-72B-Instruct
258
- layer_range: [35, 36]
259
- - sources:
260
- - model: Qwen/Qwen2.5-Math-72B
261
- layer_range: [36, 37]
262
- - sources:
263
- - model: Qwen/Qwen2.5-72B-Instruct
264
- layer_range: [36, 37]
265
- - sources:
266
- - model: Qwen/Qwen2.5-Math-72B
267
- layer_range: [37, 38]
268
- - sources:
269
- - model: Qwen/Qwen2.5-72B-Instruct
270
- layer_range: [37, 38]
271
- - sources:
272
- - model: Qwen/Qwen2.5-Math-72B
273
- layer_range: [38, 39]
274
- - sources:
275
- - model: Qwen/Qwen2.5-72B-Instruct
276
- layer_range: [38, 39]
277
- - sources:
278
- - model: Qwen/Qwen2.5-Math-72B
279
- layer_range: [39, 40]
280
- - sources:
281
- - model: Qwen/Qwen2.5-72B-Instruct
282
- layer_range: [39, 40]
283
- - sources:
284
- - model: Qwen/Qwen2.5-Math-72B
285
- layer_range: [40, 41]
286
- - sources:
287
- - model: Qwen/Qwen2.5-72B-Instruct
288
- layer_range: [40, 41]
289
- - sources:
290
- - model: Qwen/Qwen2.5-Math-72B
291
- layer_range: [41, 42]
292
- - sources:
293
- - model: Qwen/Qwen2.5-72B-Instruct
294
- layer_range: [41, 42]
295
- - sources:
296
- - model: Qwen/Qwen2.5-Math-72B
297
- layer_range: [42, 43]
298
- - sources:
299
- - model: Qwen/Qwen2.5-72B-Instruct
300
- layer_range: [42, 43]
301
- - sources:
302
- - model: Qwen/Qwen2.5-Math-72B
303
- layer_range: [43, 44]
304
- - sources:
305
- - model: Qwen/Qwen2.5-72B-Instruct
306
- layer_range: [43, 44]
307
- - sources:
308
- - model: Qwen/Qwen2.5-Math-72B
309
- layer_range: [44, 45]
310
- - sources:
311
- - model: Qwen/Qwen2.5-72B-Instruct
312
- layer_range: [44, 45]
313
- - sources:
314
- - model: Qwen/Qwen2.5-Math-72B
315
- layer_range: [45, 46]
316
- - sources:
317
- - model: Qwen/Qwen2.5-72B-Instruct
318
- layer_range: [45, 46]
319
- - sources:
320
- - model: Qwen/Qwen2.5-Math-72B
321
- layer_range: [46, 47]
322
- - sources:
323
- - model: Qwen/Qwen2.5-72B-Instruct
324
- layer_range: [46, 47]
325
- - sources:
326
- - model: Qwen/Qwen2.5-Math-72B
327
- layer_range: [47, 48]
328
- - sources:
329
- - model: Qwen/Qwen2.5-72B-Instruct
330
- layer_range: [47, 48]
331
- - sources:
332
- - model: Qwen/Qwen2.5-Math-72B
333
- layer_range: [48, 49]
334
- - sources:
335
- - model: Qwen/Qwen2.5-72B-Instruct
336
- layer_range: [48, 49]
337
- - sources:
338
- - model: Qwen/Qwen2.5-Math-72B
339
- layer_range: [49, 50]
340
- - sources:
341
- - model: Qwen/Qwen2.5-72B-Instruct
342
- layer_range: [49, 50]
343
- - sources:
344
- - model: Qwen/Qwen2.5-Math-72B
345
- layer_range: [50, 51]
346
- - sources:
347
- - model: Qwen/Qwen2.5-72B-Instruct
348
- layer_range: [50, 51]
349
- - sources:
350
- - model: Qwen/Qwen2.5-Math-72B
351
- layer_range: [51, 52]
352
- - sources:
353
- - model: Qwen/Qwen2.5-72B-Instruct
354
- layer_range: [51, 52]
355
- - sources:
356
- - model: Qwen/Qwen2.5-Math-72B
357
- layer_range: [52, 53]
358
- - sources:
359
- - model: Qwen/Qwen2.5-72B-Instruct
360
- layer_range: [52, 53]
361
- - sources:
362
- - model: Qwen/Qwen2.5-Math-72B
363
- layer_range: [53, 54]
364
- - sources:
365
- - model: Qwen/Qwen2.5-72B-Instruct
366
- layer_range: [53, 54]
367
- - sources:
368
- - model: Qwen/Qwen2.5-Math-72B
369
- layer_range: [54, 55]
370
- - sources:
371
- - model: Qwen/Qwen2.5-72B-Instruct
372
- layer_range: [54, 55]
373
- - sources:
374
- - model: Qwen/Qwen2.5-Math-72B
375
- layer_range: [55, 56]
376
- - sources:
377
- - model: Qwen/Qwen2.5-72B-Instruct
378
- layer_range: [55, 56]
379
- - sources:
380
- - model: Qwen/Qwen2.5-Math-72B
381
- layer_range: [56, 57]
382
- - sources:
383
- - model: Qwen/Qwen2.5-72B-Instruct
384
- layer_range: [56, 57]
385
- - sources:
386
- - model: Qwen/Qwen2.5-Math-72B
387
- layer_range: [57, 58]
388
- - sources:
389
- - model: Qwen/Qwen2.5-72B-Instruct
390
- layer_range: [57, 58]
391
- - sources:
392
- - model: Qwen/Qwen2.5-Math-72B
393
- layer_range: [58, 59]
394
- - sources:
395
- - model: Qwen/Qwen2.5-72B-Instruct
396
- layer_range: [58, 59]
397
- - sources:
398
- - model: Qwen/Qwen2.5-Math-72B
399
- layer_range: [59, 60]
400
- - sources:
401
- - model: Qwen/Qwen2.5-72B-Instruct
402
- layer_range: [59, 60]
403
- - sources:
404
- - model: Qwen/Qwen2.5-Math-72B
405
- layer_range: [60, 61]
406
- - sources:
407
- - model: Qwen/Qwen2.5-72B-Instruct
408
- layer_range: [60, 61]
409
- - sources:
410
- - model: Qwen/Qwen2.5-Math-72B
411
- layer_range: [61, 62]
412
- - sources:
413
- - model: Qwen/Qwen2.5-72B-Instruct
414
- layer_range: [61, 62]
415
- - sources:
416
- - model: Qwen/Qwen2.5-Math-72B
417
- layer_range: [62, 63]
418
- - sources:
419
- - model: Qwen/Qwen2.5-72B-Instruct
420
- layer_range: [62, 63]
421
- - sources:
422
- - model: Qwen/Qwen2.5-Math-72B
423
- layer_range: [63, 64]
424
- - sources:
425
- - model: Qwen/Qwen2.5-72B-Instruct
426
- layer_range: [63, 64]
427
- - sources:
428
- - model: Qwen/Qwen2.5-Math-72B
429
- layer_range: [64, 65]
430
- - sources:
431
- - model: Qwen/Qwen2.5-72B-Instruct
432
- layer_range: [64, 65]
433
- - sources:
434
- - model: Qwen/Qwen2.5-Math-72B
435
- layer_range: [65, 66]
436
- - sources:
437
- - model: Qwen/Qwen2.5-72B-Instruct
438
- layer_range: [65, 66]
439
- - sources:
440
- - model: Qwen/Qwen2.5-Math-72B
441
- layer_range: [66, 67]
442
- - sources:
443
- - model: Qwen/Qwen2.5-72B-Instruct
444
- layer_range: [66, 67]
445
- - sources:
446
- - model: Qwen/Qwen2.5-Math-72B
447
- layer_range: [67, 68]
448
- - sources:
449
- - model: Qwen/Qwen2.5-72B-Instruct
450
- layer_range: [67, 68]
451
- - sources:
452
- - model: Qwen/Qwen2.5-Math-72B
453
- layer_range: [68, 69]
454
- - sources:
455
- - model: Qwen/Qwen2.5-72B-Instruct
456
- layer_range: [68, 69]
457
- - sources:
458
- - model: Qwen/Qwen2.5-Math-72B
459
- layer_range: [69, 70]
460
- - sources:
461
- - model: Qwen/Qwen2.5-72B-Instruct
462
- layer_range: [69, 70]
463
- - sources:
464
- - model: Qwen/Qwen2.5-Math-72B
465
- layer_range: [70, 71]
466
- - sources:
467
- - model: Qwen/Qwen2.5-72B-Instruct
468
- layer_range: [70, 71]
469
- - sources:
470
- - model: Qwen/Qwen2.5-Math-72B
471
- layer_range: [71, 72]
472
- - sources:
473
- - model: Qwen/Qwen2.5-72B-Instruct
474
- layer_range: [71, 72]
475
- - sources:
476
- - model: Qwen/Qwen2.5-Math-72B
477
- layer_range: [72, 73]
478
- - sources:
479
- - model: Qwen/Qwen2.5-72B-Instruct
480
- layer_range: [72, 73]
481
- - sources:
482
- - model: Qwen/Qwen2.5-Math-72B
483
- layer_range: [73, 74]
484
- - sources:
485
- - model: Qwen/Qwen2.5-72B-Instruct
486
- layer_range: [73, 74]
487
- - sources:
488
- - model: Qwen/Qwen2.5-Math-72B
489
- layer_range: [74, 75]
490
- - sources:
491
- - model: Qwen/Qwen2.5-72B-Instruct
492
- layer_range: [74, 75]
493
- - sources:
494
- - model: Qwen/Qwen2.5-Math-72B
495
- layer_range: [75, 76]
496
- - sources:
497
- - model: Qwen/Qwen2.5-72B-Instruct
498
- layer_range: [75, 76]
499
- - sources:
500
- - model: Qwen/Qwen2.5-Math-72B
501
- layer_range: [76, 77]
502
- - sources:
503
- - model: Qwen/Qwen2.5-72B-Instruct
504
- layer_range: [76, 77]
505
- - sources:
506
- - model: Qwen/Qwen2.5-Math-72B
507
- layer_range: [77, 78]
508
- - sources:
509
- - model: Qwen/Qwen2.5-72B-Instruct
510
- layer_range: [77, 78]
511
- - sources:
512
- - model: Qwen/Qwen2.5-72B-Instruct
513
- layer_range: [77, 80]
514
- merge_method: passthrough
515
- dtype: float16
516
  ```
 
1
+ ---
2
+ base_model:
3
+ - Qwen/Qwen2.5-Math-72B
4
+ - Qwen/Qwen2.5-72B-Instruct
5
+ library_name: transformers
6
+ tags:
7
+ - mergekit
8
+ - merge
9
+ license: other
10
+ language:
11
+ - zho
12
+ - eng
13
+ - fra
14
+ - spa
15
+ - por
16
+ - deu
17
+ - ita
18
+ - rus
19
+ - jpn
20
+ - kor
21
+ - vie
22
+ - tha
23
+ - ara
24
+ ---
25
+
26
+ ## Qwen2.5-142B-Doubled72B-Math-Instruct (Mergekit-Merge) by Solshine (Caleb DeLeeuw)
27
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/654527ce2a13610acc25d921/FSWNkg4h9W329CiBYIuiC.png)
28
+
29
+ # merge
30
+
31
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
32
+
33
+ # License
34
+
35
+ Hippocratic License 3.0 + Ecocide module, + Extractive Industries module, + Copyleft
36
+ [![Hippocratic License HL3-CL-ECO-EXTR](https://img.shields.io/static/v1?label=Hippocratic%20License&message=HL3-CL-ECO-EXTR&labelColor=5e2751&color=bc8c3d)](https://firstdonoharm.dev/version/3/0/cl-eco-extr.html)
37
+ https://firstdonoharm.dev/version/3/0/cl-eco-extr.txt
38
+
39
+
40
+ ## Merge Details
41
+ ### Merge Method
42
+
43
+ This model was merged using the passthrough merge method. Every layer is doubled in order, from Qwen/Qwen2.5-72B-Instruct and Qwen/Qwen2.5-Math-72B, alternating which model is adding a layer and the MLP layers + 2 output layers only taken from the instruct model, creating 142B parameters. No additional fine-tune has been done in this merged model.
44
+
45
+ ### Models Merged
46
+
47
+ The following models were included in the merge:
48
+ * [Qwen/Qwen2.5-Math-72B](https://huggingface.co/Qwen/Qwen2.5-Math-72B)
49
+ * [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
50
+
51
+ ### Configuration
52
+
53
+ The following YAML configuration was used to produce this model:
54
+
55
+ ```yaml
56
+ slices:
57
+ - sources:
58
+ - model: Qwen/Qwen2.5-Math-72B
59
+ layer_range: [0, 1]
60
+ - sources:
61
+ - model: Qwen/Qwen2.5-72B-Instruct
62
+ layer_range: [0, 1]
63
+ - sources:
64
+ - model: Qwen/Qwen2.5-Math-72B
65
+ layer_range: [1, 2]
66
+ - sources:
67
+ - model: Qwen/Qwen2.5-72B-Instruct
68
+ layer_range: [1, 2]
69
+ - sources:
70
+ - model: Qwen/Qwen2.5-Math-72B
71
+ layer_range: [2, 3]
72
+ - sources:
73
+ - model: Qwen/Qwen2.5-72B-Instruct
74
+ layer_range: [2, 3]
75
+ - sources:
76
+ - model: Qwen/Qwen2.5-Math-72B
77
+ layer_range: [3, 4]
78
+ - sources:
79
+ - model: Qwen/Qwen2.5-72B-Instruct
80
+ layer_range: [3, 4]
81
+ - sources:
82
+ - model: Qwen/Qwen2.5-Math-72B
83
+ layer_range: [4, 5]
84
+ - sources:
85
+ - model: Qwen/Qwen2.5-72B-Instruct
86
+ layer_range: [4, 5]
87
+ - sources:
88
+ - model: Qwen/Qwen2.5-Math-72B
89
+ layer_range: [5, 6]
90
+ - sources:
91
+ - model: Qwen/Qwen2.5-72B-Instruct
92
+ layer_range: [5, 6]
93
+ - sources:
94
+ - model: Qwen/Qwen2.5-Math-72B
95
+ layer_range: [6, 7]
96
+ - sources:
97
+ - model: Qwen/Qwen2.5-72B-Instruct
98
+ layer_range: [6, 7]
99
+ - sources:
100
+ - model: Qwen/Qwen2.5-Math-72B
101
+ layer_range: [7, 8]
102
+ - sources:
103
+ - model: Qwen/Qwen2.5-72B-Instruct
104
+ layer_range: [7, 8]
105
+ - sources:
106
+ - model: Qwen/Qwen2.5-Math-72B
107
+ layer_range: [8, 9]
108
+ - sources:
109
+ - model: Qwen/Qwen2.5-72B-Instruct
110
+ layer_range: [8, 9]
111
+ - sources:
112
+ - model: Qwen/Qwen2.5-Math-72B
113
+ layer_range: [9, 10]
114
+ - sources:
115
+ - model: Qwen/Qwen2.5-72B-Instruct
116
+ layer_range: [9, 10]
117
+ - sources:
118
+ - model: Qwen/Qwen2.5-Math-72B
119
+ layer_range: [10, 11]
120
+ - sources:
121
+ - model: Qwen/Qwen2.5-72B-Instruct
122
+ layer_range: [10, 11]
123
+ - sources:
124
+ - model: Qwen/Qwen2.5-Math-72B
125
+ layer_range: [11, 12]
126
+ - sources:
127
+ - model: Qwen/Qwen2.5-72B-Instruct
128
+ layer_range: [11, 12]
129
+ - sources:
130
+ - model: Qwen/Qwen2.5-Math-72B
131
+ layer_range: [12, 13]
132
+ - sources:
133
+ - model: Qwen/Qwen2.5-72B-Instruct
134
+ layer_range: [12, 13]
135
+ - sources:
136
+ - model: Qwen/Qwen2.5-Math-72B
137
+ layer_range: [13, 14]
138
+ - sources:
139
+ - model: Qwen/Qwen2.5-72B-Instruct
140
+ layer_range: [13, 14]
141
+ - sources:
142
+ - model: Qwen/Qwen2.5-Math-72B
143
+ layer_range: [14, 15]
144
+ - sources:
145
+ - model: Qwen/Qwen2.5-72B-Instruct
146
+ layer_range: [14, 15]
147
+ - sources:
148
+ - model: Qwen/Qwen2.5-Math-72B
149
+ layer_range: [15, 16]
150
+ - sources:
151
+ - model: Qwen/Qwen2.5-72B-Instruct
152
+ layer_range: [15, 16]
153
+ - sources:
154
+ - model: Qwen/Qwen2.5-Math-72B
155
+ layer_range: [16, 17]
156
+ - sources:
157
+ - model: Qwen/Qwen2.5-72B-Instruct
158
+ layer_range: [16, 17]
159
+ - sources:
160
+ - model: Qwen/Qwen2.5-Math-72B
161
+ layer_range: [17, 18]
162
+ - sources:
163
+ - model: Qwen/Qwen2.5-72B-Instruct
164
+ layer_range: [17, 18]
165
+ - sources:
166
+ - model: Qwen/Qwen2.5-Math-72B
167
+ layer_range: [18, 19]
168
+ - sources:
169
+ - model: Qwen/Qwen2.5-72B-Instruct
170
+ layer_range: [18, 19]
171
+ - sources:
172
+ - model: Qwen/Qwen2.5-Math-72B
173
+ layer_range: [19, 20]
174
+ - sources:
175
+ - model: Qwen/Qwen2.5-72B-Instruct
176
+ layer_range: [19, 20]
177
+ - sources:
178
+ - model: Qwen/Qwen2.5-Math-72B
179
+ layer_range: [20, 21]
180
+ - sources:
181
+ - model: Qwen/Qwen2.5-72B-Instruct
182
+ layer_range: [20, 21]
183
+ - sources:
184
+ - model: Qwen/Qwen2.5-Math-72B
185
+ layer_range: [21, 22]
186
+ - sources:
187
+ - model: Qwen/Qwen2.5-72B-Instruct
188
+ layer_range: [21, 22]
189
+ - sources:
190
+ - model: Qwen/Qwen2.5-Math-72B
191
+ layer_range: [22, 23]
192
+ - sources:
193
+ - model: Qwen/Qwen2.5-72B-Instruct
194
+ layer_range: [22, 23]
195
+ - sources:
196
+ - model: Qwen/Qwen2.5-Math-72B
197
+ layer_range: [23, 24]
198
+ - sources:
199
+ - model: Qwen/Qwen2.5-72B-Instruct
200
+ layer_range: [23, 24]
201
+ - sources:
202
+ - model: Qwen/Qwen2.5-Math-72B
203
+ layer_range: [24, 25]
204
+ - sources:
205
+ - model: Qwen/Qwen2.5-72B-Instruct
206
+ layer_range: [24, 25]
207
+ - sources:
208
+ - model: Qwen/Qwen2.5-Math-72B
209
+ layer_range: [25, 26]
210
+ - sources:
211
+ - model: Qwen/Qwen2.5-72B-Instruct
212
+ layer_range: [25, 26]
213
+ - sources:
214
+ - model: Qwen/Qwen2.5-Math-72B
215
+ layer_range: [26, 27]
216
+ - sources:
217
+ - model: Qwen/Qwen2.5-72B-Instruct
218
+ layer_range: [26, 27]
219
+ - sources:
220
+ - model: Qwen/Qwen2.5-Math-72B
221
+ layer_range: [27, 28]
222
+ - sources:
223
+ - model: Qwen/Qwen2.5-72B-Instruct
224
+ layer_range: [27, 28]
225
+ - sources:
226
+ - model: Qwen/Qwen2.5-Math-72B
227
+ layer_range: [28, 29]
228
+ - sources:
229
+ - model: Qwen/Qwen2.5-72B-Instruct
230
+ layer_range: [28, 29]
231
+ - sources:
232
+ - model: Qwen/Qwen2.5-Math-72B
233
+ layer_range: [29, 30]
234
+ - sources:
235
+ - model: Qwen/Qwen2.5-72B-Instruct
236
+ layer_range: [29, 30]
237
+ - sources:
238
+ - model: Qwen/Qwen2.5-Math-72B
239
+ layer_range: [30, 31]
240
+ - sources:
241
+ - model: Qwen/Qwen2.5-72B-Instruct
242
+ layer_range: [30, 31]
243
+ - sources:
244
+ - model: Qwen/Qwen2.5-Math-72B
245
+ layer_range: [31, 32]
246
+ - sources:
247
+ - model: Qwen/Qwen2.5-72B-Instruct
248
+ layer_range: [31, 32]
249
+ - sources:
250
+ - model: Qwen/Qwen2.5-Math-72B
251
+ layer_range: [32, 33]
252
+ - sources:
253
+ - model: Qwen/Qwen2.5-72B-Instruct
254
+ layer_range: [32, 33]
255
+ - sources:
256
+ - model: Qwen/Qwen2.5-Math-72B
257
+ layer_range: [33, 34]
258
+ - sources:
259
+ - model: Qwen/Qwen2.5-72B-Instruct
260
+ layer_range: [33, 34]
261
+ - sources:
262
+ - model: Qwen/Qwen2.5-Math-72B
263
+ layer_range: [34, 35]
264
+ - sources:
265
+ - model: Qwen/Qwen2.5-72B-Instruct
266
+ layer_range: [34, 35]
267
+ - sources:
268
+ - model: Qwen/Qwen2.5-Math-72B
269
+ layer_range: [35, 36]
270
+ - sources:
271
+ - model: Qwen/Qwen2.5-72B-Instruct
272
+ layer_range: [35, 36]
273
+ - sources:
274
+ - model: Qwen/Qwen2.5-Math-72B
275
+ layer_range: [36, 37]
276
+ - sources:
277
+ - model: Qwen/Qwen2.5-72B-Instruct
278
+ layer_range: [36, 37]
279
+ - sources:
280
+ - model: Qwen/Qwen2.5-Math-72B
281
+ layer_range: [37, 38]
282
+ - sources:
283
+ - model: Qwen/Qwen2.5-72B-Instruct
284
+ layer_range: [37, 38]
285
+ - sources:
286
+ - model: Qwen/Qwen2.5-Math-72B
287
+ layer_range: [38, 39]
288
+ - sources:
289
+ - model: Qwen/Qwen2.5-72B-Instruct
290
+ layer_range: [38, 39]
291
+ - sources:
292
+ - model: Qwen/Qwen2.5-Math-72B
293
+ layer_range: [39, 40]
294
+ - sources:
295
+ - model: Qwen/Qwen2.5-72B-Instruct
296
+ layer_range: [39, 40]
297
+ - sources:
298
+ - model: Qwen/Qwen2.5-Math-72B
299
+ layer_range: [40, 41]
300
+ - sources:
301
+ - model: Qwen/Qwen2.5-72B-Instruct
302
+ layer_range: [40, 41]
303
+ - sources:
304
+ - model: Qwen/Qwen2.5-Math-72B
305
+ layer_range: [41, 42]
306
+ - sources:
307
+ - model: Qwen/Qwen2.5-72B-Instruct
308
+ layer_range: [41, 42]
309
+ - sources:
310
+ - model: Qwen/Qwen2.5-Math-72B
311
+ layer_range: [42, 43]
312
+ - sources:
313
+ - model: Qwen/Qwen2.5-72B-Instruct
314
+ layer_range: [42, 43]
315
+ - sources:
316
+ - model: Qwen/Qwen2.5-Math-72B
317
+ layer_range: [43, 44]
318
+ - sources:
319
+ - model: Qwen/Qwen2.5-72B-Instruct
320
+ layer_range: [43, 44]
321
+ - sources:
322
+ - model: Qwen/Qwen2.5-Math-72B
323
+ layer_range: [44, 45]
324
+ - sources:
325
+ - model: Qwen/Qwen2.5-72B-Instruct
326
+ layer_range: [44, 45]
327
+ - sources:
328
+ - model: Qwen/Qwen2.5-Math-72B
329
+ layer_range: [45, 46]
330
+ - sources:
331
+ - model: Qwen/Qwen2.5-72B-Instruct
332
+ layer_range: [45, 46]
333
+ - sources:
334
+ - model: Qwen/Qwen2.5-Math-72B
335
+ layer_range: [46, 47]
336
+ - sources:
337
+ - model: Qwen/Qwen2.5-72B-Instruct
338
+ layer_range: [46, 47]
339
+ - sources:
340
+ - model: Qwen/Qwen2.5-Math-72B
341
+ layer_range: [47, 48]
342
+ - sources:
343
+ - model: Qwen/Qwen2.5-72B-Instruct
344
+ layer_range: [47, 48]
345
+ - sources:
346
+ - model: Qwen/Qwen2.5-Math-72B
347
+ layer_range: [48, 49]
348
+ - sources:
349
+ - model: Qwen/Qwen2.5-72B-Instruct
350
+ layer_range: [48, 49]
351
+ - sources:
352
+ - model: Qwen/Qwen2.5-Math-72B
353
+ layer_range: [49, 50]
354
+ - sources:
355
+ - model: Qwen/Qwen2.5-72B-Instruct
356
+ layer_range: [49, 50]
357
+ - sources:
358
+ - model: Qwen/Qwen2.5-Math-72B
359
+ layer_range: [50, 51]
360
+ - sources:
361
+ - model: Qwen/Qwen2.5-72B-Instruct
362
+ layer_range: [50, 51]
363
+ - sources:
364
+ - model: Qwen/Qwen2.5-Math-72B
365
+ layer_range: [51, 52]
366
+ - sources:
367
+ - model: Qwen/Qwen2.5-72B-Instruct
368
+ layer_range: [51, 52]
369
+ - sources:
370
+ - model: Qwen/Qwen2.5-Math-72B
371
+ layer_range: [52, 53]
372
+ - sources:
373
+ - model: Qwen/Qwen2.5-72B-Instruct
374
+ layer_range: [52, 53]
375
+ - sources:
376
+ - model: Qwen/Qwen2.5-Math-72B
377
+ layer_range: [53, 54]
378
+ - sources:
379
+ - model: Qwen/Qwen2.5-72B-Instruct
380
+ layer_range: [53, 54]
381
+ - sources:
382
+ - model: Qwen/Qwen2.5-Math-72B
383
+ layer_range: [54, 55]
384
+ - sources:
385
+ - model: Qwen/Qwen2.5-72B-Instruct
386
+ layer_range: [54, 55]
387
+ - sources:
388
+ - model: Qwen/Qwen2.5-Math-72B
389
+ layer_range: [55, 56]
390
+ - sources:
391
+ - model: Qwen/Qwen2.5-72B-Instruct
392
+ layer_range: [55, 56]
393
+ - sources:
394
+ - model: Qwen/Qwen2.5-Math-72B
395
+ layer_range: [56, 57]
396
+ - sources:
397
+ - model: Qwen/Qwen2.5-72B-Instruct
398
+ layer_range: [56, 57]
399
+ - sources:
400
+ - model: Qwen/Qwen2.5-Math-72B
401
+ layer_range: [57, 58]
402
+ - sources:
403
+ - model: Qwen/Qwen2.5-72B-Instruct
404
+ layer_range: [57, 58]
405
+ - sources:
406
+ - model: Qwen/Qwen2.5-Math-72B
407
+ layer_range: [58, 59]
408
+ - sources:
409
+ - model: Qwen/Qwen2.5-72B-Instruct
410
+ layer_range: [58, 59]
411
+ - sources:
412
+ - model: Qwen/Qwen2.5-Math-72B
413
+ layer_range: [59, 60]
414
+ - sources:
415
+ - model: Qwen/Qwen2.5-72B-Instruct
416
+ layer_range: [59, 60]
417
+ - sources:
418
+ - model: Qwen/Qwen2.5-Math-72B
419
+ layer_range: [60, 61]
420
+ - sources:
421
+ - model: Qwen/Qwen2.5-72B-Instruct
422
+ layer_range: [60, 61]
423
+ - sources:
424
+ - model: Qwen/Qwen2.5-Math-72B
425
+ layer_range: [61, 62]
426
+ - sources:
427
+ - model: Qwen/Qwen2.5-72B-Instruct
428
+ layer_range: [61, 62]
429
+ - sources:
430
+ - model: Qwen/Qwen2.5-Math-72B
431
+ layer_range: [62, 63]
432
+ - sources:
433
+ - model: Qwen/Qwen2.5-72B-Instruct
434
+ layer_range: [62, 63]
435
+ - sources:
436
+ - model: Qwen/Qwen2.5-Math-72B
437
+ layer_range: [63, 64]
438
+ - sources:
439
+ - model: Qwen/Qwen2.5-72B-Instruct
440
+ layer_range: [63, 64]
441
+ - sources:
442
+ - model: Qwen/Qwen2.5-Math-72B
443
+ layer_range: [64, 65]
444
+ - sources:
445
+ - model: Qwen/Qwen2.5-72B-Instruct
446
+ layer_range: [64, 65]
447
+ - sources:
448
+ - model: Qwen/Qwen2.5-Math-72B
449
+ layer_range: [65, 66]
450
+ - sources:
451
+ - model: Qwen/Qwen2.5-72B-Instruct
452
+ layer_range: [65, 66]
453
+ - sources:
454
+ - model: Qwen/Qwen2.5-Math-72B
455
+ layer_range: [66, 67]
456
+ - sources:
457
+ - model: Qwen/Qwen2.5-72B-Instruct
458
+ layer_range: [66, 67]
459
+ - sources:
460
+ - model: Qwen/Qwen2.5-Math-72B
461
+ layer_range: [67, 68]
462
+ - sources:
463
+ - model: Qwen/Qwen2.5-72B-Instruct
464
+ layer_range: [67, 68]
465
+ - sources:
466
+ - model: Qwen/Qwen2.5-Math-72B
467
+ layer_range: [68, 69]
468
+ - sources:
469
+ - model: Qwen/Qwen2.5-72B-Instruct
470
+ layer_range: [68, 69]
471
+ - sources:
472
+ - model: Qwen/Qwen2.5-Math-72B
473
+ layer_range: [69, 70]
474
+ - sources:
475
+ - model: Qwen/Qwen2.5-72B-Instruct
476
+ layer_range: [69, 70]
477
+ - sources:
478
+ - model: Qwen/Qwen2.5-Math-72B
479
+ layer_range: [70, 71]
480
+ - sources:
481
+ - model: Qwen/Qwen2.5-72B-Instruct
482
+ layer_range: [70, 71]
483
+ - sources:
484
+ - model: Qwen/Qwen2.5-Math-72B
485
+ layer_range: [71, 72]
486
+ - sources:
487
+ - model: Qwen/Qwen2.5-72B-Instruct
488
+ layer_range: [71, 72]
489
+ - sources:
490
+ - model: Qwen/Qwen2.5-Math-72B
491
+ layer_range: [72, 73]
492
+ - sources:
493
+ - model: Qwen/Qwen2.5-72B-Instruct
494
+ layer_range: [72, 73]
495
+ - sources:
496
+ - model: Qwen/Qwen2.5-Math-72B
497
+ layer_range: [73, 74]
498
+ - sources:
499
+ - model: Qwen/Qwen2.5-72B-Instruct
500
+ layer_range: [73, 74]
501
+ - sources:
502
+ - model: Qwen/Qwen2.5-Math-72B
503
+ layer_range: [74, 75]
504
+ - sources:
505
+ - model: Qwen/Qwen2.5-72B-Instruct
506
+ layer_range: [74, 75]
507
+ - sources:
508
+ - model: Qwen/Qwen2.5-Math-72B
509
+ layer_range: [75, 76]
510
+ - sources:
511
+ - model: Qwen/Qwen2.5-72B-Instruct
512
+ layer_range: [75, 76]
513
+ - sources:
514
+ - model: Qwen/Qwen2.5-Math-72B
515
+ layer_range: [76, 77]
516
+ - sources:
517
+ - model: Qwen/Qwen2.5-72B-Instruct
518
+ layer_range: [76, 77]
519
+ - sources:
520
+ - model: Qwen/Qwen2.5-Math-72B
521
+ layer_range: [77, 78]
522
+ - sources:
523
+ - model: Qwen/Qwen2.5-72B-Instruct
524
+ layer_range: [77, 78]
525
+ - sources:
526
+ - model: Qwen/Qwen2.5-72B-Instruct
527
+ layer_range: [77, 80]
528
+ merge_method: passthrough
529
+ dtype: float16
530
  ```
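
The slice list in the configuration above follows a regular pattern, so it can be regenerated rather than written by hand. The sketch below is not the author's script; it is a minimal reconstruction, assuming the pattern visible in the diff (one layer from `Qwen/Qwen2.5-Math-72B`, then the same layer index from `Qwen/Qwen2.5-72B-Instruct`, for indices 0 through 77, followed by a final Instruct slice `[77, 80]`). The function names and the YAML indentation are assumptions, since the diff view flattens the original indentation.

```python
MATH = "Qwen/Qwen2.5-Math-72B"
INSTRUCT = "Qwen/Qwen2.5-72B-Instruct"


def build_slices(n_interleaved=78, tail=(77, 80)):
    """Return (model, start, end) tuples in the order listed in the config."""
    slices = []
    for i in range(n_interleaved):
        # One layer from the Math model, then the same index from Instruct.
        slices.append((MATH, i, i + 1))
        slices.append((INSTRUCT, i, i + 1))
    # Final slice as listed in the diff: Instruct layers [77, 80).
    slices.append((INSTRUCT, tail[0], tail[1]))
    return slices


def render_yaml(slices):
    """Render the slices as mergekit-style YAML text (indentation assumed)."""
    lines = ["slices:"]
    for model, start, end in slices:
        lines.append("- sources:")
        lines.append(f"  - model: {model}")
        lines.append(f"    layer_range: [{start}, {end}]")
    lines.append("merge_method: passthrough")
    lines.append("dtype: float16")
    return "\n".join(lines) + "\n"


slices = build_slices()
config_text = render_yaml(slices)
# Number of layer entries the listing adds up to.
total_layers = sum(end - start for _, start, end in slices)
print(len(slices), total_layers)
```

Counting this way, the listing contains 157 slices that add up to 159 layer entries. The final range `[77, 80]` starts at index 77, which the preceding Instruct slice `[77, 78]` already covers, so that layer appears twice from the Instruct model as the configuration is written.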