---
license: mit
datasets:
  - mizinovmv/ru_example_DeepSeek-R1-Distill-Qwen-32B
  - lightblue/reasoning-multilingual-R1-Llama-70B-train
  - Pinkstack/thinking-multilingual-30-23-small-690
  - Vikhrmodels/reasoning-0.01-ru
  - Vikhrmodels/russian_math
  - kristaller486/Nebo-T1-Russian
language:
  - ru
  - en
base_model:
  - yandex/YandexGPT-5-Lite-8B-pretrain
pipeline_tag: text-generation
library_name: peft
tags:
  - r1
  - reasoning
  - think
  - thinking
  - reflection
  - russian
  - general
---

# Russian r1 / YandexGPT-5-Lite-8B-pretrain

A LoRA adapter for the [YandexGPT-5-Lite-8B-pretrain](https://huggingface.co/yandex/YandexGPT-5-Lite-8B-pretrain)
model, trained on a mix of datasets implementing the r1 (reasoning) approach.

The trained model can imitate logical reasoning in Russian, similar to what
`r1` from `DeepSeek` or `o1` from `OpenAI` does.
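
The adapter can be loaded on top of the base model with `transformers` and `peft`. The snippet below is a minimal sketch: the adapter identifier is a placeholder (replace it with this repository's id or a local path), and the plain-text prompt format is an assumption rather than the exact template used during training. Loading the base model in bf16 mirrors the `dtype: bf16` setting from the training config below.

```python
# Minimal inference sketch, assuming the adapter is a standard PEFT checkpoint.
# "ADAPTER_REPO_OR_PATH" is a placeholder, not the actual repository name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "yandex/YandexGPT-5-Lite-8B-pretrain"
ADAPTER = "ADAPTER_REPO_OR_PATH"  # replace with this adapter's repo id or a local path

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,  # the adapter was trained in bf16
    device_map="auto",
)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

# Russian prompt asking the model to show its reasoning.
prompt = "Сколько будет 17 * 23? Распиши ход рассуждений."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```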

W&B report: https://api.wandb.ai/links/evilfreelancer/zj6s02v4

Training was performed with the [impruver](https://github.com/EvilFreelancer/impruver) utility using the
[YandexGPT/8B_lora_r1](https://github.com/EvilFreelancer/impruver/blob/main/recipes/configs/YandexGPT/8B_lora_r1.yaml) configuration.

In total, training took about 18 hours on an RTX 4090 and required 23.5 GB of VRAM.

The effective context is 1400 tokens, since a longer context did not fit into 24 GB of VRAM.

```yaml
output_dir: ./models/YandexGPT-5-Lite_7B_lora_thinking
train_path: ./train.YandexGPT-5-Lite_7B_lora_thinking.jsonl
val_path: ./val.YandexGPT-5-Lite_7B_lora_thinking.jsonl

datasets:
  - name: mizinovmv/ru_example_DeepSeek-R1-Distill-Qwen-32B
    converter: impruver.instruction_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: ru_query
      output: response
  - name: lightblue/reasoning-multilingual-R1-Llama-70B-train
    converter: impruver.instruction_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: translated_prompt
      output: response
  - name: Pinkstack/thinking-multilingual-30-23-full-690
    converter: impruver.instruction_to_messages
    add_global_bos: false
    add_global_eos: false
  - name: Vikhrmodels/reasoning-0.01-ru
    converter: impruver.reasoning_to_messages
    add_global_bos: false
    add_global_eos: false
  - name: Vikhrmodels/russian_math
    converter: impruver.reasoning_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: task
      reasoning: solution
      output: short answer
  - name: kristaller486/Nebo-T1-Russian
    converter: impruver.reasoning_to_messages
    add_global_bos: false
    add_global_eos: false
    mapping:
      instruction: prompt
      reasoning: think
      output: answer

model:
  class: transformers.AutoModelForCausalLM
  name: yandex/YandexGPT-5-Lite-8B-pretrain
  load_in_4bit: true
  load_in_8bit: false
  dtype: bf16

lora:
  r: 8            # higher increases accuracy and memory
  lora_alpha: 16  # usually alpha=2*rank
  lora_dropout: 0
  bias: none
  target_modules: [ 'q_proj', 'v_proj', 'output_proj' ]
  task_type: CAUSAL_LM

tokenizer:
  class: transformers.AutoTokenizer
  name: yandex/YandexGPT-5-Lite-8B-pretrain
  max_tokens_count: 1400
  special_tokens:
    pad_token_id: 1
    pad_token: <s>

trainer:
  eval_strategy: steps
  save_strategy: steps
  eval_steps: 1000
  save_steps: 1000
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 1
  gradient_accumulation_steps: 8
  logging_steps: 10
  learning_rate: 0.000005
  num_train_epochs: 2
  lr_scheduler_type: cosine
  warmup_steps: 100
  optim: adamw_torch_4bit
  metric_for_best_model: eval_loss
  load_best_model_at_end: true
  save_total_limit: 2
  seed: 42
  remove_unused_columns: false
  max_grad_norm: 1.0
  weight_decay: 0.01
  torch_compile: false
```
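
For reference, the `lora` section of this config corresponds roughly to the following `peft` configuration. This is an illustrative sketch only; `impruver` builds the actual `LoraConfig` internally, so details may differ.

```python
# Rough peft equivalent of the `lora` section above (illustration only).
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    r=8,                       # LoRA rank; higher increases accuracy and memory use
    lora_alpha=16,             # usually alpha = 2 * rank
    lora_dropout=0.0,
    bias="none",
    target_modules=["q_proj", "v_proj", "output_proj"],
    task_type=TaskType.CAUSAL_LM,
)
```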