# 8b397e93122f02e251305ab6a7ea0137
This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [de-ru] dataset. It achieves the following results on the evaluation set:
- Loss: 1.7226
- Data Size: 1.0
- Epoch Runtime: 101.4265
- Bleu: 9.3907
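A minimal inference sketch for this checkpoint. The model ID is taken from this card; the generation settings (beam count, token budget) and the assumption that the fine-tuned model takes raw German input without a task prefix are illustrative, not documented training choices.

```python
# Hypothetical usage sketch for the de-ru checkpoint described in this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "contemmcm/8b397e93122f02e251305ab6a7ea0137"


def translate(texts, model, tokenizer, max_new_tokens=128):
    """Translate a batch of German sentences into Russian (settings are illustrative)."""
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)


if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    print(translate(["Es war einmal ein König."], model, tokenizer))
```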
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
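The aggregate batch sizes above follow from the per-device settings. A small sanity-check sketch; the 434 steps per full-data epoch is read off the results table below, and the derived corpus size is only an estimate:

```python
# Derive the aggregate batch sizes from the per-device hyperparameters.
train_batch_size = 8   # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices  # 8 * 4 = 32

# At full data size the results table shows 434 optimizer steps per epoch,
# so the training split holds roughly 434 * 32 = 13,888 examples.
steps_per_epoch = 434
approx_train_examples = steps_per_epoch * total_train_batch_size
print(total_train_batch_size, approx_train_examples)  # 32 13888
```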
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.3294 | 0 | 8.7264 | 0.3564 |
| No log | 1 | 434 | 11.2120 | 0.0078 | 9.6077 | 0.3557 |
| No log | 2 | 868 | 10.8344 | 0.0156 | 11.1388 | 0.4100 |
| No log | 3 | 1302 | 9.9545 | 0.0312 | 12.9233 | 0.3901 |
| No log | 4 | 1736 | 8.7990 | 0.0625 | 16.4405 | 0.5119 |
| 0.578 | 5 | 2170 | 9.4236 | 0.125 | 22.8667 | 0.2615 |
| 8.3328 | 6 | 2604 | 4.5183 | 0.25 | 33.2893 | 1.8934 |
| 4.0097 | 7 | 3038 | 2.5764 | 0.5 | 56.6631 | 4.8373 |
| 3.0065 | 8.0 | 3472 | 2.1856 | 1.0 | 101.8494 | 5.9452 |
| 2.676 | 9.0 | 3906 | 2.0421 | 1.0 | 101.0378 | 6.6706 |
| 2.5174 | 10.0 | 4340 | 1.9713 | 1.0 | 101.9740 | 7.0471 |
| 2.3989 | 11.0 | 4774 | 1.9171 | 1.0 | 102.9819 | 7.3988 |
| 2.2626 | 12.0 | 5208 | 1.8839 | 1.0 | 103.4988 | 7.6658 |
| 2.1977 | 13.0 | 5642 | 1.8490 | 1.0 | 103.9226 | 7.8602 |
| 2.1132 | 14.0 | 6076 | 1.8302 | 1.0 | 101.6970 | 8.0115 |
| 2.0603 | 15.0 | 6510 | 1.8026 | 1.0 | 101.7626 | 8.1996 |
| 2.006 | 16.0 | 6944 | 1.7834 | 1.0 | 102.9606 | 8.3058 |
| 1.9383 | 17.0 | 7378 | 1.7735 | 1.0 | 102.9428 | 8.4403 |
| 1.8985 | 18.0 | 7812 | 1.7613 | 1.0 | 103.7445 | 8.5071 |
| 1.8056 | 19.0 | 8246 | 1.7496 | 1.0 | 103.5564 | 8.6523 |
| 1.7996 | 20.0 | 8680 | 1.7423 | 1.0 | 102.8571 | 8.7177 |
| 1.7504 | 21.0 | 9114 | 1.7345 | 1.0 | 103.3688 | 8.8099 |
| 1.6857 | 22.0 | 9548 | 1.7290 | 1.0 | 102.4989 | 8.8871 |
| 1.6491 | 23.0 | 9982 | 1.7299 | 1.0 | 101.9331 | 8.9197 |
| 1.613 | 24.0 | 10416 | 1.7237 | 1.0 | 102.1754 | 8.9455 |
| 1.6032 | 25.0 | 10850 | 1.7213 | 1.0 | 103.9956 | 9.0970 |
| 1.5407 | 26.0 | 11284 | 1.7249 | 1.0 | 103.5786 | 9.1021 |
| 1.5496 | 27.0 | 11718 | 1.7145 | 1.0 | 102.7294 | 9.1801 |
| 1.4706 | 28.0 | 12152 | 1.7169 | 1.0 | 102.4893 | 9.2289 |
| 1.4444 | 29.0 | 12586 | 1.7156 | 1.0 | 102.2135 | 9.2741 |
| 1.4228 | 30.0 | 13020 | 1.7265 | 1.0 | 101.6165 | 9.3033 |
| 1.4021 | 31.0 | 13454 | 1.7226 | 1.0 | 101.4265 | 9.3907 |
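The "Data Size" column appears to follow a doubling curriculum: the training-data fraction starts near 1/128 and doubles each epoch until the full set is reached at epoch 8. This helper reproduces the schedule as an inference from the table, not a documented training option:

```python
# Reconstructed data-fraction schedule (inferred from the "Data Size" column).
def data_fraction(epoch: int) -> float:
    """Fraction of the training set used at a given epoch (doubling curriculum)."""
    if epoch < 1:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))

fractions = [round(data_fraction(e), 4) for e in range(1, 10)]
print(fractions)  # [0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0, 1.0]
```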
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
## Model tree for contemmcm/8b397e93122f02e251305ab6a7ea0137

Base model: google/umt5-base