64e2108fc55edcc92327e54cb544ba55
This model is a fine-tuned version of google-t5/t5-3b on the Helsinki-NLP/opus_books [en-nl] dataset. It achieves the following results on the evaluation set:
- Loss: 1.1715
- Data Size: 1.0
- Epoch Runtime: 455.8688
- Bleu: 8.7940
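The card does not document how to run the model, so the following is only a minimal inference sketch. It assumes the checkpoint loads through the standard `transformers` seq2seq classes, that the repository ID is the one shown on the model page, and that a T5-style task prefix was used during fine-tuning. The prefix string and the `build_prompt` and `translate` helpers are hypothetical and are not stated anywhere in this card.

```python
# Repository ID as shown on the model page.
MODEL_ID = "contemmcm/64e2108fc55edcc92327e54cb544ba55"

# Assumption: the fine-tune used a T5-style task prefix. The card does not
# state one, so adjust or drop this string if outputs look wrong.
ASSUMED_PREFIX = "translate English to Dutch: "


def build_prompt(text: str, prefix: str = ASSUMED_PREFIX) -> str:
    """Prepend the (assumed) task prefix to the stripped source sentence."""
    return prefix + text.strip()


def translate(text: str, max_new_tokens: int = 128) -> str:
    """Load the checkpoint and translate one English sentence."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("The weather is nice today.")` downloads the 3B-parameter checkpoint, so expect a large download and substantial memory use on first run.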
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
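The total batch sizes listed above follow from the per-device batch size and the device count. The sketch below reproduces that arithmetic. It assumes no gradient accumulation, since none is listed among the hyperparameters; the `effective_batch_size` helper is illustrative and not part of any library.

```python
# Values copied from the hyperparameter list in this card.
HPARAMS = {
    "train_batch_size": 8,   # per device
    "eval_batch_size": 8,    # per device
    "num_devices": 4,
}


def effective_batch_size(per_device: int, num_devices: int,
                         grad_accum_steps: int = 1) -> int:
    """Samples consumed per optimizer step across all devices.

    grad_accum_steps defaults to 1 because the card lists no
    gradient accumulation (an assumption).
    """
    return per_device * num_devices * grad_accum_steps


total_train = effective_batch_size(HPARAMS["train_batch_size"],
                                   HPARAMS["num_devices"])
total_eval = effective_batch_size(HPARAMS["eval_batch_size"],
                                  HPARAMS["num_devices"])
print(total_train, total_eval)
```

Both values come out to 32, matching `total_train_batch_size` and `total_eval_batch_size`.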
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.6616 | 0 | 28.2970 | 1.7281 |
| No log | 1 | 966 | 2.2039 | 0.0078 | 32.5855 | 2.7291 |
| No log | 2 | 1932 | 2.0539 | 0.0156 | 43.6727 | 3.8064 |
| 0.0505 | 3 | 2898 | 1.9161 | 0.0312 | 48.5563 | 4.2945 |
| 2.0261 | 4 | 3864 | 1.7818 | 0.0625 | 63.5191 | 4.3479 |
| 1.8616 | 5 | 4830 | 1.6349 | 0.125 | 87.2131 | 5.1001 |
| 1.6977 | 6 | 5796 | 1.4992 | 0.25 | 139.6249 | 5.8257 |
| 1.4842 | 7 | 6762 | 1.3557 | 0.5 | 240.0775 | 6.6527 |
| 1.2894 | 8.0 | 7728 | 1.2250 | 1.0 | 446.9474 | 7.6431 |
| 1.1553 | 9.0 | 8694 | 1.1645 | 1.0 | 447.2167 | 8.0025 |
| 1.0527 | 10.0 | 9660 | 1.1327 | 1.0 | 454.2894 | 8.3299 |
| 0.9251 | 11.0 | 10626 | 1.1213 | 1.0 | 455.9962 | 8.4725 |
| 0.8492 | 12.0 | 11592 | 1.1249 | 1.0 | 434.8186 | 8.5219 |
| 0.7611 | 13.0 | 12558 | 1.1337 | 1.0 | 458.0648 | 8.6364 |
| 0.7174 | 14.0 | 13524 | 1.1428 | 1.0 | 457.5247 | 8.7487 |
| 0.6356 | 15.0 | 14490 | 1.1715 | 1.0 | 455.8688 | 8.7940 |
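The table can be read two ways when choosing a checkpoint: validation loss and BLEU do not peak at the same epoch. The sketch below transcribes the relevant columns and reports the best epoch under each criterion. It also checks that the Data Size column doubles each epoch until it reaches 1.0 at epoch 8. The `Row` type and helper names are illustrative, and the card does not say which criterion, if any, was used for model selection.

```python
from typing import NamedTuple


class Row(NamedTuple):
    epoch: int
    val_loss: float
    data_size: float
    bleu: float


# Transcribed from the training results table.
ROWS = [
    Row(0, 2.6616, 0.0, 1.7281),
    Row(1, 2.2039, 0.0078, 2.7291),
    Row(2, 2.0539, 0.0156, 3.8064),
    Row(3, 1.9161, 0.0312, 4.2945),
    Row(4, 1.7818, 0.0625, 4.3479),
    Row(5, 1.6349, 0.125, 5.1001),
    Row(6, 1.4992, 0.25, 5.8257),
    Row(7, 1.3557, 0.5, 6.6527),
    Row(8, 1.2250, 1.0, 7.6431),
    Row(9, 1.1645, 1.0, 8.0025),
    Row(10, 1.1327, 1.0, 8.3299),
    Row(11, 1.1213, 1.0, 8.4725),
    Row(12, 1.1249, 1.0, 8.5219),
    Row(13, 1.1337, 1.0, 8.6364),
    Row(14, 1.1428, 1.0, 8.7487),
    Row(15, 1.1715, 1.0, 8.7940),
]


def follows_doubling_schedule(rows, full_epoch=8, tol=1e-4):
    """True if data_size equals 2**(epoch - full_epoch) for epochs 1..full_epoch."""
    return all(
        abs(r.data_size - 2.0 ** (r.epoch - full_epoch)) < tol
        for r in rows
        if 1 <= r.epoch <= full_epoch
    )


best_by_loss = min(ROWS, key=lambda r: r.val_loss)
best_by_bleu = max(ROWS, key=lambda r: r.bleu)

print("lowest validation loss:", best_by_loss.epoch, best_by_loss.val_loss)
print("highest BLEU:", best_by_bleu.epoch, best_by_bleu.bleu)
print("data size doubles until epoch 8:", follows_doubling_schedule(ROWS))
```

Validation loss is lowest at epoch 11 (1.1213) and rises afterwards, while BLEU keeps improving through epoch 15 (8.7940), which is the checkpoint whose metrics are reported at the top of this card.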
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1