be37dcb56e0edd7abfdd701c1e6cf0df

This model is a fine-tuned version of facebook/mbart-large-50 on the Helsinki-NLP/opus_books [de-en] dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 2.3852
  • Data Size: 1.0
  • Epoch Runtime: 322.3205
  • Bleu: 9.3174
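
A minimal inference sketch, assuming this checkpoint translates German to English (the card does not state the direction) and using the repo id contemmcm/be37dcb56e0edd7abfdd701c1e6cf0df; the example sentence and generation settings are illustrative only:

```python
# Hedged usage sketch: the translation direction (de -> en) and the generation
# settings are assumptions; the card does not document either.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "contemmcm/be37dcb56e0edd7abfdd701c1e6cf0df"
tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang="de_DE")
model = MBartForConditionalGeneration.from_pretrained(model_id)

inputs = tokenizer("Der Himmel ist blau.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],  # force English output
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```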

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
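
The card does not document the data preparation, but the dataset named above can be loaded as follows. This is a sketch of the raw load only; the train/eval split and any preprocessing used for this model are undocumented:

```python
# Sketch: load the opus_books de-en pairs; the split and preprocessing used
# for this model are not documented in the card.
from datasets import load_dataset

books = load_dataset("Helsinki-NLP/opus_books", "de-en")
print(books["train"][0]["translation"])  # {'de': '...', 'en': '...'}
```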

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
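
As a sketch, the settings above map onto transformers' Seq2SeqTrainingArguments roughly as follows. The card does not confirm that the standard Trainer API was used, and the output_dir name is hypothetical:

```python
# Hedged mapping of the listed hyperparameters onto Seq2SeqTrainingArguments;
# illustrative only, since the actual training script is not documented.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart50-opus-books-de-en",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed to compute BLEU at evaluation
)
```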

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0     | 6.2567          | 0         | 26.7048       | 1.6973  |
| No log        | 1     | 1286  | 4.4748          | 0.0078    | 29.5383       | 5.0592  |
| 0.0895        | 2     | 2572  | 3.3679          | 0.0156    | 33.2817       | 10.4733 |
| 0.0899        | 3     | 3858  | 2.2590          | 0.0312    | 37.5992       | 6.7743  |
| 0.0993        | 4     | 5144  | 2.1548          | 0.0625    | 47.5234       | 8.0927  |
| 2.0738        | 5     | 6430  | 2.0462          | 0.125     | 66.0811       | 8.5674  |
| 1.9511        | 6     | 7716  | 1.9638          | 0.25      | 101.6687      | 11.8428 |
| 1.8928        | 7     | 9002  | 1.9670          | 0.5       | 176.5597      | 12.5154 |
| 1.8482        | 8     | 10288 | 1.8666          | 1.0       | 321.5913      | 9.3952  |
| 1.4755        | 9     | 11574 | 1.8589          | 1.0       | 322.5186      | 9.5484  |
| 1.2218        | 10    | 12860 | 1.9179          | 1.0       | 322.5333      | 9.4680  |
| 0.9689        | 11    | 14146 | 2.0313          | 1.0       | 322.2198      | 9.4077  |
| 0.8023        | 12    | 15432 | 2.2226          | 1.0       | 324.6607      | 9.6218  |
| 0.6054        | 13    | 16718 | 2.3852          | 1.0       | 322.3205      | 9.3174  |
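
The card does not name the scorer behind the Bleu column; a minimal sketch of computing BLEU with the evaluate library's sacrebleu metric, under that assumption:

```python
# Hedged BLEU sketch using evaluate's sacrebleu wrapper; the card does not
# state which scorer produced the reported Bleu values.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["The sky is blue."]
references = [["The sky is blue."]]  # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```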

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1