2de48b0fc6d520a89419a800ad3ec48f

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [en-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0383
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 219.2923 s
  • BLEU: 8.4952
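
A minimal usage sketch, assuming the checkpoint is loaded by the repo id shown on this card; whether the source text needs a task prefix depends on how the training script preprocessed inputs, which is not reported here.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load by repo id; substitute a local path if you have the files on disk.
model_id = "contemmcm/2de48b0fc6d520a89419a800ad3ec48f"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# English -> Dutch, the en-nl direction this card reports.
inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```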

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
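
The header does name the corpus; a minimal sketch of loading it with the datasets library, assuming the en-nl configuration of Helsinki-NLP/opus_books:

```python
from datasets import load_dataset

# OPUS Books ships a single "train" split; any evaluation split would have
# been carved out by the training script (exact split sizes not reported).
books = load_dataset("Helsinki-NLP/opus_books", "en-nl")
print(books["train"][0]["translation"])  # {'en': '...', 'nl': '...'}
```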

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
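
A sketch of how the values above map onto Seq2SeqTrainingArguments, assuming the standard Hugging Face Trainer was used. The output_dir name is hypothetical, and the reported totals of 32 follow from 8 per device across 4 GPUs rather than from an explicit argument.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-en-nl",  # hypothetical name
    learning_rate=5e-05,
    per_device_train_batch_size=8,   # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # needed to compute BLEU at eval time
)
```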

Training results

Training was configured for 50 epochs; the log below ends at epoch 33. The Data Size column is the fraction of the training set used in that epoch, doubling each epoch until the full set is reached at epoch 8.

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime (s) | BLEU   |
|--------------:|------:|------:|----------------:|----------:|------------------:|-------:|
| No log        | 0     | 0     | 11.3346         | 0         | 18.1429           | 0.2190 |
| No log        | 1     | 966   | 11.4229         | 0.0078    | 20.9299           | 0.1907 |
| No log        | 2     | 1932  | 11.4843         | 0.0156    | 22.8287           | 0.1655 |
| 0.3839        | 3     | 2898  | 9.7131          | 0.0312    | 26.0710           | 0.2446 |
| 12.1579       | 4     | 3864  | 7.1554          | 0.0625    | 33.0564           | 0.4458 |
| 6.7454        | 5     | 4830  | 3.8891          | 0.125     | 44.2499           | 3.4346 |
| 4.1993        | 6     | 5796  | 3.0517          | 0.25      | 69.9529           | 3.4198 |
| 3.5204        | 7     | 6762  | 2.7321          | 0.5       | 118.9289          | 4.4885 |
| 3.1128        | 8     | 7728  | 2.5072          | 1.0       | 220.3696          | 5.3991 |
| 2.9354        | 9     | 8694  | 2.4080          | 1.0       | 218.3818          | 5.8508 |
| 2.8061        | 10    | 9660  | 2.3294          | 1.0       | 218.7114          | 6.2631 |
| 2.6442        | 11    | 10626 | 2.2795          | 1.0       | 220.9827          | 6.5370 |
| 2.579         | 12    | 11592 | 2.2406          | 1.0       | 219.3299          | 6.7654 |
| 2.4782        | 13    | 12558 | 2.2096          | 1.0       | 222.1809          | 6.9629 |
| 2.4441        | 14    | 13524 | 2.1860          | 1.0       | 219.8450          | 7.1492 |
| 2.3532        | 15    | 14490 | 2.1610          | 1.0       | 220.8062          | 7.2802 |
| 2.3206        | 16    | 15456 | 2.1317          | 1.0       | 219.1089          | 7.4271 |
| 2.259         | 17    | 16422 | 2.1235          | 1.0       | 221.5903          | 7.5148 |
| 2.1963        | 18    | 17388 | 2.1094          | 1.0       | 221.7594          | 7.6183 |
| 2.1915        | 19    | 18354 | 2.0932          | 1.0       | 220.4820          | 7.8209 |
| 2.1929        | 20    | 19320 | 2.0822          | 1.0       | 220.8837          | 7.8028 |
| 2.0831        | 21    | 20286 | 2.0794          | 1.0       | 220.7040          | 7.9256 |
| 2.0732        | 22    | 21252 | 2.0695          | 1.0       | 219.8961          | 7.9830 |
| 2.0193        | 23    | 22218 | 2.0570          | 1.0       | 223.1092          | 8.0332 |
| 1.9825        | 24    | 23184 | 2.0538          | 1.0       | 221.8799          | 8.1092 |
| 1.939         | 25    | 24150 | 2.0504          | 1.0       | 219.1382          | 8.1945 |
| 1.9229        | 26    | 25116 | 2.0437          | 1.0       | 220.3549          | 8.2220 |
| 1.8926        | 27    | 26082 | 2.0425          | 1.0       | 220.4797          | 8.2861 |
| 1.8835        | 28    | 27048 | 2.0338          | 1.0       | 219.4541          | 8.3513 |
| 1.8284        | 29    | 28014 | 2.0330          | 1.0       | 219.6740          | 8.3704 |
| 1.7928        | 30    | 28980 | 2.0334          | 1.0       | 218.0309          | 8.4036 |
| 1.7554        | 31    | 29946 | 2.0360          | 1.0       | 219.8970          | 8.3956 |
| 1.7503        | 32    | 30912 | 2.0342          | 1.0       | 219.4680          | 8.4556 |
| 1.7572        | 33    | 31878 | 2.0383          | 1.0       | 219.2923          | 8.4952 |
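
The BLEU column is most plausibly sacrebleu computed over generated translations, as in the standard translation fine-tuning recipes; a minimal sketch with the evaluate library (the exact metric implementation used here is an assumption):

```python
import evaluate

# sacrebleu expects a list of hypotheses and a list of reference lists.
bleu = evaluate.load("sacrebleu")
score = bleu.compute(
    predictions=["De kat zat op de mat."],
    references=[["De kat zat op de mat."]],
)
print(score["score"])  # corpus-level BLEU, comparable to the column above
```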

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size

  • Parameters: 1.0B
  • Tensor type: F32 (safetensors)
