vi_mbart_mt

This model is a fine-tuned version of vinai/vinai-translate-vi2en-v2. The training dataset is not specified. Results on the evaluation set are reported per epoch in the training results table.

Model description

More information needed


The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 25
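
The hyperparameters above imply an effective batch size and a learning-rate curve that can be reproduced by hand. The following is a minimal sketch in plain Python, not the training code itself. It assumes that the cosine schedule is the usual half-cosine decay to zero after a linear warmup, that the total step count is the final step shown in the results table (40350), and that the warmup length is the warmup ratio times that total, rounded to the nearest step. The function and constant names are illustrative.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1

# Assumption: total optimizer steps equals the last step in the results table.
TOTAL_STEPS = 40350


def effective_batch_size(per_step: int, accum_steps: int) -> int:
    """Examples contributing to one optimizer update."""
    return per_step * accum_steps


def lr_at_step(
    step: int,
    total_steps: int = TOTAL_STEPS,
    peak_lr: float = LEARNING_RATE,
    warmup_ratio: float = WARMUP_RATIO,
) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero (assumed shape)."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 1614, 4035, 20000, 40350):
        print(f"step {s:>6}: lr = {lr_at_step(s):.3e}")
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 32, and the learning rate peaks at step 4035 before decaying toward zero at the final step.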

Training Loss	Epoch	Step	Validation Loss	Smatch Precision	Smatch Recall	Smatch Fscore	Smatch Unparsable	Percent Not Recoverable
0.4926	0.9994	1614	4.7140	2.68	42.02	5.03	4	0.0
0.318	1.9994	3228	4.2151	2.57	41.17	4.83	0	0.0
0.2145	2.9994	4842	3.1800	3.15	44.21	5.87	1	0.0
0.1897	3.9994	6456	3.1017	3.2	47.25	5.99	0	0.0
0.1458	4.9994	8070	2.2906	5.5	57.42	10.04	0	0.0
0.0841	5.9994	9684	1.7533	9.01	62.61	15.76	1	0.0627
0.0894	6.9994	11298	1.4089	11.62	68.96	19.88	0	0.0
0.0678	7.9994	12912	1.2575	14.13	68.91	23.45	0	0.0
0.0469	8.9994	14526	1.0867	18.65	71.86	29.61	0	0.0
0.0327	9.9994	16140	0.9371	27.86	74.71	40.58	0	0.0
0.0291	10.9994	17754	0.8860	29.26	73.53	41.86	0	0.1255
0.0265	11.9994	19368	0.8219	28.9	74.7	41.68	0	0.0
0.0246	12.9994	20982	0.7131	41.17	76.02	53.42	0	0.0
0.0259	13.9994	22596	0.6919	42.93	75.52	54.74	0	0.1255
0.0165	14.9994	24210	0.7349	47.35	76.19	58.4	0	0.0627
0.0056	15.9994	25824	0.7711	53.92	76.24	63.17	0	0.0
0.0171	16.9994	27438	0.6843	60.17	76.83	67.49	0	0.0627
0.0086	17.9994	29052	0.7322	64.33	76.73	69.99	0	0.0
0.0022	18.9994	30666	0.6778	66.15	76.76	71.06	0	0.1255
0.0026	19.9994	32280	0.6665	68.98	77.24	72.87	0	0.0627
0.003	20.9994	33894	0.6389	70.47	77.18	73.67	0	0.0627
0.0015	21.9994	35508	0.6256	71.7	76.98	74.25	0	0.0627
0.0009	22.9994	37122	0.6181	72.69	76.57	74.58	0	0.0627
0.0024	23.9994	38736	0.6084	72.88	76.21	74.51	0	0.1255
0.0017	24.9994	40350	0.5949	73.58	75.79	74.67	0	0.1882
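
The Smatch Fscore column can be cross-checked against the precision and recall columns, since the F-score is the harmonic mean of the two. The sketch below is a small illustration in plain Python using three rows copied from the table. The helper name `f_score` is illustrative, and because the table values are rounded to two decimals, the recomputed score can differ from the reported one by about 0.01.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same 0-100 scale)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (step, precision, recall, reported F-score), copied from the results table.
rows = [
    (1614, 2.68, 42.02, 5.03),
    (16140, 27.86, 74.71, 40.58),
    (40350, 73.58, 75.79, 74.67),
]

if __name__ == "__main__":
    for step, p, r, reported in rows:
        recomputed = f_score(p, r)
        print(f"step {step:>6}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

The early rows show why the F-score stays low despite recall above 40: the harmonic mean is dominated by the smaller of the two values, here precision.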

Model size: 0.4B params
Tensor type: F32
Weights format: Safetensors
