de_wiki_mlm_13

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0579

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 13
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
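The scheduler settings above imply a linear warmup to the peak learning rate over the first 40,000 steps, then a linear decay to zero over the remaining 60,000. The sketch below illustrates that schedule and the effective batch size arithmetic; it assumes the standard Transformers linear schedule (as in `get_linear_schedule_with_warmup`) and is not taken from the training code itself.

```python
# Sketch of the linear warmup + linear decay schedule implied by the
# hyperparameters above (learning_rate=1e-4, warmup_steps=40000,
# training_steps=100000). Assumes Transformers' default linear schedule.
BASE_LR = 1e-4
WARMUP_STEPS = 40_000
TRAINING_STEPS = 100_000

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR over the warmup phase.
        return BASE_LR * step / WARMUP_STEPS
    # Linear decay from BASE_LR back down to 0 over the remaining steps.
    return BASE_LR * (TRAINING_STEPS - step) / (TRAINING_STEPS - WARMUP_STEPS)

# Effective batch size: per-device train batch * gradient accumulation steps.
effective_batch = 16 * 2  # = 32, matching total_train_batch_size above
```

With these settings the optimizer sees 32 examples per update even though only 16 fit in a device batch, which is why `total_train_batch_size` is listed separately from `train_batch_size`.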

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.0796  | 2000   | 8.1153          |
| 8.1594        | 2.1592  | 4000   | 7.4675          |
| 8.1594        | 3.2389  | 6000   | 7.3558          |
| 7.3679        | 4.3185  | 8000   | 7.2731          |
| 7.3679        | 5.3981  | 10000  | 7.1905          |
| 7.2108        | 6.4777  | 12000  | 7.1281          |
| 7.2108        | 7.5574  | 14000  | 7.0444          |
| 7.0667        | 8.6370  | 16000  | 6.9835          |
| 7.0667        | 9.7166  | 18000  | 6.9460          |
| 6.9599        | 10.7962 | 20000  | 6.8962          |
| 6.9599        | 11.8758 | 22000  | 6.8452          |
| 6.8651        | 12.9555 | 24000  | 6.7725          |
| 6.8651        | 14.0351 | 26000  | 6.6713          |
| 6.7083        | 15.1147 | 28000  | 6.5472          |
| 6.7083        | 16.1943 | 30000  | 6.3977          |
| 6.4688        | 17.2740 | 32000  | 6.2481          |
| 6.4688        | 18.3536 | 34000  | 5.9439          |
| 6.0356        | 19.4332 | 36000  | 5.3813          |
| 6.0356        | 20.5128 | 38000  | 5.0142          |
| 5.1534        | 21.5924 | 40000  | 4.7447          |
| 5.1534        | 22.6721 | 42000  | 4.5206          |
| 4.6619        | 23.7517 | 44000  | 4.3437          |
| 4.6619        | 24.8313 | 46000  | 4.1933          |
| 4.3114        | 25.9109 | 48000  | 4.0463          |
| 4.3114        | 26.9906 | 50000  | 3.9254          |
| 4.0627        | 28.0702 | 52000  | 3.8380          |
| 4.0627        | 29.1498 | 54000  | 3.7413          |
| 3.869         | 30.2294 | 56000  | 3.6810          |
| 3.869         | 31.3090 | 58000  | 3.6163          |
| 3.7225        | 32.3887 | 60000  | 3.5482          |
| 3.7225        | 33.4683 | 62000  | 3.4884          |
| 3.5982        | 34.5479 | 64000  | 3.4383          |
| 3.5982        | 35.6275 | 66000  | 3.3907          |
| 3.5029        | 36.7072 | 68000  | 3.3516          |
| 3.5029        | 37.7868 | 70000  | 3.3336          |
| 3.4207        | 38.8664 | 72000  | 3.2854          |
| 3.4207        | 39.9460 | 74000  | 3.2706          |
| 3.3526        | 41.0256 | 76000  | 3.2301          |
| 3.3526        | 42.1053 | 78000  | 3.1975          |
| 3.2969        | 43.1849 | 80000  | 3.1776          |
| 3.2969        | 44.2645 | 82000  | 3.1729          |
| 3.2474        | 45.3441 | 84000  | 3.1339          |
| 3.2474        | 46.4238 | 86000  | 3.1216          |
| 3.2034        | 47.5034 | 88000  | 3.0938          |
| 3.2034        | 48.5830 | 90000  | 3.1126          |
| 3.1741        | 49.6626 | 92000  | 3.0775          |
| 3.1741        | 50.7422 | 94000  | 3.0699          |
| 3.1477        | 51.8219 | 96000  | 3.0651          |
| 3.1477        | 52.9015 | 98000  | 3.0555          |
| 3.1277        | 53.9811 | 100000 | 3.0579          |
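For a masked language model, the cross-entropy loss above converts to perplexity via exp(loss). As a quick sanity check on the final numbers (a sketch; the card itself does not report perplexity):

```python
import math

# Final evaluation loss from the last row of the table above.
final_eval_loss = 3.0579

# Masked-LM perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(final_eval_loss)  # roughly 21.3
```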

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1