en_wiki_mlm_30

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.2011

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 30
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
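The hyperparameters above can be sketched in plain Python: a minimal, dependency-free reimplementation of the linear warmup-then-decay schedule (`lr_scheduler_type: linear` with 40,000 warmup steps over 100,000 total steps), plus the arithmetic behind `total_train_batch_size`. The function name `linear_lr` is illustrative, not part of the original training code.

```python
def linear_lr(step, base_lr=1e-4, warmup_steps=40_000, total_steps=100_000):
    """Linear warmup from 0 to base_lr, then linear decay back to 0.

    Mirrors the behavior of a 'linear' scheduler with warmup, as listed
    in the hyperparameters above.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Effective batch size: per-device batch * gradient accumulation steps.
effective_batch = 16 * 2  # matches total_train_batch_size: 32
```

At step 70,000 (halfway through the decay phase) the schedule yields half the peak rate, 5e-5.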

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| No log        | 1.1319  | 2000   | 7.8720          |
| 7.9236        | 2.2637  | 4000   | 7.1074          |
| 7.9236        | 3.3956  | 6000   | 7.0304          |
| 7.0257        | 4.5274  | 8000   | 6.9532          |
| 7.0257        | 5.6593  | 10000  | 6.8760          |
| 6.8811        | 6.7912  | 12000  | 6.8110          |
| 6.8811        | 7.9230  | 14000  | 6.7316          |
| 6.7587        | 9.0549  | 16000  | 6.6892          |
| 6.7587        | 10.1868 | 18000  | 6.6501          |
| 6.6566        | 11.3186 | 20000  | 6.5951          |
| 6.6566        | 12.4505 | 22000  | 6.5255          |
| 6.546         | 13.5823 | 24000  | 6.4406          |
| 6.546         | 14.7142 | 26000  | 6.3165          |
| 6.3494        | 15.8461 | 28000  | 6.1499          |
| 6.3494        | 16.9779 | 30000  | 5.9410          |
| 6.0156        | 18.1098 | 32000  | 5.6377          |
| 6.0156        | 19.2417 | 34000  | 5.1174          |
| 5.2999        | 20.3735 | 36000  | 4.8551          |
| 5.2999        | 21.5054 | 38000  | 4.6650          |
| 4.7633        | 22.6372 | 40000  | 4.4964          |
| 4.7633        | 23.7691 | 42000  | 4.3249          |
| 4.4471        | 24.9010 | 44000  | 4.2117          |
| 4.4471        | 26.0328 | 46000  | 4.0767          |
| 4.1884        | 27.1647 | 48000  | 3.9930          |
| 4.1884        | 28.2965 | 50000  | 3.9030          |
| 3.9939        | 29.4284 | 52000  | 3.8126          |
| 3.9939        | 30.5603 | 54000  | 3.7701          |
| 3.8479        | 31.6921 | 56000  | 3.6775          |
| 3.8479        | 32.8240 | 58000  | 3.6432          |
| 3.7265        | 33.9559 | 60000  | 3.5951          |
| 3.7265        | 35.0877 | 62000  | 3.5470          |
| 3.6305        | 36.2196 | 64000  | 3.5206          |
| 3.6305        | 37.3514 | 66000  | 3.4949          |
| 3.5483        | 38.4833 | 68000  | 3.4768          |
| 3.5483        | 39.6152 | 70000  | 3.4227          |
| 3.4798        | 40.7470 | 72000  | 3.3735          |
| 3.4798        | 41.8789 | 74000  | 3.3894          |
| 3.4256        | 43.0108 | 76000  | 3.3543          |
| 3.4256        | 44.1426 | 78000  | 3.3211          |
| 3.3707        | 45.2745 | 80000  | 3.3156          |
| 3.3707        | 46.4063 | 82000  | 3.2899          |
| 3.3325        | 47.5382 | 84000  | 3.2545          |
| 3.3325        | 48.6701 | 86000  | 3.2459          |
| 3.2983        | 49.8019 | 88000  | 3.2607          |
| 3.2983        | 50.9338 | 90000  | 3.2458          |
| 3.2655        | 52.0656 | 92000  | 3.2131          |
| 3.2655        | 53.1975 | 94000  | 3.2023          |
| 3.2442        | 54.3294 | 96000  | 3.1815          |
| 3.2442        | 55.4612 | 98000  | 3.1892          |
| 3.2225        | 56.5931 | 100000 | 3.2011          |
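A validation cross-entropy loss like the final 3.2011 above is often easier to interpret as perplexity, exp(loss). A minimal sketch (the variable names are illustrative):

```python
import math

# Convert the final validation cross-entropy loss (natural log base)
# from the table above into perplexity.
eval_loss = 3.2011
perplexity = math.exp(eval_loss)  # roughly 24.6
```

Lower perplexity means the masked-token distribution assigns more probability to the true tokens.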

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
Model size: 14.9M parameters (F32, safetensors)