gpt2_m060_tiny-stories_1024_dpos

This model is a fine-tuned version of an unnamed base model on the roneneldan/TinyStories dataset. At the last logged evaluation (step 19000), it achieves a validation loss of 1.2104 and an accuracy of 0.6780 on the evaluation set.

Model description

More information needed

Training results

Metrics were logged every 1000 steps over just under one epoch:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.91	0.0524	1000	2.4654	0.4427
1.978	0.1048	2000	1.8015	0.5671
1.7257	0.1572	3000	1.6135	0.6007
1.6065	0.2097	4000	1.5119	0.6196
1.535	0.2621	5000	1.4495	0.6311
1.4837	0.3145	6000	1.4027	0.6398
1.441	0.3669	7000	1.3684	0.6463
1.4106	0.4193	8000	1.3395	0.6522
1.3837	0.4717	9000	1.3167	0.6567
1.3638	0.5241	10000	1.2977	0.6602
1.3448	0.5766	11000	1.2827	0.6631
1.3316	0.6290	12000	1.2662	0.6664
1.3128	0.6814	13000	1.2530	0.6692
1.3066	0.7338	14000	1.2431	0.6711
1.2959	0.7862	15000	1.2347	0.6729
1.2861	0.8386	16000	1.2259	0.6747
1.2782	0.8910	17000	1.2195	0.6760
1.2709	0.9434	18000	1.2137	0.6773
1.2679	0.9959	19000	1.2104	0.6780
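
Validation loss for a language model is often easier to read as perplexity. The sketch below converts a few rows of the table, assuming the reported loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card). The `eval_log` list and the `perplexity` helper are illustrative names, not part of the model's code.

```python
import math

# (step, validation loss) pairs copied from the table above.
eval_log = [
    (1000, 2.4654),
    (10000, 1.2977),
    (19000, 1.2104),
]


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


for step, loss in eval_log:
    print(f"step {step:>5}: val loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, the final validation loss of 1.2104 corresponds to a perplexity of about 3.35.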

Model size: 0.1B params
Tensor type: F32
Format: Safetensors
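
As a rough guide to what these figures mean in practice, the weight memory can be estimated from the parameter count and tensor type. This is a back-of-the-envelope sketch: "0.1B" is a rounded figure, so the result is approximate, and it covers the weights only (no activations or optimizer state).

```python
# Rough weight-memory estimate from the figures listed above.
# Assumption: "0.1B" is a rounded parameter count, so the result is approximate.
params = 0.1e9        # reported model size (rounded)
bytes_per_param = 4   # F32 is a 32-bit float, i.e. 4 bytes per parameter

weight_bytes = params * bytes_per_param
print(f"~{weight_bytes / 1e9:.1f} GB ({weight_bytes / 2**30:.2f} GiB) for the weights alone")
```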
