gpt2_m010_tiny-stories_1024_dpos

This model is a fine-tuned version of on the roneneldan/TinyStories dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.9272	0.0524	1000	2.4832	0.4383
1.9866	0.1048	2000	1.8065	0.5658
1.7332	0.1572	3000	1.6123	0.6010
1.6149	0.2096	4000	1.5195	0.6176
1.5385	0.2619	5000	1.4548	0.6297
1.4822	0.3143	6000	1.4064	0.6394
1.4466	0.3667	7000	1.3731	0.6457
1.4152	0.4191	8000	1.3438	0.6512
1.3875	0.4715	9000	1.3234	0.6552
1.3701	0.5239	10000	1.3016	0.6597
1.3492	0.5763	11000	1.2850	0.6630
1.333	0.6287	12000	1.2705	0.6656
1.3174	0.6811	13000	1.2582	0.6684
1.3126	0.7334	14000	1.2467	0.6707
1.2962	0.7858	15000	1.2373	0.6725
1.2917	0.8382	16000	1.2294	0.6743
1.2808	0.8906	17000	1.2232	0.6757
1.2737	0.9430	18000	1.2178	0.6768
1.2704	0.9954	19000	1.2140	0.6776

Safetensors

Model size

0.1B params

Tensor type

F32

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support