SST-2-HEURISTIC-LoRA-All-Attention-Q_K_V_O-seed20

This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed
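The card gives no description, but the model name suggests a LoRA adapter applied to the attention query, key, value, and output projections of roberta-base. As a rough sketch of what that implies for adapter size, the snippet below counts the LoRA weights added under that reading. The rank is not recorded anywhere in this card, so `HYPOTHETICAL_RANK` is a placeholder chosen for illustration only, and the meaning of "HEURISTIC" in the name is not modeled.

```python
# Rough LoRA parameter count for roberta-base, assuming adapters on the
# attention query, key, value, and output projections (inferred from the
# model name, not confirmed by the card).

HIDDEN_SIZE = 768   # roberta-base hidden size
NUM_LAYERS = 12     # roberta-base encoder layers
TARGETS = ("query", "key", "value", "output")  # one 768x768 projection each


def lora_param_count(rank: int,
                     hidden: int = HIDDEN_SIZE,
                     layers: int = NUM_LAYERS,
                     num_targets: int = len(TARGETS)) -> int:
    """Count LoRA weights: each adapted matrix gets A (rank x in) and B (out x rank)."""
    per_matrix = rank * (hidden + hidden)
    return layers * num_targets * per_matrix


# Hypothetical rank; the card does not state the value actually used.
HYPOTHETICAL_RANK = 8

print(lora_param_count(HYPOTHETICAL_RANK))  # 589824
```

The count excludes the classification head and any other parameters that may have been trained alongside the adapter.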


The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 5
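
To make the schedule concrete, here is a minimal sketch of how a linear learning-rate decay would play out with the values listed above. It rests on several assumptions that the card does not state: no warmup steps, a single device with no gradient accumulation, and a training set of 67,349 examples (the size of the GLUE SST-2 train split, which the model name hints at). Under those assumptions an epoch is 2105 optimizer steps, which matches the epoch fractions logged in the training results (for example, step 200 at epoch 0.0950).

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


def linear_lr(step: int, total_steps: int,
              base_lr: float = 3e-4, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 67,349 training examples; the card itself lists the dataset as unknown.
ASSUMED_TRAIN_EXAMPLES = 67_349

spe = steps_per_epoch(ASSUMED_TRAIN_EXAMPLES, batch_size=32)
total = spe * 5  # num_epochs: 5

for step in (0, 2000, 6000, 10400):
    print(f"step {step:>5}  epoch {step / spe:.4f}  lr {linear_lr(step, total):.6f}")
```

If the run used warmup or gradient accumulation, the step counts and learning rates would differ from this sketch.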

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.3991	0.0950	200	0.2096	0.9163
0.2917	0.1900	400	0.1974	0.9174
0.2704	0.2850	600	0.2146	0.9197
0.2428	0.3800	800	0.1842	0.9346
0.2313	0.4751	1000	0.2589	0.9220
0.2147	0.5701	1200	0.2200	0.9278
0.2169	0.6651	1400	0.2166	0.9323
0.2097	0.7601	1600	0.2307	0.9255
0.216	0.8551	1800	0.2100	0.9312
0.2048	0.9501	2000	0.2078	0.9392
0.2004	1.0451	2200	0.2162	0.9335
0.1819	1.1401	2400	0.1884	0.9358
0.1837	1.2352	2600	0.2073	0.9323
0.1793	1.3302	2800	0.2156	0.9278
0.1792	1.4252	3000	0.1997	0.9323
0.1794	1.5202	3200	0.2129	0.9335
0.1788	1.6152	3400	0.1908	0.9346
0.1663	1.7102	3600	0.2561	0.9278
0.1705	1.8052	3800	0.2167	0.9346
0.1837	1.9002	4000	0.1958	0.9392
0.174	1.9952	4200	0.2181	0.9358
0.1602	2.0903	4400	0.2107	0.9335
0.1529	2.1853	4600	0.2229	0.9369
0.1568	2.2803	4800	0.2372	0.9346
0.1466	2.3753	5000	0.2117	0.9335
0.156	2.4703	5200	0.2452	0.9323
0.1544	2.5653	5400	0.2411	0.9312
0.163	2.6603	5600	0.2019	0.9323
0.1431	2.7553	5800	0.2393	0.9289
0.1466	2.8504	6000	0.2157	0.9312
0.1446	2.9454	6200	0.2291	0.9335
0.1395	3.0404	6400	0.2593	0.9278
0.1203	3.1354	6600	0.2339	0.9323
0.1272	3.2304	6800	0.2262	0.9404
0.1484	3.3254	7000	0.2128	0.9381
0.1269	3.4204	7200	0.2254	0.9404
0.1269	3.5154	7400	0.2387	0.9335
0.1321	3.6105	7600	0.2512	0.9358
0.1351	3.7055	7800	0.2333	0.9381
0.1331	3.8005	8000	0.2312	0.9427
0.1396	3.8955	8200	0.2190	0.9427
0.1342	3.9905	8400	0.2214	0.9381
0.1231	4.0855	8600	0.2422	0.9323
0.1159	4.1805	8800	0.2500	0.9323
0.1219	4.2755	9000	0.2348	0.9335
0.1225	4.3705	9200	0.2405	0.9312
0.1205	4.4656	9400	0.2407	0.9312
0.1148	4.5606	9600	0.2384	0.9369
0.12	4.6556	9800	0.2342	0.9381
0.1123	4.7506	10000	0.2384	0.9381
0.1182	4.8456	10200	0.2377	0.9381
0.1298	4.9406	10400	0.2349	0.9369
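
The table can be scanned for a preferred checkpoint. The sketch below, using a handful of rows copied from the table rather than the full log, picks the evaluation step with the highest accuracy (ties broken by lower validation loss) and the step with the lowest validation loss. The `EvalRow` type and the selection rule are illustrative choices, not something the card prescribes.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    accuracy: float


# Selected rows copied from the training results table (not the full log).
rows = [
    EvalRow(800, 0.1842, 0.9346),
    EvalRow(2000, 0.2078, 0.9392),
    EvalRow(6800, 0.2262, 0.9404),
    EvalRow(8000, 0.2312, 0.9427),
    EvalRow(8200, 0.2190, 0.9427),
    EvalRow(10400, 0.2349, 0.9369),
]

# Highest accuracy first; among ties, prefer the lower validation loss.
best_acc = max(rows, key=lambda r: (r.accuracy, -r.val_loss))
lowest_loss = min(rows, key=lambda r: r.val_loss)

print(f"best accuracy: {best_acc.accuracy} at step {best_acc.step}")
print(f"lowest validation loss: {lowest_loss.val_loss} at step {lowest_loss.step}")
```

On these rows the two criteria disagree: accuracy peaks at 0.9427 late in training (steps 8000 and 8200), while validation loss is lowest at step 800. Which checkpoint to keep therefore depends on which metric matters for the intended use.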
