CoLA-Fisher-All-Attention-Q_K_V_O-seed10

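This model is a fine-tuned version of roberta-base. The training dataset is not recorded in this card, though the model name and the Matthews correlation metric suggest CoLA. Validation loss and Matthews correlation at each evaluation step are listed in the table below.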

Model description

More information needed


The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 5
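To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay under these settings. It assumes zero warmup steps (no warmup value is listed above) and infers the number of steps per epoch from the first logged row of the training table (step 50 at epoch 0.1866). The function name `linear_lr` and the constants are illustrative and are not taken from the training script.

```python
# Hyperparameters listed in this card.
LEARNING_RATE = 3e-4
NUM_EPOCHS = 5

# Inferred, not stated: step 50 was logged at epoch 0.1866,
# so one epoch is about 50 / 0.1866 optimizer steps.
STEPS_PER_EPOCH = round(50 / 0.1866)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("steps per epoch:", STEPS_PER_EPOCH)
    print("total steps:", TOTAL_STEPS)
    for step in (0, 335, 670, 1005, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions an epoch is 268 steps and the full run is 1340 steps, so the learning rate reaches half its initial value at step 670 and zero at the end of training.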

Training Loss	Epoch	Step	Validation Loss	Matthews Correlation
0.6325	0.1866	50	0.5840	0.0
0.5409	0.3731	100	0.4712	0.4122
0.4783	0.5597	150	0.4855	0.4526
0.4644	0.7463	200	0.4302	0.5166
0.4258	0.9328	250	0.5848	0.4331
0.4334	1.1194	300	0.4170	0.5443
0.4028	1.3060	350	0.4867	0.5029
0.4013	1.4925	400	0.4688	0.5101
0.4345	1.6791	450	0.4121	0.5323
0.386	1.8657	500	0.5139	0.5074
0.367	2.0522	550	0.4826	0.5291
0.3586	2.2388	600	0.4704	0.5537
0.3742	2.4254	650	0.4335	0.5565
0.3773	2.6119	700	0.4286	0.5651
0.329	2.7985	750	0.4272	0.5879
0.3456	2.9851	800	0.4505	0.5521
0.3444	3.1716	850	0.4049	0.5905
0.3277	3.3582	900	0.4237	0.5838
0.3272	3.5448	950	0.5252	0.5474
0.3343	3.7313	1000	0.4318	0.5754
0.3246	3.9179	1050	0.4299	0.5761
0.317	4.1045	1100	0.4273	0.5832
0.2859	4.2910	1150	0.4620	0.5755
0.2984	4.4776	1200	0.4614	0.5780
0.3073	4.6642	1250	0.4426	0.5810
0.3026	4.8507	1300	0.4605	0.5779
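Validation metrics fluctuate from one evaluation to the next, so the last row is not necessarily the best checkpoint. The sketch below copies the rows above into a whitespace-separated string and looks up the evaluation steps with the highest Matthews correlation and the lowest validation loss. The names `LOG` and `parse_log` are illustrative; whether a checkpoint was actually saved at each of these steps is not stated in this card.

```python
# Rows copied from the table above:
# training loss, epoch, step, validation loss, Matthews correlation.
LOG = """
0.6325 0.1866 50 0.5840 0.0
0.5409 0.3731 100 0.4712 0.4122
0.4783 0.5597 150 0.4855 0.4526
0.4644 0.7463 200 0.4302 0.5166
0.4258 0.9328 250 0.5848 0.4331
0.4334 1.1194 300 0.4170 0.5443
0.4028 1.3060 350 0.4867 0.5029
0.4013 1.4925 400 0.4688 0.5101
0.4345 1.6791 450 0.4121 0.5323
0.386 1.8657 500 0.5139 0.5074
0.367 2.0522 550 0.4826 0.5291
0.3586 2.2388 600 0.4704 0.5537
0.3742 2.4254 650 0.4335 0.5565
0.3773 2.6119 700 0.4286 0.5651
0.329 2.7985 750 0.4272 0.5879
0.3456 2.9851 800 0.4505 0.5521
0.3444 3.1716 850 0.4049 0.5905
0.3277 3.3582 900 0.4237 0.5838
0.3272 3.5448 950 0.5252 0.5474
0.3343 3.7313 1000 0.4318 0.5754
0.3246 3.9179 1050 0.4299 0.5761
0.317 4.1045 1100 0.4273 0.5832
0.2859 4.2910 1150 0.4620 0.5755
0.2984 4.4776 1200 0.4614 0.5780
0.3073 4.6642 1250 0.4426 0.5810
0.3026 4.8507 1300 0.4605 0.5779
"""


def parse_log(text):
    """Turn the whitespace-separated log into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, mcc = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "mcc": float(mcc),
        })
    return rows


rows = parse_log(LOG)
best_mcc = max(rows, key=lambda r: r["mcc"])
best_loss = min(rows, key=lambda r: r["val_loss"])

if __name__ == "__main__":
    print("highest Matthews correlation:", best_mcc)
    print("lowest validation loss:", best_loss)
```

Both criteria select step 850 (epoch 3.1716), with a Matthews correlation of 0.5905 and a validation loss of 0.4049. The final logged row, at step 1300, scores 0.5779 and 0.4605.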

