# llama-32-11b-vision_model_output22
This model is a fine-tuned version of [meta-llama/Llama-3.2-11B-Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.5545
## Model description
More information needed
## Intended uses & limitations
More information needed
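Because the base model is Llama-3.2-11B-Vision, the checkpoint should load through the standard Mllama classes in Transformers. Below is a minimal inference sketch, assuming the fine-tune keeps the base model's chat template; the image path and prompt are placeholders:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "aruboi/llama-32-11b-vision_model_output22"

# Load the fine-tuned checkpoint; bf16 + device_map="auto" keeps the 11B model on GPU.
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder image and prompt -- replace with your own inputs.
image = Image.open("example.jpg")
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```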
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (mirrored in the `TrainingArguments` sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (`adamw_hf`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 3
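As a reproducibility aid, the values above map onto `transformers.TrainingArguments` roughly as sketched below. The `output_dir` is a hypothetical placeholder, and the surrounding `Trainer` setup is not documented in this card:

```python
from transformers import TrainingArguments

# All values below come from the hyperparameter list above;
# output_dir is a hypothetical placeholder.
training_args = TrainingArguments(
    output_dir="llama-32-11b-vision_model_output22",
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_hf",            # AdamW, HF implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=3,
)
```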
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.3164 | 0.0694 | 5 | 1.3101 |
| 0.949 | 0.1389 | 10 | 1.3269 |
| 1.2146 | 0.2083 | 15 | 1.2872 |
| 1.1873 | 0.2778 | 20 | 1.2523 |
| 1.0953 | 0.3472 | 25 | 1.2493 |
| 0.786 | 0.4167 | 30 | 1.2503 |
| 1.2206 | 0.4861 | 35 | 1.2416 |
| 1.1274 | 0.5556 | 40 | 1.2249 |
| 0.9839 | 0.625 | 45 | 1.1276 |
| 1.2363 | 0.6944 | 50 | 0.9029 |
| 1.1686 | 0.7639 | 55 | 0.8469 |
| 1.1577 | 0.8333 | 60 | 0.8087 |
| 0.6341 | 0.9028 | 65 | 0.7726 |
| 1.2096 | 0.9722 | 70 | 0.7688 |
| 0.3772 | 1.0417 | 75 | 0.7799 |
| 0.7109 | 1.1111 | 80 | 0.7550 |
| 0.5853 | 1.1806 | 85 | 0.7486 |
| 0.3914 | 1.25 | 90 | 0.7481 |
| 0.2028 | 1.3194 | 95 | 0.6593 |
| 0.4332 | 1.3889 | 100 | 0.6028 |
| 0.6432 | 1.4583 | 105 | 0.5718 |
| 0.534 | 1.5278 | 110 | 0.5591 |
| 0.5914 | 1.5972 | 115 | 0.5523 |
| 0.3437 | 1.6667 | 120 | 0.5474 |
| 0.4905 | 1.7361 | 125 | 0.5293 |
| 0.5633 | 1.8056 | 130 | 0.5293 |
| 0.2607 | 1.875 | 135 | 0.5446 |
| 0.745 | 1.9444 | 140 | 0.5545 |
### Framework versions
- Transformers 4.48.3
- Pytorch 2.7.0a0+7c8ec84dab.nv25.03
- Datasets 3.5.0
- Tokenizers 0.21.1