IceLlama3.2-3B-01

This model is a fine-tuned version of meta-llama/Llama-3.2-3B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8418

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • num_epochs: 3
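
The hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a sketch only: the output directory name is an assumption, and the model/dataset loading and `Trainer` setup from the original training script are not part of the card.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters listed above.
# output_dir is an assumed name, not taken from the card.
training_args = TrainingArguments(
    output_dir="IceLlama3.2-3B-01",      # assumption
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",                 # AdamW; betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    num_train_epochs=3,
)
```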

Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 0.2704        | 0.0707 | 10000  | 2.0553          |
| 0.2647        | 0.1414 | 20000  | 1.9754          |
| 0.2566        | 0.2121 | 30000  | 1.9398          |
| 0.2558        | 0.2828 | 40000  | 1.9169          |
| 0.251         | 0.3535 | 50000  | 1.9004          |
| 0.2479        | 0.4242 | 60000  | 1.8903          |
| 0.2507        | 0.4949 | 70000  | 1.8823          |
| 0.2538        | 0.5656 | 80000  | 1.8753          |
| 0.2468        | 0.6363 | 90000  | 1.8692          |
| 0.2504        | 0.7070 | 100000 | 1.8660          |
| 0.2466        | 0.7777 | 110000 | 1.8614          |
| 0.2489        | 0.8484 | 120000 | 1.8587          |
| 0.2489        | 0.9191 | 130000 | 1.8552          |
| 0.2455        | 0.9898 | 140000 | 1.8529          |
| 0.2457        | 1.0606 | 150000 | 1.8512          |
| 0.2417        | 1.1313 | 160000 | 1.8493          |
| 0.2441        | 1.2020 | 170000 | 1.8479          |
| 0.2437        | 1.2727 | 180000 | 1.8472          |
| 0.2408        | 1.3434 | 190000 | 1.8460          |
| 0.2413        | 1.4141 | 200000 | 1.8450          |
| 0.2409        | 1.4848 | 210000 | 1.8443          |
| 0.242         | 1.5555 | 220000 | 1.8439          |
| 0.2435        | 1.6262 | 230000 | 1.8437          |
| 0.2446        | 1.6969 | 240000 | 1.8428          |
| 0.2469        | 1.7676 | 250000 | 1.8427          |
| 0.2396        | 1.8383 | 260000 | 1.8424          |
| 0.2433        | 1.9090 | 270000 | 1.8422          |
| 0.2475        | 1.9797 | 280000 | 1.8422          |
| 0.2457        | 2.0504 | 290000 | 1.8420          |
| 0.2412        | 2.1211 | 300000 | 1.8419          |
| 0.2461        | 2.1918 | 310000 | 1.8418          |
| 0.2444        | 2.2625 | 320000 | 1.8417          |
| 0.2436        | 2.3332 | 330000 | 1.8418          |
| 0.2456        | 2.4039 | 340000 | 1.8418          |
| 0.2411        | 2.4746 | 350000 | 1.8417          |
| 0.2459        | 2.5453 | 360000 | 1.8417          |
| 0.2425        | 2.6160 | 370000 | 1.8417          |
| 0.2464        | 2.6867 | 380000 | 1.8418          |
| 0.2431        | 2.7574 | 390000 | 1.8417          |
| 0.25          | 2.8281 | 400000 | 1.8418          |
| 0.2439        | 2.8988 | 410000 | 1.8418          |
| 0.2425        | 2.9695 | 420000 | 1.8418          |

Framework versions

  • Transformers 4.53.1
  • Pytorch 2.3.1+cu118
  • Datasets 3.6.0
  • Tokenizers 0.21.2
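
With the framework versions above, the model can be loaded for generation as in the following sketch. The repository id is taken from this card's model tree; the prompt, dtype choice, and generation settings are assumptions for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal generation sketch. Repo id from the model card; prompt and
# generation settings are illustrative assumptions.
model_id = "thorirhrafn/IceLlama3.2-3B-01"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Halló, ", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```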