# llama-3-full-data-changed
This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.0986
- Accuracy: 0.5904
- F1: 0.5855
- Precision: 0.5975
- Recall: 0.5904
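Recall matches accuracy here and at every checkpoint in the results table below, which is consistent with weighted-average metrics on a classification task. As a minimal sketch (not the author's actual evaluation code), these numbers could be reproduced with scikit-learn roughly as follows, assuming weighted averaging and hypothetical `y_true`/`y_pred` label arrays from the evaluation set:

```python
# Sketch of reproducing the reported metrics, assuming weighted averaging.
# y_true / y_pred are hypothetical arrays of gold and predicted class ids.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(y_true, y_pred):
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0
    )
    return {"accuracy": accuracy, "f1": f1, "precision": precision, "recall": recall}

# Example usage:
# compute_metrics([0, 1, 2, 1], [0, 2, 2, 1])
```

Note that weighted-average recall reduces to plain accuracy (each class's recall is weighted by its support), which would explain the identical accuracy and recall columns.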
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 2
- mixed_precision_training: Native AMP
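The training script itself is not published. As a hedged sketch, the hyperparameters above map onto a Hugging Face `TrainingArguments` configuration roughly as follows; the `output_dir` and the 2000-step evaluation interval are inferred from the results table below, not stated explicitly:

```python
# Illustrative TrainingArguments mirroring the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3-full-data-changed",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    num_train_epochs=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=2000,                 # matches the 2000-step interval in the table
    logging_steps=2000,
)
```

The Adam betas (0.9, 0.999) and epsilon (1e-08) listed above are the Trainer defaults, so they do not need to be set explicitly.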
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|---|
| 0.4985 | 0.1935 | 2000 | 1.1392 | 0.5867 | 0.5795 | 0.5962 | 0.5867 |
| 0.4899 | 0.3870 | 4000 | 1.0967 | 0.5910 | 0.5827 | 0.6023 | 0.5910 |
| 0.4878 | 0.5805 | 6000 | 1.1262 | 0.5903 | 0.5826 | 0.6007 | 0.5903 |
| 0.478 | 0.7740 | 8000 | 1.0881 | 0.5806 | 0.5784 | 0.5840 | 0.5806 |
| 0.471 | 0.9675 | 10000 | 1.1098 | 0.5786 | 0.5764 | 0.5820 | 0.5786 |
| 0.4632 | 1.1610 | 12000 | 1.1000 | 0.5748 | 0.5743 | 0.5760 | 0.5748 |
| 0.4561 | 1.3546 | 14000 | 1.1171 | 0.5868 | 0.5823 | 0.5933 | 0.5868 |
| 0.4585 | 1.5481 | 16000 | 1.1054 | 0.5914 | 0.5836 | 0.6020 | 0.5914 |
| 0.4561 | 1.7416 | 18000 | 1.0993 | 0.5895 | 0.5848 | 0.5962 | 0.5895 |
| 0.4564 | 1.9351 | 20000 | 1.0986 | 0.5904 | 0.5855 | 0.5975 | 0.5904 |
### Framework versions
- PEFT 0.10.0
- Transformers 4.40.0
- PyTorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.19.1
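No usage example is provided with the card. The sketch below shows one plausible way to load this model for inference, assuming the repo hosts a PEFT adapter (e.g. LoRA) with a sequence-classification head on top of meta-llama/Meta-Llama-3-8B; the label set, prompt format, and availability of tokenizer files in the adapter repo are assumptions, so treat this as illustrative only:

```python
# Hypothetical inference sketch, assuming a PEFT sequence-classification adapter.
import torch
from peft import AutoPeftModelForSequenceClassification
from transformers import AutoTokenizer

adapter_id = "tanjumajerin/llama-3-full-data-changed"
# Base model meta-llama/Meta-Llama-3-8B is gated and requires access approval.

# If the adapter repo does not ship tokenizer files, load the tokenizer
# from the base model instead.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

model = AutoPeftModelForSequenceClassification.from_pretrained(
    adapter_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

inputs = tokenizer("Some input text.", return_tensors="pt").to(model.device)
with torch.no_grad():
    predicted_class = model(**inputs).logits.argmax(dim=-1).item()
print(predicted_class)  # integer class id; label names are not documented
```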
## Model tree for tanjumajerin/llama-3-full-data-changed
- Base model: meta-llama/Meta-Llama-3-8B