# BERT_V8_sp10_lw40_ex100_lo100_k10_k10_fold0
This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) on an unspecified dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the list):
- Loss: 0.8576
- Qwk (quadratic weighted kappa): 0.2720
- Mse: 0.8576
- Rmse: 0.9261
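The evaluation loss equals the MSE, which indicates a regression objective. Below is a minimal metric-computation sketch, assuming continuous predictions are rounded to the nearest integer label for QWK; the exact binning used in training is not documented.

```python
# Minimal sketch: QWK, MSE, and RMSE from regression outputs.
# Assumption: continuous predictions are rounded to the nearest integer
# label for QWK; the actual binning used in training is not documented.
import numpy as np
from sklearn.metrics import cohen_kappa_score, mean_squared_error

def compute_metrics(predictions: np.ndarray, labels: np.ndarray) -> dict:
    mse = mean_squared_error(labels, predictions)
    qwk = cohen_kappa_score(
        np.rint(labels).astype(int),
        np.rint(predictions).astype(int),
        weights="quadratic",  # quadratic weighting penalizes larger disagreements more
    )
    return {"qwk": qwk, "mse": mse, "rmse": float(np.sqrt(mse))}
```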
## Model description
More information needed
## Intended uses & limitations
More information needed
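No usage details are provided. The sketch below shows one way to load the checkpoint for inference; it assumes a single-output regression head (consistent with the MSE/RMSE metrics above), which the card does not confirm.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: the checkpoint has a single-output regression head
# (num_labels=1), consistent with the MSE/RMSE metrics reported above.
repo_id = "genki10/BERT_V8_sp10_lw40_ex100_lo100_k10_k10_fold0"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer("Example input text.", return_tensors="pt", truncation=True)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(score)
```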
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (an equivalent `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 100
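A minimal sketch of an equivalent Hugging Face `TrainingArguments` configuration, assuming single-device training (so the per-device batch size equals the listed batch size) and one evaluation per epoch, which matches the results table below. The table ends at epoch 42 of the configured 100, which suggests early stopping, but no early-stopping settings are documented, so none are included here.

```python
from transformers import TrainingArguments

# Values mirror the hyperparameters listed above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=2e-05,
    per_device_train_batch_size=64,  # assumes single-device training
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    eval_strategy="epoch",  # assumption: the table reports one evaluation per epoch
)
```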
### Training results
| Training Loss | Epoch | Step | Validation Loss | Qwk | Mse | Rmse |
|---|---|---|---|---|---|---|
| No log | 1.0 | 6 | 6.5736 | 0.0 | 6.5736 | 2.5639 |
| No log | 2.0 | 12 | 3.9041 | 0.0039 | 3.9041 | 1.9759 |
| No log | 3.0 | 18 | 1.7330 | 0.0316 | 1.7330 | 1.3164 |
| No log | 4.0 | 24 | 0.9136 | 0.0316 | 0.9136 | 0.9558 |
| No log | 5.0 | 30 | 0.7790 | 0.2771 | 0.7790 | 0.8826 |
| No log | 6.0 | 36 | 0.8344 | 0.2904 | 0.8344 | 0.9134 |
| No log | 7.0 | 42 | 0.6693 | 0.3340 | 0.6693 | 0.8181 |
| No log | 8.0 | 48 | 0.6759 | 0.2887 | 0.6759 | 0.8221 |
| No log | 9.0 | 54 | 0.6126 | 0.3542 | 0.6126 | 0.7827 |
| No log | 10.0 | 60 | 0.6793 | 0.3193 | 0.6793 | 0.8242 |
| No log | 11.0 | 66 | 0.7651 | 0.3253 | 0.7651 | 0.8747 |
| No log | 12.0 | 72 | 0.6896 | 0.4700 | 0.6896 | 0.8304 |
| No log | 13.0 | 78 | 0.7273 | 0.4175 | 0.7273 | 0.8528 |
| No log | 14.0 | 84 | 0.6511 | 0.4605 | 0.6511 | 0.8069 |
| No log | 15.0 | 90 | 0.7819 | 0.3593 | 0.7819 | 0.8843 |
| No log | 16.0 | 96 | 0.8854 | 0.3024 | 0.8854 | 0.9410 |
| No log | 17.0 | 102 | 0.8160 | 0.3395 | 0.8160 | 0.9033 |
| No log | 18.0 | 108 | 0.7487 | 0.3713 | 0.7487 | 0.8652 |
| No log | 19.0 | 114 | 0.8344 | 0.3355 | 0.8344 | 0.9135 |
| No log | 20.0 | 120 | 0.8314 | 0.2771 | 0.8314 | 0.9118 |
| No log | 21.0 | 126 | 0.7480 | 0.3727 | 0.7480 | 0.8649 |
| No log | 22.0 | 132 | 0.9940 | 0.2074 | 0.9940 | 0.9970 |
| No log | 23.0 | 138 | 0.7924 | 0.3604 | 0.7924 | 0.8902 |
| No log | 24.0 | 144 | 0.9166 | 0.2703 | 0.9166 | 0.9574 |
| No log | 25.0 | 150 | 0.7149 | 0.3915 | 0.7149 | 0.8455 |
| No log | 26.0 | 156 | 0.9768 | 0.2164 | 0.9768 | 0.9884 |
| No log | 27.0 | 162 | 0.7593 | 0.3932 | 0.7593 | 0.8714 |
| No log | 28.0 | 168 | 0.9222 | 0.2397 | 0.9222 | 0.9603 |
| No log | 29.0 | 174 | 0.9315 | 0.2310 | 0.9315 | 0.9652 |
| No log | 30.0 | 180 | 0.8535 | 0.2989 | 0.8535 | 0.9238 |
| No log | 31.0 | 186 | 0.7939 | 0.3224 | 0.7939 | 0.8910 |
| No log | 32.0 | 192 | 0.9193 | 0.2771 | 0.9193 | 0.9588 |
| No log | 33.0 | 198 | 0.7183 | 0.3959 | 0.7183 | 0.8476 |
| No log | 34.0 | 204 | 0.8705 | 0.2184 | 0.8705 | 0.9330 |
| No log | 35.0 | 210 | 0.9986 | 0.1259 | 0.9986 | 0.9993 |
| No log | 36.0 | 216 | 0.7959 | 0.2699 | 0.7959 | 0.8921 |
| No log | 37.0 | 222 | 0.6716 | 0.4293 | 0.6716 | 0.8195 |
| No log | 38.0 | 228 | 0.7933 | 0.3024 | 0.7933 | 0.8906 |
| No log | 39.0 | 234 | 0.8228 | 0.2582 | 0.8228 | 0.9071 |
| No log | 40.0 | 240 | 0.8497 | 0.2589 | 0.8497 | 0.9218 |
| No log | 41.0 | 246 | 0.7943 | 0.3045 | 0.7943 | 0.8912 |
| No log | 42.0 | 252 | 0.8576 | 0.2720 | 0.8576 | 0.9261 |
### Framework versions
- Transformers 4.51.1
- Pytorch 2.5.1+cu124
- Datasets 3.5.0
- Tokenizers 0.21.0