koelectra-small-clue-mrc

This model is a fine-tuned version of monologg/koelectra-small-discriminator on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.9825

Model description

More information needed

Intended uses & limitations

More information needed
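
The task is not documented above, but the model name ("clue-mrc") and the Korean ELECTRA base checkpoint suggest extractive question answering (machine reading comprehension) on Korean text. Below is a minimal usage sketch under that assumption; the question/context strings are placeholders, and you should verify that the checkpoint actually carries a question-answering head before relying on it.

```python
# Minimal usage sketch. Assumption: the checkpoint has a span-extraction
# (question-answering) head, as the "mrc" suffix suggests.
from transformers import pipeline

qa = pipeline("question-answering", model="sungkwan2/koelectra-small-clue-mrc")

# Placeholder Korean question/context; replace with your own data.
result = qa(
    question="대한민국의 수도는 어디인가요?",
    context="대한민국의 수도는 서울이며, 최대 도시이기도 하다.",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```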

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a replication sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
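
A minimal sketch of how these settings map onto transformers.TrainingArguments is shown below, assuming an extractive-QA fine-tuning setup. The dataset, preprocessing, and data collator are not documented, so the dataset variables are placeholders; this is not the exact training script.

```python
# Hedged replication sketch: maps the hyperparameters listed above onto
# transformers.TrainingArguments. Dataset and preprocessing are unknown,
# so the dataset variables below are placeholders.
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "monologg/koelectra-small-discriminator"
tokenizer = AutoTokenizer.from_pretrained(base)
# Assumption: an extractive-QA (MRC) head; its weights are newly initialized.
model = AutoModelForQuestionAnswering.from_pretrained(base)

args = TrainingArguments(
    output_dir="koelectra-small-clue-mrc",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",        # AdamW with default betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=100,
    eval_strategy="epoch",      # the log below reports validation loss every epoch
)

train_dataset = None  # placeholder: tokenized MRC training split (dataset unknown)
eval_dataset = None   # placeholder: tokenized MRC evaluation split

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
# trainer.train()  # uncomment once real datasets are supplied
```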

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| No log | 1.0 | 10 | 5.1906 |
| No log | 2.0 | 20 | 5.1192 |
| No log | 3.0 | 30 | 5.0688 |
| No log | 4.0 | 40 | 5.0107 |
| No log | 5.0 | 50 | 4.9480 |
| No log | 6.0 | 60 | 4.9031 |
| No log | 7.0 | 70 | 4.8960 |
| No log | 8.0 | 80 | 4.8432 |
| No log | 9.0 | 90 | 4.8552 |
| No log | 10.0 | 100 | 4.8741 |
| No log | 11.0 | 110 | 4.8984 |
| No log | 12.0 | 120 | 4.9298 |
| No log | 13.0 | 130 | 5.0293 |
| No log | 14.0 | 140 | 5.1093 |
| No log | 15.0 | 150 | 5.1975 |
| No log | 16.0 | 160 | 5.2407 |
| No log | 17.0 | 170 | 5.2812 |
| No log | 18.0 | 180 | 5.3304 |
| No log | 19.0 | 190 | 5.3326 |
| No log | 20.0 | 200 | 5.3504 |
| No log | 21.0 | 210 | 5.4283 |
| No log | 22.0 | 220 | 5.3706 |
| No log | 23.0 | 230 | 5.4809 |
| No log | 24.0 | 240 | 5.4175 |
| No log | 25.0 | 250 | 5.5153 |
| No log | 26.0 | 260 | 5.5654 |
| No log | 27.0 | 270 | 5.5992 |
| No log | 28.0 | 280 | 5.5922 |
| No log | 29.0 | 290 | 5.5857 |
| No log | 30.0 | 300 | 5.5917 |
| No log | 31.0 | 310 | 5.6114 |
| No log | 32.0 | 320 | 5.6040 |
| No log | 33.0 | 330 | 5.6078 |
| No log | 34.0 | 340 | 5.6267 |
| No log | 35.0 | 350 | 5.6104 |
| No log | 36.0 | 360 | 5.6511 |
| No log | 37.0 | 370 | 5.6727 |
| No log | 38.0 | 380 | 5.6628 |
| No log | 39.0 | 390 | 5.6860 |
| No log | 40.0 | 400 | 5.6823 |
| No log | 41.0 | 410 | 5.6840 |
| No log | 42.0 | 420 | 5.6866 |
| No log | 43.0 | 430 | 5.7133 |
| No log | 44.0 | 440 | 5.7142 |
| No log | 45.0 | 450 | 5.7158 |
| No log | 46.0 | 460 | 5.7445 |
| No log | 47.0 | 470 | 5.7661 |
| No log | 48.0 | 480 | 5.8024 |
| No log | 49.0 | 490 | 5.7891 |
| 2.8093 | 50.0 | 500 | 5.7941 |
| 2.8093 | 51.0 | 510 | 5.7995 |
| 2.8093 | 52.0 | 520 | 5.8313 |
| 2.8093 | 53.0 | 530 | 5.8219 |
| 2.8093 | 54.0 | 540 | 5.8150 |
| 2.8093 | 55.0 | 550 | 5.8358 |
| 2.8093 | 56.0 | 560 | 5.8556 |
| 2.8093 | 57.0 | 570 | 5.8572 |
| 2.8093 | 58.0 | 580 | 5.8155 |
| 2.8093 | 59.0 | 590 | 5.8147 |
| 2.8093 | 60.0 | 600 | 5.8284 |
| 2.8093 | 61.0 | 610 | 5.8620 |
| 2.8093 | 62.0 | 620 | 5.8671 |
| 2.8093 | 63.0 | 630 | 5.8614 |
| 2.8093 | 64.0 | 640 | 5.8766 |
| 2.8093 | 65.0 | 650 | 5.8640 |
| 2.8093 | 66.0 | 660 | 5.8477 |
| 2.8093 | 67.0 | 670 | 5.8822 |
| 2.8093 | 68.0 | 680 | 5.9165 |
| 2.8093 | 69.0 | 690 | 5.9093 |
| 2.8093 | 70.0 | 700 | 5.9004 |
| 2.8093 | 71.0 | 710 | 5.9206 |
| 2.8093 | 72.0 | 720 | 5.8907 |
| 2.8093 | 73.0 | 730 | 5.8735 |
| 2.8093 | 74.0 | 740 | 5.9002 |
| 2.8093 | 75.0 | 750 | 5.9071 |
| 2.8093 | 76.0 | 760 | 5.9089 |
| 2.8093 | 77.0 | 770 | 5.9141 |
| 2.8093 | 78.0 | 780 | 5.9224 |
| 2.8093 | 79.0 | 790 | 5.9457 |
| 2.8093 | 80.0 | 800 | 5.9314 |
| 2.8093 | 81.0 | 810 | 5.9446 |
| 2.8093 | 82.0 | 820 | 5.9498 |
| 2.8093 | 83.0 | 830 | 5.9368 |
| 2.8093 | 84.0 | 840 | 5.9480 |
| 2.8093 | 85.0 | 850 | 5.9376 |
| 2.8093 | 86.0 | 860 | 5.9519 |
| 2.8093 | 87.0 | 870 | 5.9569 |
| 2.8093 | 88.0 | 880 | 5.9599 |
| 2.8093 | 89.0 | 890 | 5.9637 |
| 2.8093 | 90.0 | 900 | 5.9699 |
| 2.8093 | 91.0 | 910 | 5.9839 |
| 2.8093 | 92.0 | 920 | 5.9828 |
| 2.8093 | 93.0 | 930 | 5.9779 |
| 2.8093 | 94.0 | 940 | 5.9812 |
| 2.8093 | 95.0 | 950 | 5.9796 |
| 2.8093 | 96.0 | 960 | 5.9805 |
| 2.8093 | 97.0 | 970 | 5.9829 |
| 2.8093 | 98.0 | 980 | 5.9834 |
| 2.8093 | 99.0 | 990 | 5.9827 |
| 1.3469 | 100.0 | 1000 | 5.9825 |
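
The validation loss bottoms out at epoch 8 (4.8432) and rises for the remaining epochs, so the final checkpoint is likely overfit relative to the best intermediate one. If the run were repeated, early stopping and best-checkpoint reloading would address this; the sketch below (not part of the original training setup) shows one way to configure it with the Trainer API.

```python
# Hedged sketch only: early stopping was NOT used for this checkpoint. It shows
# how the TrainingArguments sketched earlier could keep the best checkpoint and
# stop once validation loss stops improving.
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="koelectra-small-clue-mrc",
    num_train_epochs=100,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,        # reload the checkpoint with the lowest eval loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Pass the callback when building the Trainer, e.g.:
# trainer = Trainer(..., args=args, callbacks=[EarlyStoppingCallback(early_stopping_patience=5)])
```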

Framework versions

  • Transformers 4.53.2
  • Pytorch 2.6.0+cu124
  • Datasets 4.0.0
  • Tokenizers 0.21.2

Model size

  • 13.7M parameters (Safetensors, F32)