koelectra-small-clue-mrc

This model is a fine-tuned version of monologg/koelectra-small-discriminator on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.8581
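
The "-mrc" suffix and the KoELECTRA base suggest an extractive question-answering (machine reading comprehension) checkpoint, though the card itself does not document the task. A minimal usage sketch under that assumption:

```python
# Minimal usage sketch. Assumption: the checkpoint carries an extractive
# question-answering head, as the "-mrc" name suggests; the card does not
# confirm the task or the dataset.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="holic25/koelectra-small-clue-mrc",
)

result = qa(
    question="대한민국의 수도는 어디인가?",   # "What is the capital of South Korea?"
    context="대한민국의 수도는 서울이다.",    # "The capital of South Korea is Seoul."
)
print(result["answer"], result["score"])
```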

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
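
For reference, these settings map onto Hugging Face TrainingArguments as sketched below; only the listed values come from the card, while the Trainer setup and the output_dir name are assumptions.

```python
# Hedged sketch: the hyperparameters above expressed as TrainingArguments.
# output_dir is a placeholder; everything else mirrors the bullet list.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="koelectra-small-clue-mrc",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",              # AdamW, betas=(0.9, 0.999), eps=1e-8 (defaults)
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```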

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 10   | 4.6897          |
| No log        | 2.0   | 20   | 4.4693          |
| No log        | 3.0   | 30   | 4.2655          |
| No log        | 4.0   | 40   | 4.0839          |
| No log        | 5.0   | 50   | 3.9436          |
| No log        | 6.0   | 60   | 3.8337          |
| No log        | 7.0   | 70   | 3.7263          |
| No log        | 8.0   | 80   | 3.6341          |
| No log        | 9.0   | 90   | 3.5482          |
| No log        | 10.0  | 100  | 3.4676          |
| No log        | 11.0  | 110  | 3.4204          |
| No log        | 12.0  | 120  | 3.3777          |
| No log        | 13.0  | 130  | 3.3747          |
| No log        | 14.0  | 140  | 3.3718          |
| No log        | 15.0  | 150  | 3.3744          |
| No log        | 16.0  | 160  | 3.3705          |
| No log        | 17.0  | 170  | 3.4027          |
| No log        | 18.0  | 180  | 3.4602          |
| No log        | 19.0  | 190  | 3.4393          |
| No log        | 20.0  | 200  | 3.4334          |
| No log        | 21.0  | 210  | 3.4517          |
| No log        | 22.0  | 220  | 3.5102          |
| No log        | 23.0  | 230  | 3.4325          |
| No log        | 24.0  | 240  | 3.5081          |
| No log        | 25.0  | 250  | 3.6420          |
| No log        | 26.0  | 260  | 3.4309          |
| No log        | 27.0  | 270  | 3.6206          |
| No log        | 28.0  | 280  | 3.5686          |
| No log        | 29.0  | 290  | 3.5590          |
| No log        | 30.0  | 300  | 3.6886          |
| No log        | 31.0  | 310  | 3.6212          |
| No log        | 32.0  | 320  | 3.6698          |
| No log        | 33.0  | 330  | 3.6788          |
| No log        | 34.0  | 340  | 3.7010          |
| No log        | 35.0  | 350  | 3.6021          |
| No log        | 36.0  | 360  | 3.7157          |
| No log        | 37.0  | 370  | 3.7417          |
| No log        | 38.0  | 380  | 3.7430          |
| No log        | 39.0  | 390  | 3.7674          |
| No log        | 40.0  | 400  | 3.7571          |
| No log        | 41.0  | 410  | 3.7722          |
| No log        | 42.0  | 420  | 3.7632          |
| No log        | 43.0  | 430  | 3.7932          |
| No log        | 44.0  | 440  | 3.7984          |
| No log        | 45.0  | 450  | 3.7543          |
| No log        | 46.0  | 460  | 3.7958          |
| No log        | 47.0  | 470  | 3.7896          |
| No log        | 48.0  | 480  | 3.8006          |
| No log        | 49.0  | 490  | 3.7804          |
| 2.5056        | 50.0  | 500  | 3.8028          |
| 2.5056        | 51.0  | 510  | 3.7837          |
| 2.5056        | 52.0  | 520  | 3.8119          |
| 2.5056        | 53.0  | 530  | 3.7886          |
| 2.5056        | 54.0  | 540  | 3.8160          |
| 2.5056        | 55.0  | 550  | 3.8105          |
| 2.5056        | 56.0  | 560  | 3.8317          |
| 2.5056        | 57.0  | 570  | 3.8297          |
| 2.5056        | 58.0  | 580  | 3.8404          |
| 2.5056        | 59.0  | 590  | 3.8278          |
| 2.5056        | 60.0  | 600  | 3.8303          |
| 2.5056        | 61.0  | 610  | 3.8366          |
| 2.5056        | 62.0  | 620  | 3.7981          |
| 2.5056        | 63.0  | 630  | 3.8200          |
| 2.5056        | 64.0  | 640  | 3.8334          |
| 2.5056        | 65.0  | 650  | 3.8134          |
| 2.5056        | 66.0  | 660  | 3.8372          |
| 2.5056        | 67.0  | 670  | 3.8387          |
| 2.5056        | 68.0  | 680  | 3.8292          |
| 2.5056        | 69.0  | 690  | 3.8438          |
| 2.5056        | 70.0  | 700  | 3.8319          |
| 2.5056        | 71.0  | 710  | 3.8410          |
| 2.5056        | 72.0  | 720  | 3.8494          |
| 2.5056        | 73.0  | 730  | 3.8498          |
| 2.5056        | 74.0  | 740  | 3.8341          |
| 2.5056        | 75.0  | 750  | 3.8738          |
| 2.5056        | 76.0  | 760  | 3.8724          |
| 2.5056        | 77.0  | 770  | 3.8130          |
| 2.5056        | 78.0  | 780  | 3.8290          |
| 2.5056        | 79.0  | 790  | 3.8730          |
| 2.5056        | 80.0  | 800  | 3.8412          |
| 2.5056        | 81.0  | 810  | 3.8470          |
| 2.5056        | 82.0  | 820  | 3.8599          |
| 2.5056        | 83.0  | 830  | 3.8423          |
| 2.5056        | 84.0  | 840  | 3.8392          |
| 2.5056        | 85.0  | 850  | 3.8661          |
| 2.5056        | 86.0  | 860  | 3.8680          |
| 2.5056        | 87.0  | 870  | 3.8677          |
| 2.5056        | 88.0  | 880  | 3.8614          |
| 2.5056        | 89.0  | 890  | 3.8483          |
| 2.5056        | 90.0  | 900  | 3.8629          |
| 2.5056        | 91.0  | 910  | 3.8731          |
| 2.5056        | 92.0  | 920  | 3.8622          |
| 2.5056        | 93.0  | 930  | 3.8396          |
| 2.5056        | 94.0  | 940  | 3.8424          |
| 2.5056        | 95.0  | 950  | 3.8392          |
| 2.5056        | 96.0  | 960  | 3.8437          |
| 2.5056        | 97.0  | 970  | 3.8513          |
| 2.5056        | 98.0  | 980  | 3.8566          |
| 2.5056        | 99.0  | 990  | 3.8581          |
| 1.3292        | 100.0 | 1000 | 3.8581          |
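
The validation loss reaches its minimum (3.3705 at epoch 16) and trends upward for the rest of the 100 epochs, so the final checkpoint sits well past the best one. A hedged sketch of how early stopping could be wired into a Trainer run like this; the callback and arguments are standard transformers API, but their use here is a suggestion, not something the card records:

```python
# Hedged sketch: stop training once validation loss stops improving, rather
# than running all 100 epochs. The Trainer wiring is assumed, not from the card.
from transformers import (
    AutoModelForQuestionAnswering,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

model = AutoModelForQuestionAnswering.from_pretrained(
    "monologg/koelectra-small-discriminator"
)

args = TrainingArguments(
    output_dir="koelectra-small-clue-mrc",
    eval_strategy="epoch",             # evaluate once per epoch, as in the table
    save_strategy="epoch",
    load_best_model_at_end=True,       # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    num_train_epochs=100,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,       # hypothetical: the card does not name the dataset
    eval_dataset=eval_dataset,         # hypothetical
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```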

Framework versions

  • Transformers 4.53.2
  • Pytorch 2.6.0+cu124
  • Datasets 4.0.0
  • Tokenizers 0.21.2