flan-t5-base-gen-chat_base-10
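This model is a fine-tuned version of google/flan-t5-base. The training dataset is not specified (the auto-generated card records it as None). At the final evaluation step (step 22400), it achieves the following results on the evaluation set: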
- Loss: 0.4467
- Rouge 1: 24.1929
- Rouge 2: 15.3985
- Rouge L: 23.5324
- Avg Len: 12.7466
- Bertscore Prec: 0.8809
- Bertscore Rec: 0.8753
- Bertscore F1: 0.8778
Model description
More information needed
Intended uses & limitations
More information needed
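Until the intended uses are documented, the following is a minimal sketch of how the checkpoint might be loaded for text generation with the Transformers library. It assumes the repository name shown in the model tree entry of this card, and the helper name `generate_reply`, the example prompt, and the `max_new_tokens=32` default are illustrative choices, not settings confirmed by the author. The expected prompt format is not documented, so outputs for arbitrary prompts may differ from the evaluation results.

```python
# Repository name as listed on this card.
MODEL_ID = "greatakela/flan-t5-base-gen-chat_base-10"


def generate_reply(prompt: str, max_new_tokens: int = 32) -> str:
    """Generate one reply for `prompt` with the fine-tuned checkpoint.

    Hypothetical helper for illustration; the prompt format used during
    fine-tuning is not documented on the card.
    """
    # Imported lazily so the helper can be defined without loading the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt and generate with the default decoding strategy.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_reply("How are you today?")` downloads the weights on first use. The reported average generated length is about 13, so a small generation budget is likely sufficient, though this is an inference from the metrics and not a documented setting.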
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (PyTorch implementation, OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
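To make the learning-rate settings concrete, here is a small pure-Python sketch of a linear schedule with warmup using the values listed above. The total step count is an assumption: the Epoch and Step columns of the training results imply roughly 2258 optimizer steps per epoch, so 22580 steps over 10 epochs is used here as an estimate. The function mirrors the usual linear warmup followed by linear decay to zero; it is not taken from the author's training script.

```python
import math

PEAK_LR = 5e-05          # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1       # lr_scheduler_warmup_ratio
STEPS_PER_EPOCH = 2258   # assumption: estimated from the Epoch/Step columns
NUM_EPOCHS = 10          # num_epochs

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        # Ramp up from 0 to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Decay linearly from the peak to 0 at the final step.
    remaining = TOTAL_STEPS - step
    return PEAK_LR * max(0.0, remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))


if __name__ == "__main__":
    for step in (0, 1000, WARMUP_STEPS, 10000, 22400):
        print(step, linear_lr(step))
```

Under these assumptions the warmup covers approximately the first epoch, and the learning rate is close to zero by the last logged step, which is consistent with the validation loss flattening out near the end of training.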
Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge 1 | Rouge 2 | Rouge L | Avg Len | Bertscore Prec | Bertscore Rec | Bertscore F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3.9688 | 0.0886 | 200 | 3.5322 | 5.9122 | 0.5798 | 5.5275 | 9.7881 | 0.8463 | 0.8392 | 0.8423 |
| 3.6824 | 0.1771 | 400 | 3.3814 | 6.1152 | 0.3767 | 5.5539 | 12.7129 | 0.8499 | 0.8468 | 0.8479 |
| 3.5561 | 0.2657 | 600 | 3.2920 | 6.2105 | 0.3485 | 5.6438 | 13.4937 | 0.8523 | 0.8487 | 0.8501 |
| 3.486 | 0.3543 | 800 | 3.2058 | 6.3465 | 0.4184 | 5.7973 | 14.3754 | 0.8535 | 0.8497 | 0.8512 |
| 3.3976 | 0.4429 | 1000 | 3.1337 | 6.2243 | 0.414 | 5.7298 | 14.8228 | 0.8544 | 0.85 | 0.8519 |
| 3.3255 | 0.5314 | 1200 | 3.0607 | 6.329 | 0.3778 | 5.9526 | 15.949 | 0.8591 | 0.8506 | 0.8546 |
| 3.2571 | 0.6200 | 1400 | 2.9876 | 6.1155 | 0.4345 | 5.8176 | 16.7797 | 0.8583 | 0.8505 | 0.8541 |
| 3.2099 | 0.7086 | 1600 | 2.9138 | 6.2444 | 0.5219 | 5.9243 | 16.622 | 0.8602 | 0.8518 | 0.8557 |
| 3.1218 | 0.7972 | 1800 | 2.8463 | 7.0885 | 0.6325 | 6.6268 | 15.511 | 0.8613 | 0.8531 | 0.8569 |
| 3.0519 | 0.8857 | 2000 | 2.7680 | 6.9581 | 0.6418 | 6.4719 | 15.4558 | 0.8612 | 0.8528 | 0.8567 |
| 2.9952 | 0.9743 | 2200 | 2.6964 | 6.4184 | 0.5555 | 6.0587 | 16.2513 | 0.8615 | 0.8513 | 0.8561 |
| 2.9122 | 1.0629 | 2400 | 2.6121 | 7.7542 | 0.7494 | 7.1916 | 15.6651 | 0.8596 | 0.8538 | 0.8564 |
| 2.8186 | 1.1515 | 2600 | 2.5330 | 8.0613 | 0.9297 | 7.5236 | 15.419 | 0.8603 | 0.8534 | 0.8566 |
| 2.7677 | 1.2400 | 2800 | 2.4607 | 7.9132 | 0.9312 | 7.3354 | 15.2203 | 0.8606 | 0.8535 | 0.8567 |
| 2.693 | 1.3286 | 3000 | 2.3973 | 8.3183 | 0.8438 | 7.6764 | 14.6951 | 0.8597 | 0.8539 | 0.8565 |
| 2.6307 | 1.4172 | 3200 | 2.3284 | 8.5698 | 0.9722 | 7.9813 | 14.7492 | 0.8631 | 0.8544 | 0.8584 |
| 2.5983 | 1.5058 | 3400 | 2.2664 | 7.993 | 0.9142 | 7.4419 | 13.7944 | 0.8637 | 0.8525 | 0.8577 |
| 2.562 | 1.5943 | 3600 | 2.2105 | 9.3753 | 1.3081 | 8.6725 | 12.8717 | 0.8633 | 0.8542 | 0.8584 |
| 2.5138 | 1.6829 | 3800 | 2.1486 | 9.1004 | 1.2136 | 8.3044 | 13.0662 | 0.8633 | 0.8541 | 0.8584 |
| 2.4489 | 1.7715 | 4000 | 2.0869 | 9.6227 | 1.4731 | 8.9298 | 13.3586 | 0.8623 | 0.8552 | 0.8584 |
| 2.4131 | 1.8601 | 4200 | 2.0376 | 9.8466 | 1.627 | 9.057 | 13.2366 | 0.861 | 0.855 | 0.8577 |
| 2.3756 | 1.9486 | 4400 | 1.9884 | 10.0342 | 1.7841 | 9.2826 | 12.7713 | 0.8625 | 0.8557 | 0.8588 |
| 2.301 | 2.0372 | 4600 | 1.9312 | 10.4934 | 1.7158 | 9.5836 | 13.4175 | 0.8608 | 0.8563 | 0.8582 |
| 2.2251 | 2.1258 | 4800 | 1.8776 | 10.2075 | 1.8632 | 9.3856 | 12.1667 | 0.8611 | 0.855 | 0.8577 |
| 2.2019 | 2.2143 | 5000 | 1.8374 | 10.7066 | 2.1214 | 9.8472 | 11.9201 | 0.8635 | 0.8559 | 0.8593 |
| 2.1614 | 2.3029 | 5200 | 1.7892 | 10.9383 | 2.0411 | 10.0264 | 12.215 | 0.8629 | 0.8562 | 0.8592 |
| 2.1052 | 2.3915 | 5400 | 1.7346 | 10.9833 | 2.1318 | 10.0323 | 12.7492 | 0.8618 | 0.8571 | 0.8591 |
| 2.1 | 2.4801 | 5600 | 1.6900 | 10.7328 | 2.2224 | 9.851 | 12.7482 | 0.8623 | 0.8563 | 0.859 |
| 2.0505 | 2.5686 | 5800 | 1.6426 | 11.5388 | 2.778 | 10.6766 | 12.6909 | 0.8637 | 0.8578 | 0.8604 |
| 2.0312 | 2.6572 | 6000 | 1.6005 | 12.1037 | 2.8293 | 11.0478 | 12.7923 | 0.864 | 0.8583 | 0.8608 |
| 1.9812 | 2.7458 | 6200 | 1.5591 | 11.4098 | 2.716 | 10.5515 | 11.5657 | 0.8633 | 0.8566 | 0.8596 |
| 1.9536 | 2.8344 | 6400 | 1.5258 | 12.3058 | 3.1725 | 11.3085 | 12.4963 | 0.8645 | 0.8584 | 0.8611 |
| 1.9247 | 2.9229 | 6600 | 1.4968 | 12.616 | 3.1686 | 11.6157 | 12.4853 | 0.8643 | 0.8585 | 0.861 |
| 1.8953 | 3.0115 | 6800 | 1.4401 | 12.4791 | 3.4918 | 11.5941 | 11.9516 | 0.8654 | 0.858 | 0.8613 |
| 1.8013 | 3.1001 | 7000 | 1.4081 | 12.2478 | 3.2588 | 11.3239 | 12.2829 | 0.8648 | 0.8578 | 0.8609 |
| 1.7944 | 3.1887 | 7200 | 1.3696 | 13.2274 | 4.0563 | 12.3305 | 11.8575 | 0.8671 | 0.8589 | 0.8626 |
| 1.7455 | 3.2772 | 7400 | 1.3380 | 13.3023 | 4.223 | 12.3943 | 11.7429 | 0.8672 | 0.8594 | 0.8629 |
| 1.7357 | 3.3658 | 7600 | 1.3095 | 12.9876 | 4.0003 | 12.1323 | 12.2713 | 0.8651 | 0.8592 | 0.8618 |
| 1.7085 | 3.4544 | 7800 | 1.2724 | 12.5145 | 3.7926 | 11.6837 | 12.0988 | 0.8664 | 0.8589 | 0.8623 |
| 1.6849 | 3.5430 | 8000 | 1.2421 | 13.7242 | 4.5574 | 12.7929 | 12.2739 | 0.8666 | 0.86 | 0.863 |
| 1.6697 | 3.6315 | 8200 | 1.2151 | 13.4541 | 4.4375 | 12.6551 | 12.1004 | 0.867 | 0.8597 | 0.863 |
| 1.6378 | 3.7201 | 8400 | 1.1852 | 13.9238 | 4.9652 | 12.9342 | 12.2487 | 0.8666 | 0.86 | 0.8629 |
| 1.6271 | 3.8087 | 8600 | 1.1541 | 13.9867 | 4.7587 | 12.974 | 12.7992 | 0.866 | 0.8607 | 0.863 |
| 1.6001 | 3.8973 | 8800 | 1.1140 | 14.6864 | 5.2988 | 13.6278 | 12.3286 | 0.8671 | 0.8612 | 0.8638 |
| 1.5819 | 3.9858 | 9000 | 1.0984 | 14.9938 | 5.6795 | 13.9942 | 12.1046 | 0.8691 | 0.8618 | 0.8651 |
| 1.5135 | 4.0744 | 9200 | 1.0698 | 14.5309 | 5.29 | 13.4662 | 13.0163 | 0.867 | 0.8617 | 0.864 |
| 1.4944 | 4.1630 | 9400 | 1.0490 | 14.9356 | 5.5801 | 13.936 | 12.48 | 0.8681 | 0.862 | 0.8647 |
| 1.5009 | 4.2516 | 9600 | 1.0305 | 14.7679 | 5.58 | 13.8111 | 12.5836 | 0.867 | 0.8615 | 0.8639 |
| 1.4563 | 4.3401 | 9800 | 0.9986 | 15.3408 | 5.9754 | 14.3541 | 12.3039 | 0.869 | 0.8619 | 0.8651 |
| 1.4193 | 4.4287 | 10000 | 0.9760 | 15.2089 | 6.0598 | 14.2366 | 12.7303 | 0.867 | 0.8619 | 0.8641 |
| 1.4096 | 4.5173 | 10200 | 0.9484 | 15.2806 | 5.9376 | 14.2978 | 12.3149 | 0.8682 | 0.8617 | 0.8646 |
| 1.3949 | 4.6058 | 10400 | 0.9292 | 16.1246 | 6.7061 | 15.1157 | 12.1141 | 0.8686 | 0.8621 | 0.865 |
| 1.3677 | 4.6944 | 10600 | 0.9050 | 16.3699 | 7.0386 | 15.3328 | 12.1751 | 0.8701 | 0.8632 | 0.8663 |
| 1.3622 | 4.7830 | 10800 | 0.8906 | 16.6265 | 7.0838 | 15.5958 | 12.6793 | 0.8695 | 0.8635 | 0.8662 |
| 1.3472 | 4.8716 | 11000 | 0.8717 | 16.9069 | 7.3785 | 15.85 | 12.295 | 0.871 | 0.8639 | 0.8671 |
| 1.3063 | 4.9601 | 11200 | 0.8585 | 16.6315 | 7.1464 | 15.6959 | 12.7629 | 0.8699 | 0.8639 | 0.8665 |
| 1.2917 | 5.0487 | 11400 | 0.8323 | 17.1631 | 7.8583 | 16.216 | 12.0589 | 0.8709 | 0.8638 | 0.867 |
| 1.2651 | 5.1373 | 11600 | 0.8188 | 17.381 | 7.7789 | 16.3956 | 12.4816 | 0.8707 | 0.865 | 0.8675 |
| 1.2571 | 5.2259 | 11800 | 0.7999 | 17.0964 | 7.8506 | 16.2973 | 12.3623 | 0.8713 | 0.8646 | 0.8676 |
| 1.2305 | 5.3144 | 12000 | 0.7869 | 17.3516 | 8.2409 | 16.4774 | 12.4159 | 0.871 | 0.8651 | 0.8677 |
| 1.2144 | 5.4030 | 12200 | 0.7676 | 17.5367 | 8.4245 | 16.6064 | 12.4038 | 0.8716 | 0.8649 | 0.8679 |
| 1.2118 | 5.4916 | 12400 | 0.7504 | 17.5323 | 8.4859 | 16.5036 | 13.0079 | 0.8705 | 0.8657 | 0.8677 |
| 1.1949 | 5.5802 | 12600 | 0.7399 | 18.1521 | 9.1443 | 17.1626 | 12.8906 | 0.8721 | 0.8663 | 0.8688 |
| 1.1669 | 5.6687 | 12800 | 0.7326 | 18.7114 | 9.4922 | 17.7586 | 12.7592 | 0.8729 | 0.867 | 0.8696 |
| 1.1607 | 5.7573 | 13000 | 0.7144 | 18.7837 | 9.2667 | 17.7867 | 12.2823 | 0.8736 | 0.8669 | 0.8699 |
| 1.1606 | 5.8459 | 13200 | 0.6989 | 18.9936 | 9.7086 | 17.9808 | 12.4175 | 0.8744 | 0.8677 | 0.8707 |
| 1.1472 | 5.9345 | 13400 | 0.6922 | 18.89 | 9.6679 | 17.9687 | 12.9332 | 0.8731 | 0.8675 | 0.8699 |
| 1.1122 | 6.0230 | 13600 | 0.6747 | 19.4554 | 10.1127 | 18.535 | 12.4953 | 0.8743 | 0.8678 | 0.8706 |
| 1.0979 | 6.1116 | 13800 | 0.6605 | 19.2042 | 10.0256 | 18.3507 | 12.6399 | 0.8743 | 0.8678 | 0.8707 |
| 1.0983 | 6.2002 | 14000 | 0.6490 | 19.2952 | 10.404 | 18.449 | 12.3465 | 0.8747 | 0.8675 | 0.8707 |
| 1.0625 | 6.2888 | 14200 | 0.6447 | 19.6198 | 10.2986 | 18.7111 | 12.7355 | 0.8744 | 0.8682 | 0.8709 |
| 1.0731 | 6.3773 | 14400 | 0.6295 | 19.8891 | 10.7334 | 18.9588 | 12.4164 | 0.8746 | 0.8678 | 0.8708 |
| 1.0483 | 6.4659 | 14600 | 0.6230 | 20.308 | 11.0122 | 19.4288 | 12.2792 | 0.8754 | 0.8687 | 0.8717 |
| 1.0425 | 6.5545 | 14800 | 0.6166 | 20.3186 | 11.1015 | 19.4082 | 12.2744 | 0.875 | 0.8687 | 0.8715 |
| 1.0569 | 6.6430 | 15000 | 0.6041 | 19.5917 | 10.714 | 18.7884 | 12.3764 | 0.8737 | 0.868 | 0.8705 |
| 1.0276 | 6.7316 | 15200 | 0.5973 | 20.1387 | 10.9804 | 19.22 | 12.7061 | 0.8751 | 0.869 | 0.8717 |
| 0.9947 | 6.8202 | 15400 | 0.5856 | 20.3623 | 11.3342 | 19.4165 | 12.654 | 0.8758 | 0.8695 | 0.8723 |
| 1.0182 | 6.9088 | 15600 | 0.5778 | 20.9678 | 12.1136 | 20.1893 | 12.1898 | 0.8771 | 0.8703 | 0.8733 |
| 0.9909 | 6.9973 | 15800 | 0.5667 | 21.2498 | 12.0591 | 20.3519 | 12.6572 | 0.8769 | 0.8709 | 0.8735 |
| 0.972 | 7.0859 | 16000 | 0.5600 | 21.4474 | 12.1037 | 20.5792 | 12.3849 | 0.8766 | 0.8704 | 0.8732 |
| 0.9656 | 7.1745 | 16200 | 0.5507 | 21.5475 | 12.4501 | 20.7857 | 12.4884 | 0.8782 | 0.8714 | 0.8744 |
| 0.9756 | 7.2631 | 16400 | 0.5423 | 21.3981 | 12.3185 | 20.5697 | 12.5079 | 0.8765 | 0.8708 | 0.8733 |
| 0.9319 | 7.3516 | 16600 | 0.5359 | 21.8811 | 12.951 | 21.1801 | 12.295 | 0.8779 | 0.8713 | 0.8742 |
| 0.9665 | 7.4402 | 16800 | 0.5378 | 21.968 | 13.2006 | 21.136 | 12.7865 | 0.878 | 0.8719 | 0.8746 |
| 0.9435 | 7.5288 | 17000 | 0.5282 | 22.3704 | 13.1908 | 21.562 | 12.5126 | 0.8771 | 0.872 | 0.8742 |
| 0.9301 | 7.6174 | 17200 | 0.5190 | 22.2889 | 13.2708 | 21.4298 | 13.0584 | 0.8774 | 0.8721 | 0.8744 |
| 0.9332 | 7.7059 | 17400 | 0.5168 | 22.254 | 13.3531 | 21.4963 | 12.7455 | 0.8774 | 0.8719 | 0.8743 |
| 0.925 | 7.7945 | 17600 | 0.5099 | 22.6425 | 13.4854 | 21.7613 | 12.7718 | 0.8788 | 0.8729 | 0.8755 |
| 0.9136 | 7.8831 | 17800 | 0.5068 | 22.5949 | 13.6532 | 21.7553 | 12.7713 | 0.8782 | 0.8726 | 0.875 |
| 0.8967 | 7.9717 | 18000 | 0.5005 | 22.5535 | 13.6107 | 21.7333 | 12.9748 | 0.8786 | 0.8729 | 0.8754 |
| 0.9006 | 8.0602 | 18200 | 0.4938 | 23.0001 | 13.8053 | 22.181 | 12.8743 | 0.8787 | 0.8733 | 0.8756 |
| 0.8994 | 8.1488 | 18400 | 0.4920 | 22.2808 | 13.4752 | 21.5215 | 12.6567 | 0.8777 | 0.8722 | 0.8746 |
| 0.8855 | 8.2374 | 18600 | 0.4858 | 22.7675 | 13.7282 | 21.9464 | 12.7198 | 0.8785 | 0.873 | 0.8754 |
| 0.8701 | 8.3260 | 18800 | 0.4785 | 22.887 | 14.0293 | 22.0317 | 12.4096 | 0.879 | 0.8734 | 0.8758 |
| 0.8866 | 8.4145 | 19000 | 0.4784 | 23.2105 | 14.1194 | 22.3646 | 12.7708 | 0.8796 | 0.874 | 0.8764 |
| 0.8543 | 8.5031 | 19200 | 0.4751 | 23.4786 | 14.352 | 22.6102 | 12.6824 | 0.8794 | 0.8739 | 0.8763 |
| 0.8594 | 8.5917 | 19400 | 0.4703 | 23.0914 | 14.2145 | 22.3521 | 12.8223 | 0.8786 | 0.8737 | 0.8758 |
| 0.8648 | 8.6802 | 19600 | 0.4699 | 23.4638 | 14.3472 | 22.6763 | 12.9611 | 0.8786 | 0.8744 | 0.8762 |
| 0.8659 | 8.7688 | 19800 | 0.4657 | 23.2376 | 14.3604 | 22.5048 | 12.5131 | 0.8796 | 0.8738 | 0.8763 |
| 0.8647 | 8.8574 | 20000 | 0.4654 | 23.2513 | 14.3413 | 22.5309 | 12.8675 | 0.879 | 0.8742 | 0.8763 |
| 0.8635 | 8.9460 | 20200 | 0.4598 | 23.3858 | 14.4641 | 22.7185 | 12.7702 | 0.8795 | 0.8741 | 0.8764 |
| 0.8526 | 9.0345 | 20400 | 0.4581 | 23.7295 | 15.0048 | 23.0682 | 12.8586 | 0.88 | 0.8747 | 0.877 |
| 0.8501 | 9.1231 | 20600 | 0.4565 | 23.3145 | 14.5665 | 22.6584 | 12.5941 | 0.88 | 0.8737 | 0.8765 |
| 0.8375 | 9.2117 | 20800 | 0.4534 | 23.835 | 14.9155 | 23.0955 | 12.929 | 0.8796 | 0.8747 | 0.8768 |
| 0.8489 | 9.3003 | 21000 | 0.4534 | 23.6909 | 14.798 | 23.0078 | 12.8191 | 0.88 | 0.8745 | 0.8769 |
| 0.8213 | 9.3888 | 21200 | 0.4509 | 23.9297 | 15.077 | 23.2475 | 12.7608 | 0.8808 | 0.8749 | 0.8775 |
| 0.8313 | 9.4774 | 21400 | 0.4493 | 24.0578 | 15.1554 | 23.3283 | 12.8644 | 0.8806 | 0.8752 | 0.8775 |
| 0.8348 | 9.5660 | 21600 | 0.4493 | 24.1275 | 15.3226 | 23.4299 | 12.8607 | 0.8806 | 0.8751 | 0.8775 |
| 0.8294 | 9.6546 | 21800 | 0.4484 | 24.019 | 15.2318 | 23.3294 | 12.7666 | 0.8809 | 0.8752 | 0.8777 |
| 0.8217 | 9.7431 | 22000 | 0.4472 | 24.3012 | 15.4436 | 23.6118 | 12.8312 | 0.881 | 0.8755 | 0.8779 |
| 0.8274 | 9.8317 | 22200 | 0.4469 | 24.1533 | 15.3296 | 23.4845 | 12.7413 | 0.8809 | 0.8752 | 0.8777 |
| 0.829 | 9.9203 | 22400 | 0.4467 | 24.1929 | 15.3985 | 23.5324 | 12.7466 | 0.8809 | 0.8753 | 0.8778 |
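The final row has the lowest validation loss, but not the highest score on every metric. The sketch below copies the last five rows of the table and picks the best step per metric; the tuple layout and the helper name `best_step` are illustrative. Whether the published weights correspond to the final step or to another checkpoint is not stated on the card; the headline results match the final row.

```python
# (step, validation_loss, rouge1, rouge2, rougeL, bertscore_f1)
# Values copied from the last five rows of the training results table.
ROWS = [
    (21600, 0.4493, 24.1275, 15.3226, 23.4299, 0.8775),
    (21800, 0.4484, 24.0190, 15.2318, 23.3294, 0.8777),
    (22000, 0.4472, 24.3012, 15.4436, 23.6118, 0.8779),
    (22200, 0.4469, 24.1533, 15.3296, 23.4845, 0.8777),
    (22400, 0.4467, 24.1929, 15.3985, 23.5324, 0.8778),
]

COLUMNS = {"loss": 1, "rouge1": 2, "rouge2": 3, "rougeL": 4, "bertscore_f1": 5}


def best_step(metric: str) -> int:
    """Return the step with the best value of `metric` among ROWS."""
    index = COLUMNS[metric]
    # Lower is better for the loss; higher is better for every other metric.
    pick = min if metric == "loss" else max
    return pick(ROWS, key=lambda row: row[index])[0]


if __name__ == "__main__":
    for name in COLUMNS:
        print(name, best_step(name))
```

Among these rows, step 22000 scores highest on the ROUGE and BERTScore F1 columns, while step 22400 has the lowest validation loss. The differences are small, so the choice of checkpoint is unlikely to matter much in practice.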
Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1
Model tree for greatakela/flan-t5-base-gen-chat_base-10
Base model
google/flan-t5-base