Text classification model for argument mining and detection
gbert-base-argument_mining is a German text classification model for the scientific domain, fine-tuned from gbert-base. It was trained on a synthetically created, annotated dataset of the sentence types that occur in the conclusions of scientific theses and papers.
Training
The model was fine-tuned for 10 epochs. This repository contains the checkpoint from the fourth epoch, which achieved the best accuracy:
| epoch | accuracy | loss |
|---|---|---|
| 1.0 | 0.9315 | 0.3872 |
| 2.0 | 0.9178 | 0.2987 |
| 3.0 | 0.9589 | 0.1519 |
| 4.0 | 0.9658 | 0.1162 |
| 5.0 | 0.9521 | 0.2100 |
| 6.0 | 0.9521 | 0.1979 |
| 7.0 | 0.9521 | 0.2453 |
| 8.0 | 0.9521 | 0.2251 |
| 9.0 | 0.9452 | 0.2225 |
| 10.0 | 0.9521 | 0.2286 |
On this dataset, the model learns to distinguish effectively between the sentence classes listed under Text Classification Tags. However, accuracy drops and loss rises after epoch 4, which points to overfitting and may indicate that the dataset is imbalanced or noisy. Future work should train the model on more robust data to obtain more reliable evaluation results.
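The checkpoint choice can be checked directly against the table above. The following is a minimal sketch that copies the reported values into a list; the variable names are illustrative and not part of the original training code.

```python
# (epoch, accuracy, loss) rows copied from the training table above.
results = [
    (1, 0.9315, 0.3872),
    (2, 0.9178, 0.2987),
    (3, 0.9589, 0.1519),
    (4, 0.9658, 0.1162),
    (5, 0.9521, 0.2100),
    (6, 0.9521, 0.1979),
    (7, 0.9521, 0.2453),
    (8, 0.9521, 0.2251),
    (9, 0.9452, 0.2225),
    (10, 0.9521, 0.2286),
]

# Select the epoch with the highest accuracy and the one with the lowest loss.
best_by_accuracy = max(results, key=lambda row: row[1])
best_by_loss = min(results, key=lambda row: row[2])

print("best accuracy:", best_by_accuracy)
print("lowest loss:  ", best_by_loss)
```

Both criteria select epoch 4, and every later epoch reports a higher loss than epoch 4.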
Text Classification Tags
| Text Classification Tag | Text Classification Label |
|---|---|
| 0 | CLAIM |
| 1 | COUNTERCLAIM |
| 2 | LINK |
| 3 | CONC |
| 4 | FUT |
| 5 | OTH |
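The tag table above can be used to decode predictions. The sketch below assumes the model loads with the Hugging Face `transformers` `pipeline("text-classification", ...)` helper; the function names `normalize_label` and `classify` are hypothetical helpers introduced for illustration. Whether the pipeline returns the label names directly or generic `LABEL_<n>` strings depends on the model configuration, so the helper handles both cases.

```python
# Tag-to-label mapping copied from the table above.
ID2LABEL = {
    0: "CLAIM",
    1: "COUNTERCLAIM",
    2: "LINK",
    3: "CONC",
    4: "FUT",
    5: "OTH",
}


def normalize_label(raw: str) -> str:
    """Map a generic 'LABEL_<n>' string to its name; pass named labels through."""
    if raw.startswith("LABEL_"):
        return ID2LABEL[int(raw.split("_", 1)[1])]
    return raw


def classify(sentences, model_id="samirmsallem/gbert-base-argument_mining"):
    """Classify German sentences (hypothetical helper, downloads the model)."""
    # Imported lazily so the mapping above can be used without transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    predictions = classifier(list(sentences))
    return [
        {"label": normalize_label(p["label"]), "score": p["score"]}
        for p in predictions
    ]
```

Calling `classify(["..."])` with a list of German sentences downloads the model on first use and returns one dictionary per sentence, containing a label from the table and a confidence score.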
Model tree for samirmsallem/gbert-base-argument_mining
Base model: deepset/gbert-base
Evaluation results
- Accuracy on samirmsallem/argument_mining_de (self-reported): 0.966