jziebura committed
Commit 7df5de1 · verified · 1 Parent(s): d78d4ef

Update README.md

Files changed (1): README.md (+10, -7)
README.md CHANGED
@@ -17,27 +17,30 @@ should probably proofread and complete it, then remove this comment. -->

# finetuning

- This model is a fine-tuned version of [allegro/herbert-base-cased](https://huggingface.co/allegro/herbert-base-cased) on an unknown dataset.
+ This model is a fine-tuned version of [allegro/herbert-base-cased](https://huggingface.co/allegro/herbert-base-cased) on the [jziebura/polish_youth_slang_classification](https://huggingface.co/datasets/jziebura/polish_youth_slang_classification) dataset.
+
It achieves the following results on the evaluation set:
- Loss: 0.7289
- Accuracy: 0.7127
- - F1: 0.7110
+ - F1 weighted: 0.7110
- F1 Macro: 0.6977

## Model description

- More information needed
-
- ## Intended uses & limitations
+ The model is part of the experiments conducted for my master's thesis, *"A language model analyzing Polish youth slang"*.

- More information needed
+ It was fine-tuned to classify the sentiment of Polish youth slang into three categories: negative, neutral or ambiguous, and positive.

## Training and evaluation data

- More information needed
+ All data comes from the [jziebura/polish_youth_slang_classification](https://huggingface.co/datasets/jziebura/polish_youth_slang_classification) dataset.

## Training procedure

+ The hyperparameters were selected from those recommended in the [BERT introduction paper](https://arxiv.org/abs/1810.04805) and then optimized using the Optuna backend.
+
+ Both the HPO and the fine-tuning were run on Google Colab free-tier T4 GPU instances.
+
### Training hyperparameters

  The following hyperparameters were used during training:
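
The updated card states that the model classifies the sentiment of Polish youth slang into three categories (negative, neutral or ambiguous, positive). Below is a minimal inference sketch; the Hub repository id `jziebura/finetuning` and the label strings it returns are assumptions, so check the model's `config.json` (`id2label`) for the actual mapping.

```python
# Minimal inference sketch (not part of the commit). The repo id below is an
# assumption based on the card's title and author; adjust it to the real model id.
from transformers import pipeline

classifier = pipeline("text-classification", model="jziebura/finetuning")

# Hypothetical Polish youth-slang input; the returned label string depends on
# the id2label mapping stored in the model's config.
print(classifier("Ten koncert to był istny sztos!"))
```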
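The training and evaluation data section points to the [jziebura/polish_youth_slang_classification](https://huggingface.co/datasets/jziebura/polish_youth_slang_classification) dataset. A quick way to inspect it with the standard `datasets` API is sketched below; the split and column names are whatever the dataset defines, since the card does not spell them out.

```python
# Sketch for loading the dataset referenced in the card; split and column names
# are not documented here, so print the DatasetDict to see the real schema.
from datasets import load_dataset

ds = load_dataset("jziebura/polish_youth_slang_classification")
print(ds)                    # available splits and their features
first_split = next(iter(ds))
print(ds[first_split][0])    # one raw example from the first split
```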
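The training procedure notes that the starting hyperparameters follow the BERT paper's recommendations and were then tuned with the Optuna backend. The sketch below shows how such a search is commonly wired up with `Trainer.hyperparameter_search(backend="optuna")`; the search space, the `text`/`label` column names, and the `train`/`validation` split names are illustrative assumptions, not the configuration actually used for this model.

```python
# Illustrative Optuna-backed HPO sketch; not the thesis's actual setup.
# Assumed: the dataset has "text" and "label" columns and train/validation splits.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base_model = "allegro/herbert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(base_model)

raw = load_dataset("jziebura/polish_youth_slang_classification")
tokenized = raw.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

def model_init():
    # Fresh weights for every trial; 3 labels: negative, neutral/ambiguous, positive.
    return AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

def hp_space(trial):
    # Search space seeded from the BERT-paper recommendations (assumed ranges).
    return {
        "learning_rate": trial.suggest_categorical("learning_rate", [2e-5, 3e-5, 5e-5]),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [16, 32]),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 2, 4),
    }

trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="hpo-out", eval_strategy="epoch"),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,   # enables dynamic padding via DataCollatorWithPadding
)

best_run = trainer.hyperparameter_search(
    direction="minimize",   # default objective is the evaluation loss
    backend="optuna",
    hp_space=hp_space,
    n_trials=10,
)
print(best_run)
```

Passing `model_init` rather than `model` is what lets each Optuna trial start from fresh `allegro/herbert-base-cased` weights instead of continuing from the previous trial.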