galennolan committed on
Commit 9ed19b1 · verified · 1 Parent(s): 4f56c0b

Create README.md

Files changed (1)
  1. README.md +83 -0
README.md ADDED
@@ -0,0 +1,83 @@
---
license: mit
tags:
- indobert
- emotion-classification
- text-classification
- indonesian
- torch
language:
- id
datasets:
- PRDECT-ID
model-index:
- name: IndoBERT Emotion Classification (5-Class)
  results:
  - task:
      type: text-classification
      name: Emotion Classification
    dataset:
      name: PRDECT-ID
      type: text
      description: >
        A dataset of Indonesian product reviews labeled with five emotion categories:
        love, happiness, anger, fear, and sadness.
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.7167
    - name: F1 Score
      type: f1
      value: 0.7125
    - name: Precision
      type: precision
      value: 0.7179
    - name: Recall
      type: recall
      value: 0.7167
---

# IndoBERT Emotion Classification (5-Class)

This model is a fine-tuned version of [`indobenchmark/indobert-base-p1`](https://huggingface.co/indobenchmark/indobert-base-p1) for emotion classification in Indonesian, with five emotion labels: `love`, `happiness`, `anger`, `fear`, and `sadness`.

## 🧠 Dataset

The model was trained on the **PRDECT-ID dataset**, a collection of Indonesian-language product reviews from the e-commerce platform Tokopedia, annotated with emotion labels by clinical psychology experts.

- 29 product categories
- Emotion annotation by a professional team
- Each entry has exactly one emotion label

## 🛠 Fine-tuning Details

- **Base model**: `indobenchmark/indobert-base-p1`
- **Training epochs**: 5 of a maximum of 10 (early stopping with `load_best_model_at_end=True`)
- **Batch size**: 8
- **Learning rate**: 2e-5
- **Weight decay**: 0.05
- **Validation strategy**: per epoch
- **Evaluation metric**: `eval_accuracy` (with `greater_is_better=True`)
- **Cross-validation**: Stratified K-Fold (n_splits=5)

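
Stopping after 5 of 10 epochs while keeping an earlier checkpoint is what patience-based early stopping produces. The following is a minimal sketch of that selection logic, not the author's training code. It assumes a patience of 2 epochs and uses made-up per-epoch accuracies; neither appears in the source, and `select_best_epoch` is a hypothetical helper rather than a `transformers` function.

```python
def select_best_epoch(accuracies, patience=2):
    """Return (best_epoch, last_epoch_run), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to improve
    on the best accuracy seen so far (greater is better).
    """
    best_acc = float("-inf")
    best_epoch = 0
    epochs_without_gain = 0
    last_epoch = 0
    for epoch, acc in enumerate(accuracies, start=1):
        last_epoch = epoch
        if acc > best_acc:
            # New best checkpoint: remember it and reset the patience counter.
            best_acc, best_epoch = acc, epoch
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break
    return best_epoch, last_epoch


# Hypothetical validation accuracies for a 10-epoch budget (illustration only).
hypothetical_acc = [0.65, 0.69, 0.72, 0.71, 0.70, 0.70, 0.69, 0.69, 0.68, 0.68]
best_epoch, last_epoch = select_best_epoch(hypothetical_acc, patience=2)
print(f"best checkpoint: epoch {best_epoch}, training stopped after epoch {last_epoch}")
```

With the `transformers` `Trainer`, comparable behaviour is usually obtained by combining `load_best_model_at_end=True` with `EarlyStoppingCallback(early_stopping_patience=...)`; whether this model was trained that way is not stated here.
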
### Eval Results (Best Model @ Epoch 3)

| Metric    | Value  |
|-----------|--------|
| Accuracy  | 0.7167 |
| F1 Score  | 0.7125 |
| Precision | 0.7179 |
| Recall    | 0.7167 |
| Eval Loss | 0.7614 |

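
Recall in the table equals accuracy (0.7167), which is what support-weighted averaging over classes yields. The source does not state the averaging mode, so treat that reading as an assumption. The sketch below computes accuracy and weighted precision, recall, and F1 in plain Python on made-up labels; `weighted_scores` is a hypothetical helper, and the toy data is not taken from PRDECT-ID.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        # Each class contributes in proportion to its share of true labels.
        weight = actual / total
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    return accuracy, precision, recall, f1


# Made-up toy labels (illustration only).
y_true = ["love", "love", "anger", "fear", "sadness", "happiness"]
y_pred = ["love", "happiness", "anger", "fear", "anger", "happiness"]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Because each class recall is weighted by its support, the weighted recall reduces to correct predictions divided by total examples, which is the accuracy.
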
## 🚀 How to Use

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_id = "galennolan/indobert-b-p1-indoemotion-5class"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

emotion_classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Indonesian input, roughly: "This product makes me really happy!"
result = emotion_classifier("Produk ini bikin aku senang banget!")
print(result)  # a list with one dict holding the predicted label and its score
```