Add comprehensive model card
README.md CHANGED
```diff
@@ -1,3 +1,4 @@
+
 ---
 language:
 - km
@@ -10,9 +11,9 @@ tags:
 - pytorch
 - transformers
 widget:
-- text: ខ្ញុំចង់<mask>ភាសាខ្មែរ
-- text: ប្រទេសកម្ពុជាមាន<mask>ខេត្ត
-- text: រាជធានីភ្នំពេញគឺជ<mask>របស់ប្រទេសកម្ពុជា
+- text: "ខ្ញុំចង់<mask>ភាសាខ្មែរ"
+- text: "ប្រទេសកម្ពុជាមាន<mask>ខេត្ត"
+- text: "រាជធានីភ្នំពេញគឺជ<mask>របស់ប្រទេសកម្ពុជា"
 metrics:
 - perplexity
 base_model: xlm-roberta-base
@@ -31,7 +32,7 @@ This is a Pretrain Language Model using XLM-RoBERTa Architecture for Khmer & Eng
 - **Training Data**: Khmer & English dataset with 31M examples with total 6Billion characters
 - **Parameters**: 163M trainable parameters
 - **Training Steps**: 1,122,978
-- **Final Checkpoint**: Step
+- **Final Checkpoint**: Step 2064000
 
 ## Training Details
 
@@ -45,10 +46,10 @@ This is a Pretrain Language Model using XLM-RoBERTa Architecture for Khmer & Eng
 - **Training time**: I trained this model for 10 Days
 
 ## Training Metrics
-- **Final Training Loss**: 2.
-- **Final Learning Rate**: 1.
-- **Final Gradient Norm**: 5.
-- **Training Epoch**:
+- **Final Training Loss**: 2.3435
+- **Final Learning Rate**: 1.72e-05
+- **Final Gradient Norm**: 5.9683
+- **Training Epoch**: 14.23
 
 
 ## Usage
```
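The card lists perplexity as its evaluation metric alongside a final training loss of 2.3435. For a masked language model, perplexity is simply the exponential of the cross-entropy loss, so the two figures are directly related. A minimal sketch of that conversion, assuming the reported loss is an average cross-entropy in nats (the standard convention for `transformers` training losses):

```python
import math

# Final training loss from the model card (assumed to be the
# average masked-LM cross-entropy in nats).
final_loss = 2.3435

# Perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(final_loss)

print(f"perplexity ≈ {perplexity:.2f}")  # roughly 10.4
```

If the loss were instead logged in bits, the base would be 2 rather than e; the card does not say, so nats is the assumption here.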