# NepaliGPT: Nepali Language Generative Pretrained Transformer Model
This is an experiment in developing a language generation model for Nepali: a causal language model that predicts the next possible tokens given a context in the Nepali language.
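As a minimal sketch of how such a causal language model is typically used for generation with the Hugging Face `transformers` library; the repository id below is a placeholder, not the model's confirmed Hub location:

```python
# Minimal inference sketch with the transformers library.
# "user/NepaliGPT" is a placeholder repository id; substitute the actual
# model id once the checkpoint is published on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "user/NepaliGPT"  # placeholder, assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a Nepali prompt and generate a continuation.
prompt = "नेपाल एक सुन्दर देश हो"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,   # sample from the distribution rather than greedy decode
    top_k=50,
    temperature=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```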
## Dataset Used
A corpus of 9.3 GB was collected from different sources on the internet. The sources include:

- Nepali books found online.
- Nepali news articles from Nepali news portals.
- Nepali text collected from various open-source Nepali NLP datasets.
 
## Hyperparameters Used
- Learning rate: 2e-5
- Weight decay: 0.01
- Number of training epochs: 5
- bf16: True
- Base model architecture: GPT-2
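These settings map directly onto Hugging Face `TrainingArguments`. Below is a minimal sketch of how such a run could be configured; only the hyperparameters listed above come from this card, while the checkpoint name, output path, and dataset wiring are placeholder assumptions.

```python
# Training-configuration sketch using the Hugging Face Trainer API.
# Hyperparameter values are taken from the list above; everything else
# (base checkpoint, output_dir, datasets) is a placeholder assumption.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model = AutoModelForCausalLM.from_pretrained("gpt2")  # GPT-2 base architecture
tokenizer = AutoTokenizer.from_pretrained("gpt2")

args = TrainingArguments(
    output_dir="nepaligpt-checkpoints",  # placeholder path
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=5,
    bf16=True,  # bfloat16 mixed-precision training
)

# train_dataset / eval_dataset would be the tokenized 9.3 GB Nepali corpus:
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_dataset, eval_dataset=eval_dataset)
# trainer.train()
```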
## Training Results
It achieves the following results on the evaluation set:
| Training Loss | Validation Loss | Perplexity | 
|---|---|---|
| 3.3968 | 3.2705 | 26.3245 | 
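Perplexity here is the exponential of the validation cross-entropy loss, so the reported value can be reproduced directly from the loss:

```python
import math

# Perplexity is exp(cross-entropy loss); values from the table above.
validation_loss = 3.2705
perplexity = math.exp(validation_loss)
print(f"{perplexity:.4f}")  # ≈ 26.3245, matching the reported value
```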