---
language:
- jv
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- sft
datasets:
- afrizalha/Gatra-2-Javanese
---
<h1 align="center">Open models for indigenous Indonesian languages</h1>
<center>
<img src="https://imgur.com/PutckEK.png" alt="Bakpia" width="500" height="250">
<p><em>Bakpia is a family of open language models capable of responding in Javanese. Version one of Bakpia is the first generative Javanese LLM to gain functional instruction-following performance using solely synthetic data.</em></p>
<p><em style="color: black; font-weight: bold;">Beta preview</em></p>
</center>
Bakpia V1 is a family of Javanese language models. It is fine-tuned from open base models on a large synthetic dataset of Krama (formal register) Javanese, in which the prompts are generated by GPT-4o and the responses by Claude 3 Haiku.
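The generation pipeline itself is not published; purely as an illustration, here is a minimal sketch of such a two-model loop, assuming the official `openai` and `anthropic` Python SDKs (the seed instruction and all helper structure are hypothetical):

```python
from openai import OpenAI
from anthropic import Anthropic

openai_client = OpenAI()        # reads OPENAI_API_KEY from the environment
anthropic_client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical seed instruction asking GPT-4o for a Krama Javanese question
seed = "Write one question in Krama Javanese about daily life."
prompt = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": seed}],
).choices[0].message.content

# Claude 3 Haiku answers the generated prompt, yielding one training pair
response = anthropic_client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
).content[0].text

pair = {"input": prompt, "output": response}  # one synthetic example
```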
This repository contains the fp16 version of Bakpia V1 9B.
| Version | Base Model | URL | Training |
|---------|------------|-----|----------|
| V1 0.5B | Qwen 2 0.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-0.5B-Javanese/) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 1.5B | Qwen 2 1.5B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-1.5B-Javanese) | Epoch = 1, Batch = 16\*8, lr = 5e-5, linear schedule|
| V1 9B | Gemma 2 9B Instruct | [fp16](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-fp16)/[4bit](https://huggingface.co/afrizalha/Bakpia-V1-9B-Javanese-4bit/) | Batch = 16\*8, lr = 4e-5, linear schedule|
Training data is accessible [here](https://huggingface.co/datasets/afrizalha/Gatra-2-Javanese).
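To inspect the data, it can be pulled with the `datasets` library (split and column names below are assumptions; check the dataset card):

```python
from datasets import load_dataset

# Download the synthetic Krama Javanese training pairs from the Hub
ds = load_dataset("afrizalha/Gatra-2-Javanese")
print(ds)              # split names and sizes
print(ds["train"][0])  # one example; column names may differ
```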
## Version 1.0
This is the first version of Bakpia.
✨ Training
- 36K input-output pairs
- LoRA r/alpha = 64/128
- Rank-stabilized LoRA (see the configuration sketch below)

✨ Features
- Single-turn QA across various domains.
- Ngoko Javanese is not currently supported.
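The hyperparameters above map onto a PEFT `LoraConfig` roughly as follows. This is a sketch, not the exact training script, and the target modules are an assumption (the usual attention and MLP projections):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,              # LoRA rank
    lora_alpha=128,    # LoRA alpha
    use_rslora=True,   # rank-stabilized LoRA: scales by alpha / sqrt(r)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```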
## Generate with template
```python
# Update transformers for Gemma 2 compatibility
!pip install -q git+https://github.com/huggingface/transformers.git

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-9B-Javanese-fp16")
# Load in fp16 to match the checkpoint and halve memory use
model = AutoModelForCausalLM.from_pretrained("afrizalha/Bakpia-V1-9B-Javanese-fp16", torch_dtype=torch.float16)
model.to("cuda")

# Gemma 2 chat template: one user turn followed by the model turn
template = """<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
"""

# Prompt: "How can I learn Javanese well?"
prompt = template.format(prompt="Kados pundi kulo saged nyinaoni Basa Jawa kanthi sae?")
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=1024, streamer=TextStreamer(tokenizer),
                         temperature=0.5, use_cache=True, do_sample=True)
```
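If the fp16 weights do not fit in memory, the 4-bit checkpoint linked in the table above can be loaded the same way with a quantization config (a sketch, assuming `bitsandbytes` is installed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantized load; device_map places layers automatically
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "afrizalha/Bakpia-V1-9B-Javanese-4bit",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("afrizalha/Bakpia-V1-9B-Javanese-4bit")
```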
## Acknowledgments
- **Developed by:** Afrizal Hasbi Azizy
- **License:** Apache-2.0