# Hanuman

## Model Details

### Overview
- Name: Hanuman
- Language: Thai (th)
- Task: Text Generation (Causal LM)
- Framework: PyTorch + Hugging Face Transformers
- License: CC BY-NC 4.0 (Non-commercial use only)
### Training Datasets

See the Training Process section below for dataset sources and preprocessing.

### Architecture

- Custom tokenizer for Thai (handles whitespace, newline, and tab via special tokens such as `<NL>`, `<SPACE>`, `<TAB>`, etc.)
## Intended Use

### Primary Use Cases
- Thai text generation (blogs, articles, captions, chatbots)
- Creative and reasoning-oriented text assistance
- Thai NLP research
### Limitations
- This model is research-oriented and may require additional fine-tuning for production use.
- May generate incorrect or biased outputs. Human verification is recommended.
## Tokenizer & Context

- Custom fast tokenizer (no `trust_remote_code` needed)
- Ensures round-trip encode/decode correctness (see the check below)
- Unicode NFC normalization included
- Handles Thai–Latin spacing consistently
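As a quick way to verify the round-trip claim, the snippet below encodes and decodes a mixed Thai/Latin string. This is a sketch: the sample text and the `skip_special_tokens` setting are illustrative assumptions, not part of the original card.

```python
import unicodedata
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ZombitX64/Hanuman")

# Illustrative mixed Thai/Latin sample containing a space, a newline, and a tab.
text = unicodedata.normalize("NFC", "สวัสดี AI\n\tThailand")

ids = tokenizer.encode(text)
decoded = tokenizer.decode(ids, skip_special_tokens=True)

print(ids)
print(repr(decoded))
print(decoded == text)  # expected True if the round trip holds
```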
## Usage Examples

### Basic Text Generation
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "ZombitX64/Hanuman"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

def generate_thai_text(prompt, max_length=100):
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generate_thai_text("Artificial intelligence technology"))
```
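The model can also be run through the Transformers `pipeline` helper; a minimal sketch follows (the generation parameters here mirror the example above and are illustrative, not tuned values from the card):

```python
from transformers import pipeline

# Text-generation pipeline built on the same checkpoint.
generator = pipeline("text-generation", model="ZombitX64/Hanuman")

result = generator(
    "Artificial intelligence technology",
    max_length=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(result[0]["generated_text"])
```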
### Batch Processing
```python
prompts = ["Hello", "Thailand has an area of", "Education in the digital era"]
for p in prompts:
    print(generate_thai_text(p, max_length=80))
    print("-" * 50)
```
## Training Process

### Dataset Preparation
- Source: Thai Wikipedia and reasoning-style datasets
- Preprocessing: cleaning, Unicode normalization, tokenization
- Training mode: streaming (see the sketch below)
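A minimal sketch of what streaming preparation could look like with the `datasets` library. The dataset identifier, column name, and sequence length are placeholders for illustration, not confirmed details of the actual pipeline; `tokenizer` is the one loaded in the usage example.

```python
import unicodedata
from datasets import load_dataset

# Placeholder dataset ID: the card only states "Thai Wikipedia and reasoning-style datasets".
stream = load_dataset("wikimedia/wikipedia", "20231101.th", split="train", streaming=True)

def preprocess(example):
    # Clean and apply Unicode NFC normalization before tokenization.
    text = unicodedata.normalize("NFC", example["text"].strip())
    return tokenizer(text, truncation=True, max_length=512)

tokenized_stream = stream.map(preprocess)
```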
### Example Training Configuration
```python
training_args = {
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "num_train_epochs": 2,
    "learning_rate": 5e-5,
    "warmup_steps": 10,
    "logging_steps": 10,
    "eval_steps": 50,
    "save_steps": 50,
    "fp16": False,  # CPU training
    "dataloader_num_workers": 0,
}
```
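A sketch of how this dictionary could be wired into the Transformers Trainer API. The output directory, evaluation strategy, and dataset objects are placeholders added here for illustration; they are not part of the original card.

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="./hanuman-checkpoints",  # placeholder path
    eval_strategy="steps",               # so eval_steps above takes effect (evaluation_strategy in older versions)
    **training_args,
)

trainer = Trainer(
    model=model,                  # the causal LM loaded in the usage example
    args=args,
    train_dataset=train_dataset,  # tokenized training split (placeholder)
    eval_dataset=eval_dataset,    # tokenized eval split (placeholder)
)
trainer.train()
```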
## Evaluation

The model is currently in a research phase. Formal evaluation results (perplexity, Thai downstream benchmarks) will be added in the future.
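Until formal results are published, perplexity on a held-out Thai sample can be estimated in a few lines. This is a rough sketch reusing the `model` and `tokenizer` from the usage example; the sample text is illustrative.

```python
import math
import torch

def sample_perplexity(text: str) -> float:
    # Token-level perplexity of the loaded model on a single text sample.
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())

print(sample_perplexity("Artificial intelligence technology"))
```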
## Contributing
This project is part of ongoing Thai NLP research. Feedback, issues, and contributions are welcome!
## Citation
```bibtex
@misc{Hanuman2025,
  title        = {Hanuman: Thai Small Language Model},
  author       = {JonusNattapong and Koichi Yasuoka},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/ZombitX64/Hanuman}},
  note         = {Tokenizer advisor: Koichi Yasuoka}
}
```
**Disclaimer:** This model is intended for research and educational purposes only. Use in commercial applications requires prior permission under the CC BY-NC 4.0 license.