# korean-gpt-150m-ko

Korean foundation model (GPT-style, trained from scratch)

## Model Description

This model is a Korean generative language model trained from scratch.

- Language: Korean (한국어)
- Model Type: Autoregressive language model
- Architecture: Custom GPT (Transformer decoder)
- Training: Self-supervised causal language modeling (see the loss sketch after this list)
- Dataset: Korean text corpus

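To make the training objective above concrete, here is a minimal sketch of the next-token cross-entropy loss used in causal language modeling. This is an illustration of the general technique, not this repository's actual training code; the `causal_lm_loss` helper is hypothetical.

```python
import torch
import torch.nn.functional as F

def causal_lm_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Next-token cross-entropy: logits at position t predict the token at t+1.

    logits: (batch, seq_len, vocab_size); input_ids: (batch, seq_len).
    """
    shift_logits = logits[:, :-1, :].contiguous()  # last position has nothing to predict
    shift_labels = input_ids[:, 1:].contiguous()   # targets are the inputs shifted left by one
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
```
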

## ⚠️ Limitations

- Limited ability to handle long contexts (512-token context window; see the truncation sketch after this list)
- Biases in the training data may be reflected in outputs
- Factual accuracy is not fully guaranteed
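
Because of the 512-token context window, long inputs should be truncated at tokenization time. A minimal sketch, assuming the maximum context length is indeed 512 tokens:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("oz1115/korean-gpt-150m-ko")

long_text = "인공지능" * 1000  # hypothetical over-long input
# truncation=True caps the encoded sequence at max_length tokens.
inputs = tokenizer(long_text, truncation=True, max_length=512, return_tensors="pt")
```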

## 🚀 How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("oz1115/korean-gpt-150m-ko")
# AutoModelForCausalLM attaches the language-modeling head needed for generate();
# trust_remote_code=True is required because the architecture is custom.
model = AutoModelForCausalLM.from_pretrained(
    "oz1115/korean-gpt-150m-ko", trust_remote_code=True
)

prompt = "인공지능의 미래는"  # "The future of artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
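
Continuing from the snippet above, sampling usually produces more varied text than greedy decoding. The hyperparameter values here (`top_p`, `temperature`) are generic starting points, not settings published for this model:

```python
# Sampled generation; reuses `model`, `tokenizer`, and `inputs` from above.
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=50,  # generate up to 50 tokens beyond the prompt
    do_sample=True,     # sample instead of greedy argmax decoding
    top_p=0.9,          # nucleus sampling: keep the smallest token set with 90% mass
    temperature=0.8,    # soften the distribution slightly
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```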


## Citation

```bibtex
@misc{korean-gpt-150m-ko,
  author = {oz1115},
  title = {korean-gpt-150m-ko: Korean Foundation Model},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/oz1115/korean-gpt-150m-ko}
}
```

## Contact

HuggingFace: @oz1115

## License
MIT