datasets:
  - phonemetransformers/IPA-BabyLM
language:
  - en
base_model:
  - openai-community/gpt2

A GPT-2 model trained on the BabyLM 2024 training set, transcribed into IPA, using a BPE tokenizer. Word boundaries were removed from the training data.
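To illustrate what removing word boundaries means for a phoneme stream, here is a minimal sketch. It assumes phonemes are space-separated and that word ends are marked with a token named `WORD_BOUNDARY`. Both the marker name and the helper `remove_word_boundaries` are hypothetical and are not taken from this model's actual preprocessing code.

```python
# Hypothetical boundary marker; the dataset's actual marker may differ.
WORD_BOUNDARY = "WORD_BOUNDARY"


def remove_word_boundaries(phonemes: str, boundary: str = WORD_BOUNDARY) -> str:
    """Drop boundary markers from a space-separated phoneme string."""
    return " ".join(tok for tok in phonemes.split() if tok != boundary)


# Illustrative input: "the cat" in IPA, with a marker after each word.
example = "ð ə WORD_BOUNDARY k æ t WORD_BOUNDARY"
continuous = remove_word_boundaries(example)
print(continuous)  # ð ə k æ t
```

With the markers gone, a BPE tokenizer trained on such text has no signal for where words start or end, so any merges it learns may span what were originally word boundaries.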

Model trained for the paper *From Babble to Words: Pre-Training Language Models on Continuous Streams of Phonemes*.