Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction Paper • 2510.20411 • Published 15 days ago • 2
BLiSS 1.0: Evaluating Bilingual Learner Competence in Second Language Small Language Models Paper • 2510.19419 • Published 16 days ago • 1
Looking to Learn: Token-wise Dynamic Gating for Low-Resource Vision-Language Modelling Paper • 2510.08470 • Published 28 days ago • 1
BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data Paper • 2510.10159 • Published 27 days ago • 3
BabyBabelLM Collection A multilingual collection of datasets modeling the language a person observes from birth until they acquire a native language. • 45 items • Updated 8 days ago • 7
Meta-Pretraining for Zero-Shot Cross-Lingual Named Entity Recognition in Low-Resource Philippine Languages Paper • 2509.02160 • Published Sep 2 • 1
Pico: A Modular Framework for Hypothesis-Driven Small Language Model Research Paper • 2509.16413 • Published Sep 19 • 1
Article There is no such thing as a tokenizer-free lunch By catherinearnett • Sep 25 • 86