The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models Paper • 2510.13996 • Published 12 days ago
Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian Paper • 2509.05668 • Published Sep 6
Data Centric Domain Adaptation for Historical Text with OCR Errors Paper • 2107.00927 • Published Jul 2, 2021
Towards Robust Named Entity Recognition for Historic German Paper • 1906.07592 • Published Jun 18, 2019
Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0 Paper • 2204.05211 • Published Apr 11, 2022
FLERT: Document-Level Features for Named Entity Recognition Paper • 2011.06993 • Published Nov 13, 2020
hmBERT: Historical Multilingual Language Models for Named Entity Recognition Paper • 2205.15575 • Published May 31, 2022