---
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- de
library_name: transformers
license: other
pipeline_tag: feature-extraction
tags:
- fill-mask
- masked-lm
- long-context
- modernbert
---

# ModernGBERT 1B

This is a German ModernBERT 1B language model trained from scratch using the ModernBERT [codebase](https://github.com/AnswerDotAI/ModernBERT) and the same German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2) as our [LLäMmlein](https://huggingface.co/collections/LSX-UniWue/llammlein-6732ff41f3705c686e605762) family. Find more details in our [preprint](https://arxiv.org/abs/2505.13136)!

### Usage

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("LSX-UniWue/ModernGBERT_1B")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/ModernGBERT_1B")
```

### Performance

We evaluated our model on the [SuperGLEBer](https://lsx-uniwue.github.io/SuperGLEBer-site/) benchmark.
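Since the pipeline tag is feature-extraction, the loaded model returns token-level hidden states rather than a single sentence embedding. One common way to obtain fixed-size embeddings is to mean-pool the last hidden state over non-padding tokens using the attention mask. A minimal sketch of such a pooling helper, shown on dummy tensors so it runs without downloading the model (the helper itself is illustrative and not part of the released checkpoint):

```python
import torch

def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Expand the mask to the hidden dimension so padded positions contribute zero.
    mask = attention_mask.unsqueeze(-1).type_as(last_hidden_state)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Clamp avoids division by zero for fully-masked rows.
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

# Dummy tensors standing in for model output (batch=2, seq_len=4, hidden=8).
hidden = torch.randn(2, 4, 8)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
emb = mean_pool(hidden, mask)
print(emb.shape)  # torch.Size([2, 8])
```

With the real model, `last_hidden_state` would come from `model(**tokenizer(texts, return_tensors="pt", padding=True)).last_hidden_state`, pooled with the batch's attention mask in the same way.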