# ModernGBERT Redewiedergabe Tagger
This model is a token classifier that recognizes German speech, thought, and writing representation (STWR); it is used in LLpro. Besides the medium (speech, thought, or writing), the model also predicts the type (direct, free indirect, indirect, or reported), providing 36 classification outputs (3 media × 4 types × B/I/O tags); see the indexing sketch below.
| STWR type | Example | Translation |
|---|---|---|
| direct | Dann schrieb er: "Ich habe Hunger." | Then he wrote: "I'm hungry." |
| free indirect ('erlebte Rede') | Er war ratlos. Woher sollte er denn hier bloß ein Mittagessen bekommen? | He was at a loss. Where should he ever find lunch here? |
| indirect | Sie fragte, wo das Essen sei. | She asked where the food was. |
| reported | Sie dachte über das Mittagessen. | She thought about lunch. |
This model is a fine-tuned version of LSX-UniWue/ModernGBERT_1B on the REDEWIEDERGABE corpus (see the annotation guidelines).
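The 36 outputs are best read as 12 `type.medium` labels, each with its own B/I/O decision. A minimal sketch of this indexing, assuming the flat output index for label `i` and tag `j` is `3*i + j` (consistent with the reshape in the demo code below):

```python
# Sketch (assumption): enumerate the 36 flat output indices as
# (type.medium, BIO tag) pairs, matching the reshape in the demo below.
types = ["direct", "indirect", "freeIndirect", "reported"]
media = ["speech", "thought", "writing"]
bio_tags = ["O", "B", "I"]  # tag order assumed from the demo's 'OBI' lookup

labels = [f"{t}.{m}" for t in types for m in media]  # 12 label combinations
for i, label in enumerate(labels):
    for j, tag in enumerate(bio_tags):
        print(3 * i + j, label, tag)  # flat output index -> (label, tag)
```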
## Performance
We report simplified F1 scores on a binarized variant (O vs. B/I) for each type.medium combination; a minimal sketch of this binarization follows the table.
| Type.Medium | F1 Score | Support |
|---|---|---|
| direct.speech | 0.96 | 13598 |
| direct.thought | 0.79 | 715 |
| direct.writing | 0.19 | 996 |
| indirect.speech | 0.77 | 1226 |
| indirect.thought | 0.71 | 802 |
| indirect.writing | 0.00 | 11 |
| freeIndirect.speech | 0.73 | 198 |
| freeIndirect.thought | 0.45 | 251 |
| freeIndirect.writing | – | – |
| reported.speech | 0.69 | 1684 |
| reported.thought | 0.59 | 799 |
| reported.writing | 0.56 | 135 |
| micro avg | 0.86 | 20415 |
| macro avg | 0.54 | 20415 |
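A minimal sketch of the binarization for a single type.medium label, assuming aligned per-token gold and predicted BIO tag sequences and using scikit-learn (an assumption; the original evaluation script is not part of this card):

```python
# Sketch (assumption): binarized F1 (O vs. B/I) for one type.medium label.
# `gold` and `pred` are aligned per-token BIO tag sequences.
from sklearn.metrics import f1_score

def binarized_f1(gold, pred):
    # Collapse B and I into one positive class; O is the negative class.
    gold_bin = [0 if t == 'O' else 1 for t in gold]
    pred_bin = [0 if t == 'O' else 1 for t in pred]
    return f1_score(gold_bin, pred_bin)

print(binarized_f1(['O', 'B', 'I', 'O'], ['O', 'B', 'O', 'O']))  # ≈ 0.67
```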
## Demo Usage
```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# The 12 type.medium label combinations; the model has a B/I/O output for each.
speech_labels = [
    "direct.speech",
    "direct.thought",
    "direct.writing",
    "indirect.speech",
    "indirect.thought",
    "indirect.writing",
    "freeIndirect.speech",
    "freeIndirect.thought",
    "freeIndirect.writing",
    "reported.speech",
    "reported.thought",
    "reported.writing",
]

text = """
Dann schrieb er: "Ich habe Hunger."
Er war ratlos. Woher sollte er denn hier bloß ein Mittagessen bekommen?
Sie fragte, wo das Essen sei.
Sie dachte über das Mittagessen."""

model_id = 'aehrm/moderngbert-redewiedergabe'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=3*len(speech_labels))

inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    out = model(**inputs)

# Reshape the 36 logits per token into 12 labels x 3 BIO tags and take the
# argmax over the BIO dimension, independently for each label.
batch_size, seq_len, _ = out.logits.shape
prediction = out.logits.reshape(batch_size, seq_len, 12, 3).argmax(-1)

for i, speech_label in enumerate(speech_labels):
    for tok, pred in zip(tokenizer.convert_ids_to_tokens(inputs['input_ids'][0]), prediction[0, :, i]):
        tag = 'OBI'[pred]
        print(tok, tag, speech_label if tag != 'O' else '')
```
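The per-token tags can then be grouped into contiguous spans per label. Continuing from the demo above, the `bio_to_spans` helper below is a hypothetical illustration and not part of the model or LLpro:

```python
# Hypothetical helper: collapse one BIO tag sequence into (start, end) token spans.
def bio_to_spans(tags):
    spans, start = [], None
    for idx, tag in enumerate(tags):
        if tag == 'B':                           # a new span begins here
            if start is not None:
                spans.append((start, idx))
            start = idx
        elif tag == 'O' and start is not None:   # the current span ends
            spans.append((start, idx))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans

# Reuses `tokenizer`, `inputs`, `prediction`, and `speech_labels` from the demo.
tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])
for i, speech_label in enumerate(speech_labels):
    tags = ['OBI'[p] for p in prediction[0, :, i]]
    for start, end in bio_to_spans(tags):
        print(speech_label, tokenizer.convert_tokens_to_string(tokens[start:end]))
```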
## Training results

The F1 score in the table below is the micro average over all 36 classification outputs.
| Training Loss | Epoch | Step | Validation Loss | F1 Score |
|---|---|---|---|---|
| 0.1951 | 1.0 | 193 | 0.1332 | 0.1180 |
| 0.0885 | 2.0 | 386 | 0.2474 | 0.2724 |
| 0.0417 | 3.0 | 579 | 0.1455 | 0.4604 |
| 0.0753 | 4.0 | 772 | 0.1399 | 0.5522 |
| 0.0277 | 5.0 | 965 | 0.1447 | 0.6170 |
| 0.0238 | 6.0 | 1158 | 0.1770 | 0.6200 |
| 0.0153 | 7.0 | 1351 | 0.2257 | 0.6930 |
| 0.009 | 8.0 | 1544 | 0.6031 | 0.7336 |
| 0.0108 | 9.0 | 1737 | 0.4965 | 0.7428 |
| 0.0066 | 10.0 | 1930 | 0.4575 | 0.7492 |
| 0.0058 | 11.0 | 2123 | 0.7781 | 0.7983 |
| 0.006 | 12.0 | 2316 | 0.8648 | 0.8062 |
| 0.0043 | 13.0 | 2509 | 1.0377 | 0.8148 |
| 0.0033 | 14.0 | 2702 | 1.3040 | 0.8217 |
| 0.0025 | 15.0 | 2895 | 1.2637 | 0.8359 |
| 0.003 | 16.0 | 3088 | 1.3230 | 0.8477 |
| 0.0019 | 17.0 | 3281 | 1.9811 | 0.8439 |
| 0.0014 | 18.0 | 3474 | 2.1191 | 0.8482 |
| 0.0011 | 19.0 | 3667 | 2.3599 | 0.8510 |
| 0.0009 | 20.0 | 3860 | 2.4453 | 0.8528 |
## Framework versions
- PEFT 0.17.0
- Transformers 4.55.2
- Pytorch 2.8.0+cu128
- Datasets 2.21.0
- Tokenizers 0.21.4