ModernGBERT Redewiedergabe Tagger

This model is a token classifier that recognizes German speech, thought, and writing representation (STWR); it is used in the LLpro pipeline. In addition to the medium (speech, thought, writing), the model predicts the type (direct, free indirect, indirect, reported), yielding 36 classification outputs (3 media × 4 types × {B, I, O}).

| STWR type | Example | Translation |
|---|---|---|
| direct | Dann schrieb er: "Ich habe Hunger." | Then he wrote: "I'm hungry." |
| free indirect ('erlebte Rede') | Er war ratlos. Woher sollte er denn hier bloß ein Mittagessen bekommen? | He was at a loss. Where should he ever find lunch here? |
| indirect | Sie fragte, wo das Essen sei. | She asked where the food was. |
| reported | Sie dachte über das Mittagessen. | She thought about lunch. |
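
The 36 outputs thus factor into 12 type.medium labels, each tagged with B, I, or O. A minimal sketch of this factorization (the label ordering mirrors the speech_labels list in the demo below; the per-label B/I/O ordering is an assumption):

types = ["direct", "indirect", "freeIndirect", "reported"]
media = ["speech", "thought", "writing"]
labels = [f"{t}.{m}" for t in types for m in media]                  # 12 type.medium labels
outputs = [(lab, tag) for lab in labels for tag in ("B", "I", "O")]  # 36 classification outputs
print(len(labels), len(outputs))  # 12 36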

This model is a fine-tuned version of LSX-UniWue/ModernGBERT_1B on the REDEWIEDERGABE corpus (Annotation guidelines).

Training Script.

Performance

We report simplified F1 scores on a binarized variant (O vs. B/I) for each combination of type and medium; a sketch of this binarization follows the table.

| Type.Medium | F1 score | Support |
|---|---|---|
| direct.speech | 0.96 | 13598 |
| direct.thought | 0.79 | 715 |
| direct.writing | 0.19 | 996 |
| indirect.speech | 0.77 | 1226 |
| indirect.thought | 0.71 | 802 |
| indirect.writing | 0.00 | 11 |
| freeIndirect.speech | 0.73 | 198 |
| freeIndirect.thought | 0.45 | 251 |
| freeIndirect.writing | | |
| reported.speech | 0.69 | 1684 |
| reported.thought | 0.59 | 799 |
| reported.writing | 0.56 | 135 |
| micro avg | 0.86 | 20415 |
| macro avg | 0.54 | 20415 |
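
To make the binarization concrete, here is a minimal sketch for a single type.medium output, with made-up gold and predicted tags (scikit-learn is an assumption used for illustration, not a dependency of the model):

from sklearn.metrics import f1_score

# Hypothetical BIO tags for one type.medium output.
gold = ['O', 'B', 'I', 'I', 'O', 'B', 'I', 'O']
pred = ['O', 'B', 'I', 'O', 'O', 'B', 'I', 'O']

# Collapse B and I into one positive class: inside a span vs. outside.
binarize = lambda tags: [0 if tag == 'O' else 1 for tag in tags]
print(f1_score(binarize(gold), binarize(pred)))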

Demo Usage

import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

speech_labels = [
    "direct.speech",
    "direct.thought",
    "direct.writing",
    "indirect.speech",
    "indirect.thought",
    "indirect.writing",
    "freeIndirect.speech",
    "freeIndirect.thought",
    "freeIndirect.writing",
    "reported.speech",
    "reported.thought",
    "reported.writing",
]

text = """
Dann schrieb er: "Ich habe Hunger."
Er war ratlos. Woher sollte er denn hier bloß ein Mittagessen bekommen?
Sie fragte, wo das Essen sei.
Sie dachte über das Mittagessen."""

model_id = 'aehrm/moderngbert-redewiedergabe'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=3*len(speech_labels))

inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    out = model(**inputs)

# The 36 logits per token factor into 12 type.medium labels x 3 BIO tags;
# an independent argmax over the last axis picks O, B, or I for each label.
batch_size, seq_len, _ = out.logits.shape
prediction = out.logits.reshape(batch_size, seq_len, 12, 3).argmax(-1)

# Print each token's BIO tag, once per type.medium label.
for i, speech_label in enumerate(speech_labels):
    for tok, pred in zip(tokenizer.convert_ids_to_tokens(inputs['input_ids'][0]), prediction[0, :, i]):
        pred = 'OBI'[pred]
        print(tok, pred, speech_label if pred != 'O' else '')
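
To recover contiguous text spans instead of per-token tags, consecutive B/I tokens can be merged via the tokenizer's offset mapping. A minimal sketch, assuming a fast tokenizer and reusing tokenizer, model, text, and speech_labels from above; for simplicity, any non-O tag extends the current span:

enc = tokenizer(text, return_tensors='pt', return_offsets_mapping=True)
offsets = enc.pop('offset_mapping')[0].tolist()  # character offsets per token
with torch.no_grad():
    logits = model(**enc).logits
pred = logits.reshape(logits.shape[0], logits.shape[1], 12, 3).argmax(-1)[0]

for i, speech_label in enumerate(speech_labels):
    span = None
    for (start, end), p in zip(offsets, pred[:, i].tolist()):
        if p != 0 and start != end:  # token tagged B or I (special tokens have empty offsets)
            span = (span[0], end) if span else (start, end)
        elif span is not None:
            print(speech_label, repr(text[span[0]:span[1]]))
            span = None
    if span is not None:
        print(speech_label, repr(text[span[0]:span[1]]))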

Training results

The reported F1 score is the micro average over all 36 classification outputs.

| Training loss | Epoch | Step | Validation loss | F1 score |
|---|---|---|---|---|
| 0.1951 | 1.0 | 193 | 0.1332 | 0.1180 |
| 0.0885 | 2.0 | 386 | 0.2474 | 0.2724 |
| 0.0417 | 3.0 | 579 | 0.1455 | 0.4604 |
| 0.0753 | 4.0 | 772 | 0.1399 | 0.5522 |
| 0.0277 | 5.0 | 965 | 0.1447 | 0.6170 |
| 0.0238 | 6.0 | 1158 | 0.1770 | 0.6200 |
| 0.0153 | 7.0 | 1351 | 0.2257 | 0.6930 |
| 0.009 | 8.0 | 1544 | 0.6031 | 0.7336 |
| 0.0108 | 9.0 | 1737 | 0.4965 | 0.7428 |
| 0.0066 | 10.0 | 1930 | 0.4575 | 0.7492 |
| 0.0058 | 11.0 | 2123 | 0.7781 | 0.7983 |
| 0.006 | 12.0 | 2316 | 0.8648 | 0.8062 |
| 0.0043 | 13.0 | 2509 | 1.0377 | 0.8148 |
| 0.0033 | 14.0 | 2702 | 1.3040 | 0.8217 |
| 0.0025 | 15.0 | 2895 | 1.2637 | 0.8359 |
| 0.003 | 16.0 | 3088 | 1.3230 | 0.8477 |
| 0.0019 | 17.0 | 3281 | 1.9811 | 0.8439 |
| 0.0014 | 18.0 | 3474 | 2.1191 | 0.8482 |
| 0.0011 | 19.0 | 3667 | 2.3599 | 0.8510 |
| 0.0009 | 20.0 | 3860 | 2.4453 | 0.8528 |

Framework versions

  • PEFT 0.17.0
  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 2.21.0
  • Tokenizers 0.21.4