# ModernGBERT Redewiedergabe Tagger
This model is a token classifier that recognizes German speech, thought, and writing representation (STWR); it is used in LLpro. Besides the medium (speech, thought, or writing), the model also predicts the type (direct, free indirect, indirect, or reported), providing 36 classification outputs (3 media × 4 types × B/I/O tags); see the indexing sketch below.
| STWR type | Example | Translation |
|---|---|---|
| direct | Dann schrieb er: "Ich habe Hunger." | Then he wrote: "I'm hungry." |
| free indirect ('erlebte Rede') | Er war ratlos. Woher sollte er denn hier bloß ein Mittagessen bekommen? | He was at a loss. Where should he ever find lunch here? |
| indirect | Sie fragte, wo das Essen sei. | She asked where the food was. |
| reported | Sie dachte über das Mittagessen. | She thought about lunch. |
This model is a fine-tuned version of LSX-UniWue/ModernGBERT_1B on the REDEWIEDERGABE corpus (see the annotation guidelines).
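The 36 outputs are best read as 12 `type.medium` labels, each with its own B/I/O decision. A minimal sketch of this indexing, assuming the flat output index for label `i` and tag `j` is `3*i + j` (consistent with the reshape in the demo code below):

```python
# Sketch (assumption): enumerate the 36 flat output indices as
# (type.medium, BIO tag) pairs, matching the reshape in the demo below.
types = ["direct", "indirect", "freeIndirect", "reported"]
media = ["speech", "thought", "writing"]
bio_tags = ["O", "B", "I"]  # tag order assumed from the demo's 'OBI' lookup

labels = [f"{t}.{m}" for t in types for m in media]  # 12 label combinations
for i, label in enumerate(labels):
    for j, tag in enumerate(bio_tags):
        print(3 * i + j, label, tag)  # flat output index -> (label, tag)
```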
## Performance
We report simplified F1 scores on a binarized variant (O vs. B/I) for each type.medium combination; a minimal sketch of this binarization follows the table.
| Type.Medium | F1 Score | Support |
|---|---|---|
| direct.speech | 0.96 | 13598 |
| direct.thought | 0.79 | 715 |
| direct.writing | 0.19 | 996 |
| indirect.speech | 0.77 | 1226 |
| indirect.thought | 0.71 | 802 |
| indirect.writing | 0.00 | 11 |
| freeIndirect.speech | 0.73 | 198 |
| freeIndirect.thought | 0.45 | 251 |
| freeIndirect.writing | – | – |
| reported.speech | 0.69 | 1684 |
| reported.thought | 0.59 | 799 |
| reported.writing | 0.56 | 135 |
| micro avg | 0.86 | 20415 |
| macro avg | 0.54 | 20415 |
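A minimal sketch of the binarization for a single type.medium label, assuming aligned per-token gold and predicted BIO tag sequences and using scikit-learn (an assumption; the original evaluation script is not part of this card):

```python
# Sketch (assumption): binarized F1 (O vs. B/I) for one type.medium label.
# `gold` and `pred` are aligned per-token BIO tag sequences.
from sklearn.metrics import f1_score

def binarized_f1(gold, pred):
    # Collapse B and I into one positive class; O is the negative class.
    gold_bin = [0 if t == 'O' else 1 for t in gold]
    pred_bin = [0 if t == 'O' else 1 for t in pred]
    return f1_score(gold_bin, pred_bin)

print(binarized_f1(['O', 'B', 'I', 'O'], ['O', 'B', 'O', 'O']))  # ≈ 0.67
```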
## Demo Usage
```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# The 12 type.medium label combinations; the model has a B/I/O output for each.
speech_labels = [
    "direct.speech",
    "direct.thought",
    "direct.writing",
    "indirect.speech",
    "indirect.thought",
    "indirect.writing",
    "freeIndirect.speech",
    "freeIndirect.thought",
    "freeIndirect.writing",
    "reported.speech",
    "reported.thought",
    "reported.writing",
]

text = """
Dann schrieb er: "Ich habe Hunger."
Er war ratlos. Woher sollte er denn hier bloß ein Mittagessen bekommen?
Sie fragte, wo das Essen sei.
Sie dachte über das Mittagessen."""

model_id = 'aehrm/moderngbert-redewiedergabe'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=3*len(speech_labels))

inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    out = model(**inputs)

# Reshape the 36 logits per token into 12 labels x 3 BIO tags and take the
# argmax over the BIO dimension, independently for each label.
batch_size, seq_len, _ = out.logits.shape
prediction = out.logits.reshape(batch_size, seq_len, 12, 3).argmax(-1)

for i, speech_label in enumerate(speech_labels):
    for tok, pred in zip(tokenizer.convert_ids_to_tokens(inputs['input_ids'][0]), prediction[0, :, i]):
        tag = 'OBI'[pred]
        print(tok, tag, speech_label if tag != 'O' else '')
```
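The per-token tags can then be grouped into contiguous spans per label. Continuing from the demo above, the `bio_to_spans` helper below is a hypothetical illustration and not part of the model or LLpro:

```python
# Hypothetical helper: collapse one BIO tag sequence into (start, end) token spans.
def bio_to_spans(tags):
    spans, start = [], None
    for idx, tag in enumerate(tags):
        if tag == 'B':                           # a new span begins here
            if start is not None:
                spans.append((start, idx))
            start = idx
        elif tag == 'O' and start is not None:   # the current span ends
            spans.append((start, idx))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans

# Reuses `tokenizer`, `inputs`, `prediction`, and `speech_labels` from the demo.
tokens = tokenizer.convert_ids_to_tokens(inputs['input_ids'][0])
for i, speech_label in enumerate(speech_labels):
    tags = ['OBI'[p] for p in prediction[0, :, i]]
    for start, end in bio_to_spans(tags):
        print(speech_label, tokenizer.convert_tokens_to_string(tokens[start:end]))
```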
## Training results

The F1 score in the table below is the micro average over all 36 classification outputs.
| Training Loss | Epoch | Step | Validation Loss | F1 Score |
|---|---|---|---|---|
| 0.1951 | 1.0 | 193 | 0.1332 | 0.1180 |
| 0.0885 | 2.0 | 386 | 0.2474 | 0.2724 |
| 0.0417 | 3.0 | 579 | 0.1455 | 0.4604 |
| 0.0753 | 4.0 | 772 | 0.1399 | 0.5522 |
| 0.0277 | 5.0 | 965 | 0.1447 | 0.6170 |
| 0.0238 | 6.0 | 1158 | 0.1770 | 0.6200 |
| 0.0153 | 7.0 | 1351 | 0.2257 | 0.6930 |
| 0.009 | 8.0 | 1544 | 0.6031 | 0.7336 |
| 0.0108 | 9.0 | 1737 | 0.4965 | 0.7428 |
| 0.0066 | 10.0 | 1930 | 0.4575 | 0.7492 |
| 0.0058 | 11.0 | 2123 | 0.7781 | 0.7983 |
| 0.006 | 12.0 | 2316 | 0.8648 | 0.8062 |
| 0.0043 | 13.0 | 2509 | 1.0377 | 0.8148 |
| 0.0033 | 14.0 | 2702 | 1.3040 | 0.8217 |
| 0.0025 | 15.0 | 2895 | 1.2637 | 0.8359 |
| 0.003 | 16.0 | 3088 | 1.3230 | 0.8477 |
| 0.0019 | 17.0 | 3281 | 1.9811 | 0.8439 |
| 0.0014 | 18.0 | 3474 | 2.1191 | 0.8482 |
| 0.0011 | 19.0 | 3667 | 2.3599 | 0.8510 |
| 0.0009 | 20.0 | 3860 | 2.4453 | 0.8528 |
## Framework versions
- PEFT 0.17.0
- Transformers 4.55.2
- Pytorch 2.8.0+cu128
- Datasets 2.21.0
- Tokenizers 0.21.4