A multi-label text classification model based on TURKCELL/roberta-base-turkish-uncased, fine-tuned to detect 14 unsafe content categories in Turkish texts.
The model is designed to serve as a guardrail safety filter for chatbots and other LLM-powered systems.
## 🧠 Model Overview
| Property | Value |
|---|---|
| Base model | TURKCELL/roberta-base-turkish-uncased |
| Task | Multi-label classification (safety moderation) |
| Language | Turkish |
| Labels (unsafe topics) | siyaset (politics), toplumsal cinsiyet (gender), şiddet (violence), din (religion), suç (crime), cinsellik (sexuality), göç (migration), kimlik (identity), uluslararası ilişkiler (international relations), toplumsal eleştiri (social criticism), bahis (betting), ruhsal (psychological/spiritual), zararlı madde (harmful substances), kişisel haklar (personal rights) |
| Output | One or more triggered unsafe topics, or `SAFE` |
| Thresholds | Class-specific, tuned on the validation set via F2 optimization (see the sketch below) |
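The thresholds shipped with the repo (`thresholds.json`) are chosen per class to maximize F2, which weights recall higher than precision. A minimal sketch of how such per-class thresholds can be tuned, assuming validation probabilities and binary labels are available as NumPy arrays (the function and variable names here are illustrative, not part of the repo):

```python
import numpy as np
from sklearn.metrics import fbeta_score

def tune_thresholds(val_probs: np.ndarray, val_labels: np.ndarray, beta: float = 2.0) -> np.ndarray:
    """Pick one threshold per class by maximizing F-beta on validation data.

    val_probs:  (n_samples, n_classes) sigmoid probabilities
    val_labels: (n_samples, n_classes) binary ground truth
    """
    n_classes = val_probs.shape[1]
    thresholds = np.full(n_classes, 0.5)
    for c in range(n_classes):
        best_score, best_t = -1.0, 0.5
        for t in np.linspace(0.05, 0.95, 19):
            preds = (val_probs[:, c] >= t).astype(int)
            score = fbeta_score(val_labels[:, c], preds, beta=beta, zero_division=0)
            if score > best_score:
                best_score, best_t = score, t
        thresholds[c] = best_t
    return thresholds
```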
## ⚙️ Usage
### Option 1 — Repo's custom pipeline (`trust_remote_code`)

You can load the model with a single `pipeline` call using the custom pipeline code shipped in the repository:

```python
from transformers import pipeline

REPO_ID = "yeniguno/roberta-turkish-bantopic-uncased"

# load the repo's custom text-classification pipeline (requires trust_remote_code)
clf = pipeline(
    task="text-classification",
    model=REPO_ID,
    tokenizer=REPO_ID,
    trust_remote_code=True
)

print(clf("Bu akşam arkadaşlarımla film izleyeceğim."))
print(clf("Ekrem İmamoğlu'nun mevcut iktidarı yenecek olması bazılarını korkutuyor."))
```
### Option 2 — Pure Transformers (no remote code), apply the thresholds yourself

```python
import json

import numpy as np
from huggingface_hub import hf_hub_download
from transformers import pipeline

REPO_ID = "yeniguno/roberta-turkish-bantopic-uncased"

# return all label probabilities (sigmoid) for multi-label use
clf = pipeline(
    task="text-classification",
    model=REPO_ID,
    tokenizer=REPO_ID,
    top_k=None,
    function_to_apply="sigmoid"
)

# load per-label thresholds + label mapping from the repo
th_path = hf_hub_download(repo_id=REPO_ID, filename="thresholds.json")
lb_path = hf_hub_download(repo_id=REPO_ID, filename="labels.json")

with open(th_path) as f:
    thresholds = np.array(json.load(f), dtype=float)
with open(lb_path) as f:
    id2label = {int(k): v for k, v in json.load(f).items()}
label2id = {v: k for k, v in id2label.items()}

def guard_predict(text: str, return_scores: bool = False):
    # the pipeline returns one {'label': name or 'LABEL_i', 'score': float} dict per label
    out = clf(text)
    scores = np.zeros(len(id2label), dtype=float)
    for d in out:
        lab = d["label"]
        idx = int(lab.split("_")[1]) if lab.startswith("LABEL_") else label2id[lab]
        scores[idx] = float(d["score"])

    # a label "fires" when its probability reaches its class-specific threshold
    fired = [(id2label[i], float(scores[i])) for i in range(len(scores)) if scores[i] >= thresholds[i]]

    if not fired:
        result = {"status": "SAFE"}
    else:
        fired.sort(key=lambda x: x[1], reverse=True)
        result = {"status": "UNSAFE", "triggered": fired}

    if return_scores:
        result["scores"] = {id2label[i]: float(scores[i]) for i in range(len(scores))}
    return result

print(guard_predict("Bu akşam arkadaşlarımla film izleyeceğim."))
print(guard_predict("Ekrem İmamoğlu'nun mevcut iktidarı yenecek olması bazılarını korkutuyor."))
```
## 🧩 Intended Use
This model acts as a pre-filter or guardrail before sending user inputs to an LLM. It helps detect and block or flag text that contains or relates to sensitive categories such as violence, crime, drugs, sexual content, or discrimination.
It is not a hate-speech classifier or a legal moderation system. It simply detects topic-level presence of unsafe domains.
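For example, a chatbot backend can call the classifier before the LLM and refuse or reroute flagged requests. Below is a minimal sketch of such a gate that reuses `guard_predict` from Option 2 above; `answer_with_llm` is a hypothetical placeholder for your own generation call:

```python
def answer_with_llm(prompt: str) -> str:
    # hypothetical placeholder: replace with your actual LLM call
    raise NotImplementedError

def guarded_chat(user_message: str) -> str:
    verdict = guard_predict(user_message)  # from the Option 2 example above
    if verdict["status"] == "UNSAFE":
        topics = ", ".join(label for label, _ in verdict["triggered"])
        return f"Sorry, I can't help with that topic. (flagged: {topics})"
    return answer_with_llm(user_message)
```

Whether to hard-block, soften, or merely log a flagged request is a product decision; the model only supplies the topic-level signal.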
## 📊 Training Details
- Training data size: ~300k Turkish text samples
- Positive (unsafe) examples: ~95k
- Negative (safe) examples: ~205k
- Loss function: `BCEWithLogitsLoss` with positive class weighting (see the sketch after this list)
- Optimizer: AdamW (lr=2e-5)
- Epochs: 3
- Batch size: 16 (train), 32 (eval)
- Hardware: NVIDIA RTX 5090 (32GB)
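The positive class weighting mentioned above corresponds to the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`. A minimal sketch of how such weights could be derived from per-class positive counts (the helper and the example numbers are illustrative, not taken from the actual training code):

```python
import torch
import torch.nn as nn

def build_weighted_loss(pos_counts: torch.Tensor, n_samples: int) -> nn.BCEWithLogitsLoss:
    """Up-weight rare classes: pos_weight = (#negatives / #positives) per class."""
    pos_weight = (n_samples - pos_counts) / pos_counts.clamp(min=1)
    return nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# illustrative counts for 14 classes over ~300k samples
pos_counts = torch.randint(1_000, 20_000, (14,)).float()
loss_fn = build_weighted_loss(pos_counts, n_samples=300_000)

logits = torch.randn(16, 14)                      # model outputs for a batch of 16
targets = torch.randint(0, 2, (16, 14)).float()   # multi-hot labels
loss = loss_fn(logits, targets)
```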
## 🧪 Evaluation Results
| Metric | Validation | Test |
|---|---|---|
| Micro Precision | 0.35 | 0.34 |
| Micro Recall | 0.83 | 0.83 |
| Micro F1 | 0.49 | 0.48 |
| Macro Precision | 0.29 | 0.24 |
| Macro Recall | 0.63 | 0.60 |
| Macro F1 | 0.38 | 0.34 |
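Micro scores pool every label decision across all classes, while macro scores average the per-class values, so rare classes pull the macro numbers down. A minimal sketch of computing both with scikit-learn from thresholded predictions (the toy arrays are illustrative):

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# illustrative binary matrices: 3 samples x 4 classes (1 = label present / predicted)
y_true = np.array([[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 0, 0]])
y_pred = np.array([[1, 0, 1, 1], [0, 1, 0, 0], [0, 1, 0, 0]])

micro = precision_recall_fscore_support(y_true, y_pred, average="micro", zero_division=0)
macro = precision_recall_fscore_support(y_true, y_pred, average="macro", zero_division=0)
print("micro (P, R, F1):", micro[:3])
print("macro (P, R, F1):", macro[:3])
```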