A multi-label text classification model based on TURKCELL/roberta-base-turkish-uncased, fine-tuned to detect 14 unsafe content categories in Turkish texts.
The model is designed to serve as a guardrail safety filter for chatbots and other LLM-powered systems.

🧠 Model Overview

| Property | Value |
| --- | --- |
| Base model | TURKCELL/roberta-base-turkish-uncased |
| Task | Multi-label classification (safety moderation) |
| Language | Turkish |
| Labels (unsafe topics) | siyaset, toplumsal cinsiyet, şiddet, din, suç, cinsellik, göç, kimlik, uluslararası ilişkiler, toplumsal eleştiri, bahis, ruhsal, zararlı madde, kişisel haklar |
| Output | One or multiple unsafe topics triggered, or SAFE |
| Thresholds | Class-specific, tuned on validation set using F2 optimization |
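
The card states only that the class-specific thresholds were tuned on the validation set for F2; the exact procedure is not given. As a minimal sketch of one common approach, assuming a plain grid search per label, the snippet below picks the threshold that maximizes F-beta with beta = 2, which weights recall more heavily than precision. The function names and the grid are illustrative and are not taken from the repository.

```python
def fbeta(tp, fp, fn, beta=2.0):
    """F-beta from confusion counts; beta > 1 favors recall over precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def tune_threshold(probs, targets, beta=2.0, grid=None):
    """Grid-search one label's threshold; returns (best_threshold, best_score).

    probs   -- sigmoid probabilities for this label on the validation set
    targets -- 0/1 ground truth for this label
    grid    -- candidate thresholds (assumption: 0.05 .. 0.95 in steps of 0.05)
    """
    grid = grid or [i / 100 for i in range(5, 96, 5)]
    best_t, best_f = 0.5, -1.0
    for t in grid:
        tp = sum(1 for p, y in zip(probs, targets) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, targets) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, targets) if p < t and y == 1)
        score = fbeta(tp, fp, fn, beta)
        if score > best_f:  # ties keep the lowest threshold seen first
            best_t, best_f = t, score
    return best_t, best_f


# Toy validation data for a single label
val_probs = [0.9, 0.6, 0.4, 0.2, 0.1]
val_targets = [1, 1, 1, 0, 0]
print(tune_threshold(val_probs, val_targets))
```

Running `tune_threshold` once per label on validation probabilities would give one threshold per category. Because F2 rewards recall, thresholds tuned this way tend to sit low, which fits the high-recall, low-precision pattern in the evaluation results.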

⚙️ Usage

Option 1: load the model with a single pipeline call (this runs the repository's custom code, hence trust_remote_code=True)

from transformers import pipeline

REPO_ID = "yeniguno/roberta-turkish-bantopic-uncased"

clf = pipeline(
    task="text-classification",
    model=REPO_ID,
    tokenizer=REPO_ID,
    trust_remote_code=True
)

print(clf("Bu akşam arkadaşlarımla film izleyeceğim."))
print(clf("Ekrem İmamoğlu'nun mevcut iktidarı yenecek olması bazılarını korkutuyor."))

Option 2 — Pure Transformers (no remote code), apply thresholds yourself

import json

import numpy as np
from transformers import pipeline
from huggingface_hub import hf_hub_download

REPO_ID = "yeniguno/roberta-turkish-bantopic-uncased"

# return all label probabilities (sigmoid) for multi-label use
clf = pipeline(
    task="text-classification",
    model=REPO_ID,
    tokenizer=REPO_ID,
    top_k=None,
    function_to_apply="sigmoid"
)

# load per-label thresholds + label mapping from the repo
th_path = hf_hub_download(repo_id=REPO_ID, filename="thresholds.json")
lb_path = hf_hub_download(repo_id=REPO_ID, filename="labels.json")
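# read as UTF-8 explicitly: the label names contain Turkish characters
with open(th_path, encoding="utf-8") as f:
    thresholds = np.array(json.load(f), dtype=float)
with open(lb_path, encoding="utf-8") as f:
    id2label = {int(k): v for k, v in json.load(f).items()}
label2id = {v: k for k, v in id2label.items()}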

def guard_predict(text: str, return_scores: bool = False):
    out = clf(text)  # dicts {'label': name or 'LABEL_i', 'score': float}, one per label
    # Depending on the transformers version, a single input may come back
    # wrapped in an extra list; unwrap it so we iterate over the label dicts.
    if out and isinstance(out[0], list):
        out = out[0]

    scores = np.zeros(len(id2label), dtype=float)
    for d in out:
        lab = d["label"]
        idx = int(lab.split("_")[1]) if lab.startswith("LABEL_") else label2id[lab]
        scores[idx] = float(d["score"])

    # a label fires when its probability reaches its own threshold
    fired = [(id2label[i], float(scores[i])) for i in range(len(scores)) if scores[i] >= thresholds[i]]
    fired.sort(key=lambda x: x[1], reverse=True)

    result = {"status": "UNSAFE", "triggered": fired} if fired else {"status": "SAFE"}
    if return_scores:
        result["scores"] = {id2label[i]: float(scores[i]) for i in range(len(scores))}
    return result

print(guard_predict("Bu akşam arkadaşlarımla film izleyeceğim."))
print(guard_predict("Ekrem İmamoğlu'nun mevcut iktidarı yenecek olması bazılarını korkutuyor."))

🧩 Intended Use

This model acts as a pre-filter or guardrail before sending user inputs to an LLM. It helps detect and block or flag text that contains or relates to sensitive categories such as violence, crime, drugs, sexual content, or discrimination.

It is not a hate-speech classifier or a legal moderation system. It simply detects topic-level presence of unsafe domains.
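
To make the pre-filter idea concrete, here is a minimal sketch of a gate that sits in front of an LLM call. It assumes a classifier callable that returns the same dictionary shape as `guard_predict` above (`status`, plus `triggered` as a list of `(label, score)` pairs). The names `guarded_reply`, `generate`, `blocked_labels`, and the refusal text are hypothetical and only illustrate the control flow.

```python
REFUSAL = "Sorry, I can't help with this topic."  # hypothetical refusal message


def guarded_reply(user_text, classify, generate, blocked_labels=None):
    """Run the safety classifier first; call the LLM only if the input passes.

    classify       -- callable returning {'status': ..., 'triggered': [(label, score), ...]}
    generate       -- callable wrapping the downstream LLM (assumption)
    blocked_labels -- labels that cause a hard block; None blocks on any label,
                      other triggered labels are only flagged
    """
    verdict = classify(user_text)
    fired = []
    if verdict.get("status") == "UNSAFE":
        fired = [name for name, _score in verdict.get("triggered", [])]

    should_block = bool(fired) and (
        blocked_labels is None or any(name in blocked_labels for name in fired)
    )
    if should_block:
        return {"blocked": True, "labels": fired, "reply": REFUSAL}

    # Safe, or unsafe only on labels the caller chose to flag rather than block
    return {"blocked": False, "labels": fired, "reply": generate(user_text)}


# Stub classifier and LLM so the sketch runs without downloading the model
def fake_classify(text):
    if "bahis" in text:
        return {"status": "UNSAFE", "triggered": [("bahis", 0.91)]}
    return {"status": "SAFE"}


def fake_llm(text):
    return "echo: " + text


print(guarded_reply("merhaba", fake_classify, fake_llm))
print(guarded_reply("bahis siteleri", fake_classify, fake_llm))
```

Separating "block" from "flag" lets a deployment treat some topics as hard stops while merely logging others, which matches the "block or flag" wording above.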

📊 Training Details

  • Training data size: ~300k Turkish text samples
  • Positive (unsafe) examples: ~95k
  • Negative (safe) examples: ~205k
  • Loss function: BCEWithLogitsLoss with positive class weighting
  • Optimizer: AdamW (lr=2e-5)
  • Epochs: 3
  • Batch size: 16 (train), 32 (eval)
  • Hardware: NVIDIA RTX 5090 (32GB)
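
The training code is not part of this card, so the following is only a sketch of what "BCEWithLogitsLoss with positive class weighting" computes. It re-implements, in plain Python, the per-element formula documented for PyTorch's `BCEWithLogitsLoss` with `pos_weight`, and derives each label's weight as the negative-to-positive ratio. That ratio is an assumption; the card does not say how the weights were chosen.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def pos_weight_from_targets(targets):
    """Per-label weight = (#negatives / #positives); 1.0 if a label has no positives.

    Assumption: this ratio is a common heuristic, not necessarily the one used here.
    """
    n = len(targets)
    weights = []
    for c in range(len(targets[0])):
        pos = sum(row[c] for row in targets)
        weights.append((n - pos) / pos if pos else 1.0)
    return weights


def weighted_bce_with_logits(logits, targets, pos_weight):
    """Mean over all (sample, label) entries of
    -[w * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]."""
    total, count = 0.0, 0
    for row_x, row_y in zip(logits, targets):
        for x, y, w in zip(row_x, row_y, pos_weight):
            # -log(sigmoid(x)) == softplus(-x); -log(1 - sigmoid(x)) == softplus(x)
            total += w * y * softplus(-x) + (1 - y) * softplus(x)
            count += 1
    return total / count


# Toy batch: 4 samples, 2 labels, each label positive once
targets = [[1, 0], [0, 0], [0, 1], [0, 0]]
logits = [[0.0, 0.0] for _ in targets]
weights = pos_weight_from_targets(targets)
print(weights, weighted_bce_with_logits(logits, targets, weights))
```

Up-weighting positives makes a missed unsafe label cost more than a false alarm, which pushes the model toward recall on rare categories.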

🧪 Evaluation Results

| Metric | Validation | Test |
| --- | --- | --- |
| Micro Precision | 0.35 | 0.34 |
| Micro Recall | 0.83 | 0.83 |
| Micro F1 | 0.49 | 0.48 |
| Macro Precision | 0.29 | 0.24 |
| Macro Recall | 0.63 | 0.60 |
| Macro F1 | 0.38 | 0.34 |
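
For readers unfamiliar with the two averaging modes, the sketch below shows how micro and macro scores differ for multi-label output. Micro pools true/false positives across all labels before computing the metric, while macro computes the metric per label and then averages. It assumes macro F1 is the mean of per-label F1 (the scikit-learn convention); the card does not state which convention was used.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from confusion counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_macro(y_true, y_pred):
    """y_true / y_pred: lists of 0/1 rows, one column per label."""
    n_labels = len(y_true[0])
    counts = []
    for c in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 0 and p[c] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[c] == 1 and p[c] == 0)
        counts.append((tp, fp, fn))

    # micro: sum the counts over labels, then compute the metrics once
    micro = prf(*(sum(col) for col in zip(*counts)))
    # macro: compute the metrics per label, then take the unweighted mean
    per_label = [prf(*c) for c in counts]
    macro = tuple(sum(vals) / n_labels for vals in zip(*per_label))
    return {"micro": micro, "macro": macro}


# Toy example: label 0 is frequent and easy, label 1 is rare and always missed
y_true = [[1, 0], [1, 0], [1, 0], [0, 1]]
y_pred = [[1, 0], [1, 0], [1, 1], [0, 0]]
print(micro_macro(y_true, y_pred))
```

In the toy data the frequent label dominates the micro scores, while the rare, poorly predicted label drags the macro scores down. A gap of that kind between micro and macro rows is one possible reading of the table above.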