Sambodhan MultiTask XLM-RoBERTa

Fine-tuned XLM-RoBERTa for multi-task classification of citizen grievances in Nepali:

  • Department classification: 4 classes (e.g., Water, Electricity, Roads, Others)
  • Urgency classification: 3 classes (Low, Medium, High)

This model is designed for grievance routing and prioritization as part of the Sambodhan AI grievance redressal system.


Model Details

  • Architecture: XLM-RoBERTa-base shared encoder with 2 classification heads
  • Training dataset: Custom-labeled Nepali grievances
  • Loss: Sum of the cross-entropy losses for the department and urgency heads
  • Evaluation metric: Weighted F1 score

  Task                        Weighted F1
  Department Classification   0.853
  Urgency Classification      0.684
  Overall                     0.768
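
The summed loss listed under Model Details can be sketched in plain Python as follows. This is only an illustration, not the training code: it assumes the two terms are added with equal weight and no class weighting or label smoothing, which the card does not specify. The function names and the example logits are hypothetical. In PyTorch, the same idea would amount to adding the outputs of two `torch.nn.CrossEntropyLoss` calls, one per head.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multitask_loss(dept_logits, urg_logits, dept_label, urg_label):
    """Unweighted sum of the per-head losses (equal weighting is an assumption)."""
    return cross_entropy(dept_logits, dept_label) + cross_entropy(urg_logits, urg_label)


# Hypothetical logits for one grievance: 4 department classes, 3 urgency classes
dept_logits = [2.0, 0.1, -1.0, 0.3]
urg_logits = [0.2, 0.5, 1.5]

loss = multitask_loss(dept_logits, urg_logits, dept_label=0, urg_label=2)
print(f"joint loss: {loss:.4f}")
```

Because both heads share one encoder, gradients from both terms flow into the same XLM-RoBERTa weights. An unweighted sum treats the two tasks as equally important during training.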

Usage

from transformers import AutoTokenizer
# Custom classes: requires the project's src/modeling_multitask.py to be importable
from src.modeling_multitask import MultiTaskForSequenceClassification, MultiTaskConfig
import torch

# Hugging Face repo ID
repo = "mr-kush/sambodhan-multitask-xlmroberta"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(repo)
config = MultiTaskConfig.from_pretrained(repo)
model = MultiTaskForSequenceClassification.from_pretrained(repo, config=config)
model.eval()

# Example input
text = "Water supply cut off in my area for 3 days"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Inference
with torch.no_grad():
    out = model(**inputs)

# Extract predictions
dept_logits, urg_logits = out["logits"]
dept_pred = torch.argmax(dept_logits, dim=-1).item()
urg_pred = torch.argmax(urg_logits, dim=-1).item()

print("Department:", dept_pred, "Urgency:", urg_pred)
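
The snippet above prints class indices rather than names. A minimal sketch of mapping them back to labels might look like the following. The label order here is an assumption: the card lists example class names but not their index order, so `DEPARTMENT_LABELS`, `URGENCY_LABELS`, and `decode_prediction` are hypothetical and should be checked against the label mapping used in training.

```python
# Hypothetical label orders; verify against the mapping used during training.
DEPARTMENT_LABELS = ["Water", "Electricity", "Roads", "Others"]
URGENCY_LABELS = ["Low", "Medium", "High"]


def decode_prediction(dept_pred, urg_pred):
    """Translate the two predicted class indices into readable labels."""
    return {
        "department": DEPARTMENT_LABELS[dept_pred],
        "urgency": URGENCY_LABELS[urg_pred],
    }


# Example: indices as produced by torch.argmax(...).item() in the usage snippet
print(decode_prediction(0, 2))
```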