# Sambodhan MultiTask XLM-RoBERTa
Fine-tuned XLM-RoBERTa for multi-task classification of citizen grievances in Nepali:
- Department classification: 4 classes (e.g., Water, Electricity, Roads, Others)
- Urgency classification: 3 classes (Low, Medium, High)
This model is designed for grievance routing and prioritization as part of the Sambodhan AI grievance redressal system.
## Model Details
- Architecture: XLM-RoBERTa-base shared encoder with two classification heads, one per task (a minimal sketch follows the results table)
- Training data: custom-labeled Nepali citizen grievances
- Loss: sum of the cross-entropy losses of the department and urgency heads
- Evaluation metric: weighted F1 score
| Task | Weighted F1 | 
|---|---|
| Department Classification | 0.853 | 
| Urgency Classification | 0.684 | 
| Overall | 0.768 | 
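The head layout and summed loss described above can be illustrated with a minimal sketch. This is an assumption-laden illustration (simple `<s>`-token pooling, no dropout); the released implementation in `src/modeling_multitask.py` may differ in such details.

```python
# Minimal sketch of the multi-task setup: a shared XLM-RoBERTa encoder with
# two linear heads whose cross-entropy losses are summed. Illustrative only;
# the actual implementation lives in src/modeling_multitask.py.
import torch
import torch.nn as nn
from transformers import AutoModel

class MultiTaskSketch(nn.Module):
    def __init__(self, encoder_name="xlm-roberta-base", num_departments=4, num_urgency=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.dept_head = nn.Linear(hidden, num_departments)
        self.urgency_head = nn.Linear(hidden, num_urgency)

    def forward(self, input_ids, attention_mask, dept_labels=None, urgency_labels=None):
        # Pool the sentence with the first (<s>) token representation.
        hidden_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        pooled = hidden_states[:, 0]

        dept_logits = self.dept_head(pooled)
        urg_logits = self.urgency_head(pooled)

        loss = None
        if dept_labels is not None and urgency_labels is not None:
            ce = nn.CrossEntropyLoss()
            # Total loss = department CE + urgency CE, as stated in the card.
            loss = ce(dept_logits, dept_labels) + ce(urg_logits, urgency_labels)
        return {"loss": loss, "logits": (dept_logits, urg_logits)}
```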
## Usage
```python
from transformers import AutoTokenizer
import torch

# The custom model class is imported from src/modeling_multitask.py;
# make sure that module from the project is importable before running this.
from src.modeling_multitask import MultiTaskForSequenceClassification, MultiTaskConfig

# Hugging Face repo ID
repo = "mr-kush/sambodhan-multitask-xlmroberta"

# Load tokenizer, config, and model
tokenizer = AutoTokenizer.from_pretrained(repo)
config = MultiTaskConfig.from_pretrained(repo)
model = MultiTaskForSequenceClassification.from_pretrained(repo, config=config)
model.eval()

# Example input
text = "Water supply cut off in my area for 3 days"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Inference
with torch.no_grad():
    out = model(**inputs)

# The model returns one logits tensor per task: (department, urgency)
dept_logits, urg_logits = out["logits"]
dept_pred = torch.argmax(dept_logits, dim=-1).item()
urg_pred = torch.argmax(urg_logits, dim=-1).item()
print("Department:", dept_pred, "Urgency:", urg_pred)
```
