metadata
language: ar
license: mit
tags:
  - arabic
  - hate-speech-detection
  - bert
  - text-classification
  - pytorch
datasets:
  - arabic-levantine-hate-speech-detection
metrics:
  - accuracy
  - f1
model-index:
  - name: arabic-bert-hate-speech-detection
    results:
      - task:
          type: text-classification
          name: Hate Speech Detection
        dataset:
          type: arabic-levantine-hate-speech-detection
          name: Arabic Levantine Hate Speech Detection
        metrics:
          - type: accuracy
            value: 0.845
            name: Accuracy
          - type: f1
            value: 0.84
            name: F1 Score

Arabic BERT Hate Speech Detection

This model is a fine-tuned version of aubmindlab/bert-base-arabertv2 for Arabic hate speech detection.

Model Description

  • Base Model: aubmindlab/bert-base-arabertv2
  • Task: Binary text classification (Normal vs Hate Speech)
  • Language: Arabic
  • Accuracy: 84.5%

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "Ibracadabra13/arabic-bert-hate-speech-detection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

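# Classify a single Arabic text as 'Normal' or 'Hate Speech'
def predict_hate_speech(text):
    # Tokenize; inputs longer than 128 tokens are truncated
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)

    # Inference only, so gradient tracking is disabled
    with torch.no_grad():
        outputs = model(**inputs)
        # Convert logits to class probabilities
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class = torch.argmax(predictions, dim=-1).item()
        confidence = predictions[0][predicted_class].item()

    # Class indices: 0 = Normal, 1 = Hate Speech
    label_map = {0: 'Normal', 1: 'Hate Speech'}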
    return {
        'prediction': label_map[predicted_class],
        'confidence': confidence,
        'is_hate_speech': predicted_class == 1
    }

# Example usage (the input roughly translates to "You are a despicable animal")
result = predict_hate_speech("أنت حيوان حقير")
print(result)  # e.g. {'prediction': 'Hate Speech', 'confidence': 0.97, 'is_hate_speech': True}

Training Details

  • Training Data: Arabic Levantine Hate Speech Detection Dataset
  • Training Method: Fine-tuning with a manual training loop
  • Epochs: 2
  • Batch Size: 4
  • Learning Rate: 2e-5
  • Optimizer: AdamW
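
The card gives the hyperparameters but not the training script itself. A manual loop with these settings might look roughly like the sketch below. The function name `fine_tune`, the batch layout (a dict of tensors that includes `labels`), and the absence of a learning-rate scheduler or gradient clipping are all assumptions for illustration, not details taken from the original training code.

```python
import torch
from torch.optim import AdamW


def fine_tune(model, dataloader, epochs=2, lr=2e-5, device="cpu"):
    """Hypothetical manual fine-tuning loop; returns the mean loss per epoch."""
    model.to(device)
    model.train()
    optimizer = AdamW(model.parameters(), lr=lr)
    epoch_losses = []

    for _ in range(epochs):
        running_loss = 0.0
        for batch in dataloader:
            # Assumption: each batch is a dict of tensors, e.g. input_ids,
            # attention_mask and labels (batch size 4 in the card).
            batch = {name: tensor.to(device) for name, tensor in batch.items()}
            optimizer.zero_grad()
            # Sequence classification models in transformers return a loss
            # when labels are passed alongside the inputs.
            outputs = model(**batch)
            outputs.loss.backward()
            optimizer.step()
            running_loss += outputs.loss.item()
        epoch_losses.append(running_loss / len(dataloader))

    model.eval()  # switch back to inference mode
    return epoch_losses
```

Any iterable of batch dicts with a length works as `dataloader`, so a `torch.utils.data.DataLoader` built with `batch_size=4` would match the settings listed above.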

Performance

  • Accuracy: 84.5%
  • Normal Text: 83% precision, 96% recall
  • Hate Speech: 90% precision, 65% recall
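
The per-class F1 scores are not listed, but they follow directly from the precision and recall figures above. The short calculation below derives them. It also backs out the share of hate-speech examples implied by the reported accuracy, which is an inference from rounded numbers and not a figure stated in this card.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Per-class figures as reported in this card
f1_normal = f1(0.83, 0.96)
f1_hate = f1(0.90, 0.65)
macro_f1 = (f1_normal + f1_hate) / 2

# Accuracy is the support-weighted mean of per-class recall, so the
# hate-speech share of the evaluation set can be estimated from it.
# This is an assumption-laden estimate based on rounded inputs.
accuracy = 0.845
hate_share = (0.96 - accuracy) / (0.96 - 0.65)
weighted_f1 = (1 - hate_share) * f1_normal + hate_share * f1_hate

print(f"F1 normal={f1_normal:.2f}, F1 hate speech={f1_hate:.2f}")
print(f"macro F1={macro_f1:.2f}, weighted F1={weighted_f1:.2f}")
print(f"implied hate-speech share={hate_share:.2f}")
```

Under these assumptions the support-weighted F1 comes out close to the 0.84 listed in the metadata, which suggests, without confirming, that the reported F1 is a weighted average. The macro average is lower because hate-speech recall pulls that class's F1 down.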

Limitations

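This model was trained on a single Levantine Arabic dataset and may not generalize well to other Arabic dialects or contexts. Its recall on hate speech is 65%, so roughly a third of the hateful texts in the evaluation were classified as normal. Use with caution in production environments.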
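
One cautious way to deploy a classifier with modest hate-speech recall is to apply an explicit decision threshold to the hate-speech probability, as an alternative to taking the most likely class. The sketch below illustrates the idea in plain Python. The function name `flag_hate_speech`, the threshold values, and the example scores are hypothetical. A suitable threshold would have to be chosen on held-out data.

```python
def flag_hate_speech(hate_probability, threshold=0.5):
    """Return True when the hate-speech probability reaches the threshold."""
    if not 0.0 <= hate_probability <= 1.0:
        raise ValueError("hate_probability must be between 0 and 1")
    return hate_probability >= threshold


# Hypothetical class-1 probabilities, e.g. softmax outputs for four texts
scores = [0.12, 0.41, 0.58, 0.93]

# A threshold of 0.5 roughly matches picking the most likely of two classes
default_flags = [flag_hate_speech(s) for s in scores]

# A lower threshold flags more texts: higher recall, lower precision
sensitive_flags = [flag_hate_speech(s, threshold=0.35) for s in scores]

print(default_flags)
print(sensitive_flags)
```

Lowering the threshold trades precision for recall, so borderline texts are better routed to human review than acted on automatically.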