mbert: Emotion Recognition for Vietnamese Text

This model is a fine-tuned version of bert-base-multilingual-cased on the VSMEC dataset for emotion recognition in Vietnamese text.

Model Details

  • Base Model: bert-base-multilingual-cased
  • Description: Multilingual BERT
  • Dataset: VSMEC (Vietnamese Social Media Emotion Corpus)
  • Fine-tuning Framework: HuggingFace Transformers
  • Task: Emotion Classification (7 classes)

Hyperparameters

  • Batch size: 32
  • Learning rate: 2e-5
  • Epochs: 100
  • Max sequence length: 256
  • Weight decay: 0.01
  • Warmup steps: 500
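
For readers who want to reproduce a similar setup, the values above can be collected as follows. This is a minimal sketch, not the authors' training script: the keyword names follow `transformers.TrainingArguments`, reading "Batch size" as a per-device value is an assumption, and the linear warmup shape is also an assumption, since the card does not name the scheduler.

```python
# Hyperparameters copied from the list above. Keyword names follow
# transformers.TrainingArguments; treating "Batch size" as per-device
# is an assumption, since the card does not say.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 32,
    "learning_rate": 2e-5,
    "num_train_epochs": 100,
    "weight_decay": 0.01,
    "warmup_steps": 500,
}

# Applied at tokenization time, not passed to TrainingArguments.
MAX_SEQ_LENGTH = 256


def warmup_lr(step: int) -> float:
    """Learning rate under an assumed linear warmup from 0 to the peak rate.

    After the warmup phase the value is held at the peak; any later decay
    is not modeled here.
    """
    peak = TRAINING_KWARGS["learning_rate"]
    warmup = TRAINING_KWARGS["warmup_steps"]
    return peak * min(step, warmup) / warmup
```

With Transformers installed, the dictionary could be unpacked as `TrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `"out"` is a placeholder path.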

Dataset

The model was trained on the VSMEC dataset, which contains 6,927 Vietnamese social media text samples annotated with emotion labels. The dataset includes the following emotion categories:

  • Enjoyment (0): Positive emotions, joy, happiness
  • Sadness (1): Sad, disappointed, gloomy feelings
  • Anger (2): Angry, frustrated, irritated
  • Fear (3): Scared, anxious, worried
  • Disgust (4): Disgusted, repelled
  • Surprise (5): Surprised, shocked, amazed
  • Other (6): Neutral or unclassified emotions

Results

The model achieved the following evaluation scores (precision, recall, and F1 are macro-averaged over the 7 classes):

  • Accuracy: 0.5455
  • Macro-F1: 0.5064
  • Macro-Precision: 0.6097
  • Macro-Recall: 0.4803
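
The sketch below shows how macro-averaged scores of this kind are conventionally computed: each metric is calculated per class and then averaged with equal weight per class. It is an illustration on hypothetical toy labels, not the evaluation script used for this model, and `macro_scores` is a name introduced here.

```python
def macro_scores(y_true, y_pred, num_classes):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # A class with no predictions (or no true samples) scores 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / num_classes,
        "macro_recall": sum(recalls) / num_classes,
        "macro_f1": sum(f1s) / num_classes,
    }


# Hypothetical two-class toy labels, for illustration only
scores = macro_scores([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Because Macro-F1 is the mean of the per-class F1 scores, it need not equal the harmonic mean of Macro-Precision and Macro-Recall, which is consistent with the figures reported above.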

Usage

The example below loads the model with the HuggingFace Transformers library and classifies a single Vietnamese sentence:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer from the Hugging Face Hub
model_id = "visolex/emotion-mbert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Example text
text = "Tôi rất vui vì hôm nay trời đẹp!"

# Tokenize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# Predict (inference only, so gradient tracking is disabled)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()

# Map the predicted class index to its emotion name
emotion_map = {
    0: "Enjoyment",
    1: "Sadness",
    2: "Anger",
    3: "Fear",
    4: "Disgust",
    5: "Surprise",
    6: "Other"
}

predicted_emotion = emotion_map[predicted_class]
print(f"Text: {text}")
print(f"Predicted emotion: {predicted_emotion}")
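
When a ranked list of emotions is more useful than a single label, the logits can be converted to probabilities with a softmax. The following is a dependency-free sketch under stated assumptions: `rank_emotions` is a helper name introduced here, and the example logits are made up for illustration rather than taken from the model.

```python
import math

# Label order follows the class indices listed in the Dataset section
EMOTIONS = ["Enjoyment", "Sadness", "Anger", "Fear", "Disgust", "Surprise", "Other"]


def rank_emotions(logits):
    """Convert raw logits into (emotion, probability) pairs, most likely first."""
    # Subtract the maximum before exponentiating for numerical stability
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(EMOTIONS, probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical logits, for illustration only (not real model output)
example_logits = [2.0, 0.5, 0.1, -1.0, -0.5, 0.3, 1.0]
ranking = rank_emotions(example_logits)
```

With the model loaded as in the usage example, the same helper could be called as `rank_emotions(outputs.logits[0].tolist())`.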

Citation

If you use this model, please cite:

@misc{visolex_emotion_mbert,
  title={Multilingual BERT for Vietnamese Emotion Recognition},
  author={ViSoLex Team},
  year={2024},
  url={https://huggingface.co/visolex/emotion-mbert}
}

License

This model is released under the Apache-2.0 license.

Acknowledgments

  • Base model: bert-base-multilingual-cased
  • Dataset: VSMEC (Vietnamese Social Media Emotion Corpus)
  • ViSoLex Toolkit