Model Card for Multi‑Label Emotion Classification on Reddit Comments
This repository contains training and inference code for multi‑label emotion classification of Reddit comments using the GoEmotions dataset (27 emotions + neutral) with a RoBERTa‑base encoder. It includes a configuration‑driven training script, evaluation, decision‑threshold tuning, and a lightweight inference entrypoint.
Repository: https://github.com/amirhossein-yousefi/multi-label-emotion-classification-reddit-comments
Model Details
Model Description
This project fine‑tunes a Transformer encoder for multi‑label emotion detection on Reddit comments. The default configuration uses roberta-base, binary cross‑entropy loss (optionally focal loss), and grid‑search threshold tuning on the validation set.
- Developed by: GitHub @amirhossein-yousefi
 - Model type: Multi‑label text classification (Transformer encoder)
 - Language(s) (NLP): English
 - License: No explicit license file was found in the repository; treat as “all rights reserved” unless the author adds a license.
- Finetuned from model: `roberta-base`
Model Sources
- Repository: https://github.com/amirhossein-yousefi/multi-label-emotion-classification-reddit-comments
 - Paper [dataset]: GoEmotions: A Dataset of Fine‑Grained Emotions (Demszky et al., 2020)
 
Uses
Direct Use
- Tagging short English texts (e.g., social posts, comments) with multiple emotions from the GoEmotions taxonomy (e.g., joy, sadness, anger, admiration, gratitude, etc.).
 - Exploratory analytics and visualization of emotion distributions in corpora similar to Reddit.
 
Downstream Use
- Fine‑tuning or domain adaptation to platforms beyond Reddit (forums, support tickets, app reviews).
 - Serving as a baseline component in moderation pipelines or empathetic response systems (with careful human oversight).
 
Out‑of‑Scope Use
- Medical, psychological, or diagnostic use; mental‑health inference.
 - High‑stakes decisions (employment, lending, safety) without rigorous, domain‑specific validation.
 - Non‑English or heavily code‑switched text without additional training/testing.
 
Bias, Risks, and Limitations
- Dataset origin: GoEmotions is built from Reddit comments; models may inherit Reddit‑specific discourse, slang, and toxicity patterns and may underperform on other domains.
 - Annotation noise: Third‑party analyses have raised concerns about mislabels in GoEmotions; treat labels as imperfect and consider human review for critical use cases.
 - Multi‑label uncertainty: Threshold choice materially affects precision/recall trade‑offs. The repo tunes the threshold on validation data; you should recalibrate for your domain.
 
Recommendations
- Calibrate thresholds on in‑domain validation data (the repo grid‑searches 0.05–0.95); a minimal calibration sketch follows this list.
 - Report per‑label metrics, especially for minority emotions.
 - Consider bias audits and human‑in‑the‑loop review before deployment.
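A minimal, stand‑alone sketch of that calibration step is shown below; the function name and array shapes are illustrative assumptions, not the repository's actual API.

```python
# Minimal sketch: grid-search one decision threshold on validation data.
# `val_probs` are sigmoid outputs of shape (n_samples, n_labels);
# `val_targets` are the matching multi-hot labels (same shape).
import numpy as np
from sklearn.metrics import f1_score

def tune_threshold(val_probs: np.ndarray, val_targets: np.ndarray,
                   lo: float = 0.05, hi: float = 0.95, step: float = 0.01) -> float:
    """Return the threshold in [lo, hi] that maximizes validation micro-F1."""
    best_t, best_f1 = lo, -1.0
    for t in np.arange(lo, hi + 1e-9, step):
        preds = (val_probs >= t).astype(int)
        score = f1_score(val_targets, preds, average="micro", zero_division=0)
        if score > best_f1:
            best_t, best_f1 = float(t), score
    return best_t
```

Re‑run the search whenever the domain shifts, and report the chosen threshold alongside your metrics.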
 
How to Get Started with the Model
Environment
- Python ≥ 3.13
- Install dependencies: `pip install -r requirements.txt`
Train
The Makefile provides a default `train` target, which runs:
python -m emoclass.train --config configs/base.yaml
Inference
After training (or pointing to a trained directory), run:
python -m emoclass.inference --model_dir outputs/goemotions_roberta --text "I love this!" "This is awful."
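For programmatic use, the sketch below loads the saved checkpoint directly. It assumes `outputs/goemotions_roberta` is a standard Hugging Face checkpoint whose config carries `id2label` for the 28 GoEmotions classes; it is not the repository's own inference API.

```python
# Programmatic inference sketch (assumes a standard Hugging Face checkpoint directory).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = "outputs/goemotions_roberta"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()

texts = ["I love this!", "This is awful."]
batch = tokenizer(texts, truncation=True, max_length=192, padding=True, return_tensors="pt")
with torch.no_grad():
    probs = torch.sigmoid(model(**batch).logits)

threshold = 0.84  # validation-tuned value from the example run; recalibrate for your domain
for text, p in zip(texts, probs):
    tags = [model.config.id2label[i] for i, v in enumerate(p) if v >= threshold]
    print(text, "->", tags or "no label above threshold")
```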
Training Details
Training Data
- Dataset: GoEmotions (27 emotions + neutral). The default config uses the `simplified` variant.
- Text column: `text`
- Labels column: `labels`
- Max sequence length: 192
 
Training Procedure
Preprocessing
- Standard Transformer tokenization for `roberta-base`.
- Multi‑hot label encoding for emotions.
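A rough sketch of these two steps, assuming the Hub dataset `go_emotions` with the `simplified` configuration; the helper below is illustrative, not the repository's actual code.

```python
# Preprocessing sketch: tokenize with the roberta-base tokenizer and multi-hot encode labels.
import numpy as np
from datasets import load_dataset
from transformers import AutoTokenizer

raw = load_dataset("go_emotions", "simplified")
num_labels = raw["train"].features["labels"].feature.num_classes  # 27 emotions + neutral = 28
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def preprocess(batch):
    enc = tokenizer(batch["text"], truncation=True, max_length=192)
    # Multi-hot encode: one float vector of length `num_labels` per example.
    multi_hot = np.zeros((len(batch["labels"]), num_labels), dtype=np.float32)
    for i, ids in enumerate(batch["labels"]):
        multi_hot[i, ids] = 1.0
    enc["labels"] = multi_hot.tolist()
    return enc

encoded = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)
```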
 
Training Hyperparameters
- Base model: `roberta-base`
- Batch size: 16 (train), 32 (eval)
 - Learning rate: 2e‑5
 - Epochs: 5
 - Weight decay: 0.01
 - Warmup ratio: 0.06
 - Gradient accumulation: 1
 - Precision: bf16/fp16 if available
 - Loss: Binary Cross‑Entropy (optionally focal loss with γ=2.0, α=0.25); a generic focal‑loss sketch follows this list
 - Threshold tuning: grid search from 0.05 to 0.95 (step 0.01), selecting the threshold that maximizes validation micro‑F1 (0.84 in the example run)
 - LoRA/PEFT: available in config (default off)
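For reference, a generic multi‑label focal loss with these defaults could look like the sketch below; this is an illustrative formulation, not necessarily the repository's exact implementation.

```python
# Illustrative binary focal loss over independent sigmoid outputs (one per emotion),
# with gamma=2.0 and alpha=0.25 as in the config above.
import torch
import torch.nn.functional as F

def focal_bce_with_logits(logits: torch.Tensor, targets: torch.Tensor,
                          gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balancing weight
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```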
 
Speeds, Sizes, Times
- See `results.txt` for an example run’s timing and throughput logs.
Evaluation
Testing Data, Factors & Metrics
- Test split: GoEmotions `simplified` test split.
- Metrics: micro/macro/sample F1, micro/macro Average Precision (AP), micro/macro ROC‑AUC.
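These metrics can be reproduced with scikit‑learn roughly as follows; the array names and shapes are assumptions about your evaluation outputs.

```python
# Sketch: compute the reported metric suite from probabilities and multi-hot targets,
# each of shape (n_samples, 28).
import numpy as np
from sklearn.metrics import f1_score, average_precision_score, roc_auc_score

def evaluate(test_probs: np.ndarray, test_targets: np.ndarray, threshold: float) -> dict:
    preds = (test_probs >= threshold).astype(int)
    return {
        "f1_micro": f1_score(test_targets, preds, average="micro", zero_division=0),
        "f1_macro": f1_score(test_targets, preds, average="macro", zero_division=0),
        "f1_samples": f1_score(test_targets, preds, average="samples", zero_division=0),
        "ap_micro": average_precision_score(test_targets, test_probs, average="micro"),
        "ap_macro": average_precision_score(test_targets, test_probs, average="macro"),
        "roc_auc_micro": roc_auc_score(test_targets, test_probs, average="micro"),
        "roc_auc_macro": roc_auc_score(test_targets, test_probs, average="macro"),
    }
```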
 
Results (example run)
- Threshold (val‑tuned): 0.84
 - F1 (micro): 0.5284
 - F1 (macro): 0.4995
 - F1 (samples): 0.5301
 - AP (micro): 0.5352
 - AP (macro): 0.5087
 - ROC‑AUC (micro): 0.9517
 - ROC‑AUC (macro): 0.9310
 
(See results.txt for the full log and any updates.)
Model Examination
- Inspect per‑label thresholds and confusion patterns; minority emotions (e.g., grief, pride, nervousness) often suffer lower F1 and need more tuning or class‑balancing strategies.
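A small sketch for surfacing the weakest labels (variable names are assumptions about your evaluation outputs):

```python
# Per-label F1, sorted ascending so minority emotions with weak scores show up first.
from sklearn.metrics import f1_score

def per_label_f1(test_probs, test_targets, label_names, threshold=0.84):
    preds = (test_probs >= threshold).astype(int)
    scores = f1_score(test_targets, preds, average=None, zero_division=0)
    for name, score in sorted(zip(label_names, scores), key=lambda pair: pair[1]):
        print(f"{name:<15} F1 = {score:.3f}")
```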
 
Environmental Impact
- Not measured. If desired, log GPU type, hours, region, and estimate emissions using the ML CO2 calculator.
 
Technical Specifications
Model Architecture and Objective
- Transformer encoder (`roberta-base`) fine‑tuned with a sigmoid multi‑label head and BCE (or focal) loss.
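In `transformers` terms, this objective corresponds roughly to the minimal setup below, assuming 28 labels; the repository's actual model construction may differ.

```python
# Minimal objective sketch: roberta-base with a classification head trained as
# multi-label (sigmoid + BCEWithLogitsLoss) via `problem_type`; 28 = 27 emotions + neutral.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=28,
    problem_type="multi_label_classification",
)
# With float multi-hot `labels`, calling model(..., labels=labels) returns a BCE-with-logits
# loss; for the optional focal loss, compute the loss from `outputs.logits` instead.
```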
Compute Infrastructure
- Frameworks: `transformers`, `datasets`, `accelerate`, `evaluate`, `scikit-learn`, and optionally `peft`.
- Hardware/software specifics are user‑dependent.
 
Citation
GoEmotions (dataset/paper):
Demszky, D., Movshovitz-Attias, D., Ko, J., Cowen, A., Nemade, G., & Ravi, S. (2020). GoEmotions: A Dataset of Fine‑Grained Emotions. ACL 2020. https://arxiv.org/abs/2005.00547
BibTeX:
@inproceedings{demszky2020goemotions,
  title={GoEmotions: A Dataset of Fine-Grained Emotions},
  author={Demszky, Dorottya and Movshovitz-Attias, Dana and Ko, Jeongwoo and Cowen, Alan and Nemade, Gaurav and Ravi, Sujith},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  year={2020}
}
Glossary
- AP: Average Precision (area under precision–recall curve).
 - AUC: Area under ROC curve.
 - Micro/Macro F1: Micro aggregates over all labels; macro averages per‑label F1.
 
More Information
- The configuration file at `configs/base.yaml` documents tweakable knobs (loss type, LoRA, precision, etc.).
- Artifacts are saved under `outputs/` by default.
Model Card Authors
- Original code: @amirhossein-yousefi
 - Model card: generated programmatically for documentation purposes.
 
Model Card Contact
- Open an issue in the GitHub repository.
 