---
library_name: transformers
language:
  - mt
license: cc-by-nc-sa-4.0
base_model: MLRS/BERTu
model-index:
  - name: BERTu_taxi1500-mlt
    results:
      - task:
          type: text-classification
          name: Topic Classification
        dataset:
          type: taxi1500-mlt_Latn
          name: taxi1500
          config: mlt_Latn
        metrics:
          - type: f1
            args: macro
            value: 78.83
            name: Macro-averaged F1
        source:
          name: MELABench Leaderboard
          url: https://huggingface.co/spaces/MLRS/MELABench
extra_gated_fields:
  Name: text
  Surname: text
  Date of Birth: date_picker
  Organisation: text
  Country: country
  I agree to use this model in accordance to the license and for non-commercial use ONLY: checkbox
---

# BERTu (Taxi1500 Maltese)

This model is a fine-tuned version of [MLRS/BERTu](https://huggingface.co/MLRS/BERTu) on the Taxi1500 dataset. It achieves the following results on the test set:

- Loss: 0.9589
- F1 (macro): 0.7883
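
As a minimal usage sketch (not part of the original card), the fine-tuned classifier can be loaded through the Transformers `pipeline` API. The repository id `MLRS/BERTu_taxi1500-mlt` is assumed from the card title and the `MLRS/BERTu` base model, and the example sentence is an arbitrary Maltese input used only for illustration.

```python
from transformers import pipeline

# Assumed repository id, based on the card title and the MLRS/BERTu base model.
classifier = pipeline("text-classification", model="MLRS/BERTu_taxi1500-mlt")

# Taxi1500 is a topic-classification task over Bible verses; the model returns
# the most likely topic label for a Maltese input sentence.
print(classifier("Il-Mulej hu r-ragħaj tiegħi, ma jonqosni xejn."))
```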

## Intended uses & limitations

The model is fine-tuned on a specific task and should only be used for the same or a similar task. Any limitations present in the base model are inherited.

## Training procedure

The model was fine-tuned using a customised script.

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto the `Trainer` API follows the list):

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 2
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: inverse_sqrt
- lr_scheduler_warmup_ratio: 0.005
- num_epochs: 200.0
- early_stopping_patience: 20
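
The fine-tuning used a customised script that is not reproduced here; the sketch below shows roughly how the hyperparameters above map onto the standard `TrainingArguments` API. Values not stated above (such as the early-stopping metric and the evaluation/saving strategy) are assumptions.

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# A sketch only: the actual training used a customised script.
training_args = TrainingArguments(
    output_dir="BERTu_taxi1500-mlt",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=2,
    optim="adamw_torch",               # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="inverse_sqrt",
    warmup_ratio=0.005,
    num_train_epochs=200,
    eval_strategy="epoch",             # one evaluation per epoch, as in the results table
    save_strategy="epoch",
    load_best_model_at_end=True,       # required for early stopping
    metric_for_best_model="f1",        # assumed early-stopping metric
)

# Passed to the Trainer via `callbacks=[...]`; stops training once the
# monitored metric has not improved for 20 consecutive evaluations.
early_stopping = EarlyStoppingCallback(early_stopping_patience=20)
```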

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1     |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| No log        | 1.0   | 54   | 1.3509          | 0.3101 |
| No log        | 2.0   | 108  | 0.9772          | 0.5176 |
| No log        | 3.0   | 162  | 0.8594          | 0.5698 |
| No log        | 4.0   | 216  | 0.8265          | 0.5955 |
| No log        | 5.0   | 270  | 0.8582          | 0.6333 |
| No log        | 6.0   | 324  | 0.9635          | 0.6319 |
| No log        | 7.0   | 378  | 1.0398          | 0.6322 |
| No log        | 8.0   | 432  | 1.1267          | 0.6378 |
| No log        | 9.0   | 486  | 1.2027          | 0.6437 |
| 0.4687        | 10.0  | 540  | 1.2412          | 0.6258 |
| 0.4687        | 11.0  | 594  | 1.2330          | 0.6517 |
| 0.4687        | 12.0  | 648  | 1.2771          | 0.6376 |
| 0.4687        | 13.0  | 702  | 1.3023          | 0.6315 |
| 0.4687        | 14.0  | 756  | 1.3134          | 0.6432 |
| 0.4687        | 15.0  | 810  | 1.3096          | 0.6498 |
| 0.4687        | 16.0  | 864  | 1.3191          | 0.6517 |
| 0.4687        | 17.0  | 918  | 1.3278          | 0.6725 |
| 0.4687        | 18.0  | 972  | 1.3554          | 0.6791 |
| 0.004         | 19.0  | 1026 | 1.3726          | 0.6641 |
| 0.004         | 20.0  | 1080 | 1.3710          | 0.6791 |
| 0.004         | 21.0  | 1134 | 1.3812          | 0.7017 |
| 0.004         | 22.0  | 1188 | 1.4026          | 0.6967 |
| 0.004         | 23.0  | 1242 | 1.4099          | 0.6950 |
| 0.004         | 24.0  | 1296 | 1.4162          | 0.6858 |
| 0.004         | 25.0  | 1350 | 1.4240          | 0.6932 |
| 0.004         | 26.0  | 1404 | 1.4322          | 0.6932 |
| 0.004         | 27.0  | 1458 | 1.3757          | 0.6903 |
| 0.0017        | 28.0  | 1512 | 1.4508          | 0.7084 |
| 0.0017        | 29.0  | 1566 | 1.4679          | 0.7084 |
| 0.0017        | 30.0  | 1620 | 1.4591          | 0.7084 |
| 0.0017        | 31.0  | 1674 | 1.4587          | 0.7084 |
| 0.0017        | 32.0  | 1728 | 1.4830          | 0.7084 |
| 0.0017        | 33.0  | 1782 | 1.4778          | 0.7084 |
| 0.0017        | 34.0  | 1836 | 1.4931          | 0.7084 |
| 0.0017        | 35.0  | 1890 | 1.5002          | 0.7084 |
| 0.0017        | 36.0  | 1944 | 1.5073          | 0.7084 |
| 0.0017        | 37.0  | 1998 | 1.5027          | 0.7084 |
| 0.0011        | 38.0  | 2052 | 1.4485          | 0.7004 |
| 0.0011        | 39.0  | 2106 | 1.5335          | 0.6895 |
| 0.0011        | 40.0  | 2160 | 1.5393          | 0.7084 |
| 0.0011        | 41.0  | 2214 | 1.5426          | 0.7084 |
| 0.0011        | 42.0  | 2268 | 1.5430          | 0.7084 |
| 0.0011        | 43.0  | 2322 | 1.5405          | 0.7017 |
| 0.0011        | 44.0  | 2376 | 1.5581          | 0.7084 |
| 0.0011        | 45.0  | 2430 | 1.5534          | 0.7011 |
| 0.0011        | 46.0  | 2484 | 1.5674          | 0.7084 |
| 0.0008        | 47.0  | 2538 | 1.5647          | 0.7084 |
| 0.0008        | 48.0  | 2592 | 1.5741          | 0.7084 |
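
The test-set F1 reported at the top of the card is macro-averaged (per the card metadata). Assuming the validation F1 above uses the same metric, a `compute_metrics` function along these lines (an illustrative sketch, not the original customised script) would reproduce it with scikit-learn:

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    # eval_pred is the (logits, labels) pair the Trainer passes at evaluation time.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, predictions, average="macro")}
```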

### Framework versions

- Transformers 4.51.1
- PyTorch 2.7.0+cu126
- Datasets 3.2.0
- Tokenizers 0.21.1

## License

This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/). Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.


## Citation

This work was first presented in [MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP](https://aclanthology.org/2025.findings-acl.1053/). Cite it as follows:

@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt  and
      Borg, Claudia",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}