
roberta-fnf-taglish-v1

RoBERTa-Tagalog fine-tuned for binary fake-news detection (labels: real, fake).

Training setup

  • Cluster-disjoint splits on cleaned FNF corpus
  • Train-only Taglish paraphrase augmentation
  • Base tokenizer: jcblaise/roberta-tagalog-base
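A cluster-disjoint split means that articles belonging to the same cluster (e.g. near-duplicates of one story) never land on both sides of the train/test boundary, which prevents leakage from inflating the test scores. A minimal sketch of such a split using scikit-learn's `GroupShuffleSplit`; the texts, labels, and cluster ids below are illustrative placeholders, not the actual FNF data:

```python
# Sketch of a cluster-disjoint split: items sharing a cluster id
# (e.g. near-duplicate article groups) never appear in both splits.
from sklearn.model_selection import GroupShuffleSplit

texts = ["a1", "a2", "a3", "a4", "a5", "a6"]   # placeholder articles
labels = [0, 1, 0, 1, 0, 1]                    # placeholder labels
clusters = [0, 0, 1, 1, 2, 2]                  # cluster id per article

splitter = GroupShuffleSplit(n_splits=1, test_size=0.34, random_state=42)
train_idx, test_idx = next(splitter.split(texts, labels, groups=clusters))

# Verify: no cluster appears on both sides of the split.
train_clusters = {clusters[i] for i in train_idx}
test_clusters = {clusters[i] for i in test_idx}
assert train_clusters.isdisjoint(test_clusters)
```

Note that train-only augmentation (like the Taglish paraphrasing above) must be applied after this split, so paraphrases of test articles never enter the training set.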

Quickstart

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo = 'renshhhh/roberta-fnf-taglish-v1'
tok = AutoTokenizer.from_pretrained(repo, use_fast=True)
mdl = AutoModelForSequenceClassification.from_pretrained(repo)
mdl.eval()

# Example input (Tagalog): "This is just a sample news item, not a real article."
text = 'Ito ay balitang halimbawa lang, hindi totoong artikulo.'
batch = tok(text, return_tensors='pt', truncation=True, max_length=256)
with torch.no_grad():
    probs = mdl(**batch).logits.softmax(-1).tolist()[0]

# Map class indices to label names and print per-label probabilities
id2label = mdl.config.id2label
print({id2label[i]: float(probs[i]) for i in range(len(probs))})
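The quickstart prints a dict of per-label probabilities. If you need a single decision, a minimal post-processing sketch that picks the top label and abstains below a confidence threshold; the `decide` helper and the 0.7 threshold are illustrative choices, not part of the model:

```python
# Turn a label -> probability dict (as printed by the quickstart)
# into one prediction, abstaining when the top probability is low.
def decide(probs: dict, threshold: float = 0.7) -> str:
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return label if p >= threshold else "uncertain"

print(decide({"real": 0.12, "fake": 0.88}))  # confident: returns "fake"
print(decide({"real": 0.55, "fake": 0.45}))  # low margin: returns "uncertain"
```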

Eval (cluster-disjoint test)

  • Accuracy = 0.947
  • Weighted F1 = 0.947
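For reproducing these metrics on your own predictions, a sketch using scikit-learn; `y_true` and `y_pred` below are toy values, not the actual cluster-disjoint test set:

```python
# Accuracy and weighted F1 as reported above, via scikit-learn.
# Weighted F1 averages per-class F1 scores weighted by class support.
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 1, 0, 1, 0]  # placeholder gold labels
y_pred = [0, 1, 0, 0, 1, 0]  # placeholder model predictions

acc = accuracy_score(y_true, y_pred)
wf1 = f1_score(y_true, y_pred, average="weighted")
print(f"accuracy={acc:.3f} weighted_f1={wf1:.3f}")
```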
Model size: ~0.1B parameters (Safetensors, F32 tensors)