---
language:
- cy
license: mit
library_name: peft
tags:
- peft
- feature-extraction
datasets:
- webnlg-challenge/web_nlg
metrics:
- precision
- recall
base_model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
---

# MonoLR_cym_Latn_PR
This is the Welsh (cym_Latn) monolingual LoRA adapter from [Semantic Evaluation of Multilingual Data-to-Text Generation via NLI Fine-Tuning: Precision, Recall and F1 scores](https://hal.science/hal-05138142v1), used to compute Semantic Precision and Semantic Recall scores for RDF-to-Text generation.

# Use
The following is minimal code to compute the Semantic Precision, Semantic Recall, and Semantic F1 of a generated Welsh text:

```
import torch
from sentence_transformers import CrossEncoder

# Load the base NLI model and reconfigure it to output a single entailment score
model = CrossEncoder('MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7')
model.config.num_labels = 1
model.default_activation_function = torch.nn.Sigmoid()

# Attach the Welsh (cym_Latn) monolingual LoRA adapter
model.model.load_adapter('WilliamSotoM/MonoLR_cym_Latn_PR')

graph = '[S]Buzz_Aldrin[P]mission[O]Apollo_12[T][S]Buzz_Aldrin[P]birthPlace[O]Glen_Ridge,_New_Jersey'
text = 'Roedd Buzz Aldrin yn rhan o griw Apollo 12.'  # "Buzz Aldrin was part of the Apollo 12 crew."

# Precision: how much of the text's content is supported by the graph
precision = model.predict([(graph, text)])[0]
# Recall: how much of the graph's content is expressed in the text
recall = model.predict([(text, graph)])[0]

# Semantic F1 is the harmonic mean of Semantic Precision and Semantic Recall
f1 = (2*precision*recall)/(precision+recall)

print(f'Precision: {precision:.4f}')
print(f'Recall: {recall:.4f}')
print(f'F1: {f1:.4f}')
```

Expected output:
```
Precision: 0.9986
Recall: 0.4279
F1: 0.5991
```
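
The `graph` string above is a linearized RDF graph: each triple is written as `[S]subject[P]predicate[O]object`, and consecutive triples are separated by `[T]`. As an illustration only, here is a minimal sketch of a hypothetical `linearize` helper (not part of this repository) that builds this format from (subject, predicate, object) tuples:

```
# Hypothetical helper (illustration only): build the linearized graph
# string used above from (subject, predicate, object) triples.
def linearize(triples):
    return '[T]'.join(f'[S]{s}[P]{p}[O]{o}' for s, p, o in triples)

graph = linearize([
    ('Buzz_Aldrin', 'mission', 'Apollo_12'),
    ('Buzz_Aldrin', 'birthPlace', 'Glen_Ridge,_New_Jersey'),
])
# -> '[S]Buzz_Aldrin[P]mission[O]Apollo_12[T][S]Buzz_Aldrin[P]birthPlace[O]Glen_Ridge,_New_Jersey'
```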

Analysis:
High precision means all the content in the text comes from the graph (i.e. no additions/hallucinations).
A recall of about one half means roughly half the content from the graph is missing from the text (i.e. some omissions): here the text expresses the mission triple but omits the birthPlace triple.
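
For scoring several texts, the two `predict` calls and the F1 computation can be wrapped in a small helper. This is a minimal sketch built on the code above; the function name `semantic_scores` is our own and not part of the paper's released code:

```
# Hypothetical wrapper (illustration only) around the calls shown above.
def semantic_scores(model, graph, text):
    precision = model.predict([(graph, text)])[0]  # text content supported by graph?
    recall = model.predict([(text, graph)])[0]     # graph content expressed in text?
    f1 = (2*precision*recall)/(precision+recall) if precision + recall else 0.0
    return precision, recall, f1

precision, recall, f1 = semantic_scores(model, graph, text)
```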