Yoruba
| Feature | Description |
| --- | --- |
| Name | yo_yordep |
| Version | 0.1.0 |
| spaCy | >=3.8.7,<3.9.0 |
| Default Pipeline | tok2vec, tagger, parser, morphologizer |
| Components | tok2vec, tagger, parser, morphologizer |
| Vectors | 151125 keys, 151125 unique vectors (300 dimensions) |
| Sources | Dataset: https://github.com/UniversalDependencies/UD_Yoruba-YTB; Embeddings: [FastText](https://fasttext.cc/docs/en/crawl-vectors.html) |
| License | apache-2.0 |
| Author | Kolawole Lawal |

Label Scheme

View label scheme (154 labels for 3 components)
Component Labels
tagger ADJ, ADJ__Case=Acc|Number=Sing|Person=1|PronType=Prs, ADJ__Case=Nom|Number=Sing|Person=3|PronType=Prs, ADJ__NumType=Ord, ADJ__Typo=Yes, ADP, ADP__Case=Acc|Number=Sing|Person=1|PronType=Prs, ADP__NumType=Card, ADP__Typo=Yes, ADV, ADV__Typo=Yes, AUX, AUX__Case=Nom|Number=Plur|Person=1|PronType=Prs, AUX__Case=Nom|Number=Sing|Person=1|PronType=Prs, AUX__Case=Nom|Number=Sing|Person=3|PronType=Prs, AUX__Typo=Yes, CCONJ, CCONJ__Case=Acc|Number=Sing|Person=1|PronType=Prs, CCONJ__PronType=Ind, CCONJ__Typo=Yes, DET, DET__Number=Plur|PronType=Dem, NOUN, NOUN__Case=Acc|Number=Sing|Person=1|PronType=Prs, NOUN__Case=Nom|Number=Sing|Person=1|PronType=Prs, NOUN__Typo=Yes, NUM__Case=Acc|Number=Sing|Person=1|PronType=Prs, NUM__NumType=Card, PART, PART__Typo=Yes, PRON, PRON__Case=Acc|Number=Plur|Person=1|PronType=Prs, PRON__Case=Acc|Number=Plur|Person=2|PronType=Prs, PRON__Case=Acc|Number=Plur|Person=3|PronType=Prs, PRON__Case=Acc|Number=Sing|Person=1|PronType=Prs, PRON__Case=Acc|Number=Sing|Person=2|PronType=Prs, PRON__Case=Acc|Number=Sing|Person=3|PronType=Prs, PRON__Case=Gen|Number=Plur|Person=2|PronType=Prs, PRON__Case=Gen|Number=Plur|Person=3|PronType=Prs, PRON__Case=Gen|Number=Sing|Person=2|PronType=Prs, PRON__Case=Gen|Number=Sing|Person=2|PronType=Prs|Typo=Yes, PRON__Case=Gen|Number=Sing|Person=3|PronType=Prs, PRON__Case=Nom|Number=Plur|Person=1|PronType=Prs, PRON__Case=Nom|Number=Plur|Person=2|PronType=Prs, PRON__Case=Nom|Number=Plur|Person=3|PronType=Prs, PRON__Case=Nom|Number=Sing|Person=1|PronType=Prs, PRON__Case=Nom|Number=Sing|Person=2|PronType=Prs, PRON__Case=Nom|Number=Sing|Person=3|PronType=Prs, PRON__PronType=Dem, PRON__PronType=Emp, PRON__PronType=Ind, PRON__PronType=Int, PRON__PronType=Int|Typo=Yes, PRON__PronType=Rel, PRON__PronType=Rel|Typo=Yes, PROPN, PROPN__Case=Nom|Number=Plur|Person=1|PronType=Prs, PROPN__Typo=Yes, PUNCT, SCONJ, SCONJ__Typo=Yes, SYM, VERB, VERB__Typo=Yes, X
parser ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, compound:svc, conj, cop, dep, det, expl, fixed, mark, nmod, nsubj, obj, obl, parataxis, punct
morphologizer POS=ADP, POS=NOUN, POS=DET, POS=VERB, Number=Plur|POS=DET|PronType=Dem, POS=CCONJ, POS=PUNCT, Case=Nom|Number=Sing|POS=PRON|Person=3|PronType=Prs, POS=ADJ, POS=AUX, POS=SCONJ, Case=Acc|Number=Sing|POS=PRON|Person=3|PronType=Prs, POS=ADV, NumType=Ord|POS=ADJ, POS=PRON|PronType=Rel, POS=PRON, Case=Gen|Number=Sing|POS=PRON|Person=3|PronType=Prs, Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs, Case=Nom|Number=Sing|POS=AUX|Person=3|PronType=Prs, NumType=Card|POS=NUM, Case=Gen|Number=Plur|POS=PRON|Person=3|PronType=Prs, Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs, POS=PART, Case=Nom|Number=Plur|POS=PRON|Person=2|PronType=Prs, Case=Nom|Number=Plur|POS=PRON|Person=1|PronType=Prs, POS=PRON|PronType=Emp, Case=Acc|Number=Plur|POS=PRON|Person=1|PronType=Prs, Case=Nom|Number=Sing|POS=PRON|Person=1|PronType=Prs, Case=Gen|Number=Plur|POS=PRON|Person=2|PronType=Prs, POS=PRON|PronType=Ind, POS=NOUN|Typo=Yes, Case=Acc|Number=Sing|POS=PRON|Person=2|PronType=Prs, POS=PROPN, Case=Acc|Number=Sing|POS=PRON|Person=1|PronType=Prs, POS=PRON|PronType=Int, Case=Gen|Number=Sing|POS=PRON|Person=2|PronType=Prs, POS=X, Case=Nom|Number=Sing|POS=PRON|Person=2|PronType=Prs, POS=ADP|Typo=Yes, POS=PRON|PronType=Dem, Case=Acc|Number=Plur|POS=PRON|Person=2|PronType=Prs, POS=PROPN|Typo=Yes, POS=AUX|Typo=Yes, POS=ADJ|Typo=Yes, Case=Gen|Number=Sing|POS=PRON|Person=2|PronType=Prs|Typo=Yes, POS=CCONJ|Typo=Yes, POS=ADV|Typo=Yes, POS=PRON|PronType=Rel|Typo=Yes, POS=SCONJ|Typo=Yes, POS=VERB|Typo=Yes, POS=CCONJ|PronType=Ind, POS=PRON|PronType=Int|Typo=Yes, POS=PART|Typo=Yes, Case=Nom|Number=Plur|POS=AUX|Person=1|PronType=Prs, Case=Nom|Number=Sing|POS=ADJ|Person=3|PronType=Prs, NumType=Card|POS=ADP, Case=Nom|Number=Sing|POS=NOUN|Person=1|PronType=Prs, Case=Acc|Number=Sing|POS=ADP|Person=1|PronType=Prs, Case=Acc|Number=Sing|POS=NOUN|Person=1|PronType=Prs, Case=Nom|Number=Plur|POS=PROPN|Person=1|PronType=Prs, Case=Nom|Number=Sing|POS=AUX|Person=1|PronType=Prs, POS=SYM, 
Case=Acc|Number=Sing|POS=NUM|Person=1|PronType=Prs, Case=Acc|Number=Sing|POS=ADJ|Person=1|PronType=Prs, Case=Acc|Number=Sing|POS=CCONJ|Person=1|PronType=Prs
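
The tagger labels listed above appear to join a coarse part-of-speech tag and an optional set of morphological features with a double underscore (for example `PRON__Case=Nom|Number=Sing|Person=3|PronType=Prs`). The sketch below splits such a label into its two parts. It assumes every label follows that pattern; `split_tag_label` is a hypothetical helper written for illustration, not part of spaCy or of this model.

```python
from typing import Dict, Tuple


def split_tag_label(label: str) -> Tuple[str, Dict[str, str]]:
    """Split a label such as 'ADJ__NumType=Ord' into ('ADJ', {'NumType': 'Ord'})."""
    # Everything before the first '__' is treated as the part-of-speech tag.
    pos, sep, feats = label.partition("__")
    features: Dict[str, str] = {}
    if sep and feats:
        # Features are assumed to be '|'-separated Key=Value pairs.
        for pair in feats.split("|"):
            key, _, value = pair.partition("=")
            features[key] = value
    return pos, features


print(split_tag_label("PRON__Case=Nom|Number=Sing|Person=3|PronType=Prs"))
print(split_tag_label("PUNCT"))
```

Labels without a `__` separator, such as `PUNCT`, come back with an empty feature dictionary.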

METRICS

These metrics were obtained using the `spacy evaluate` CLI.

| Type | Score |
| --- | --- |
| TAG_ACC | 88.51 |
| POS_ACC | 89.84 |
| TAG_MICRO_P | 0.00 |
| TAG_MICRO_R | 0.00 |
| TAG_MICRO_F | 0.00 |
| DEP_UAS | 70.61 |
| DEP_LAS | 59.17 |
| SENTS_P | 82.86 |
| SENTS_R | 91.58 |
| SENTS_F | 87.00 |
| MORPH_ACC | 96.46 |
| TOK2VEC_LOSS | 94585.70 |
| TAGGER_LOSS | 5570.00 |
| PARSER_LOSS | 63924.84 |
| MORPHOLOGIZER_LOSS | 5570.00 |
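
As a quick sanity check on the sentence segmentation scores, the F-score can be recomputed from the reported precision and recall. The sketch below assumes SENTS_F is the usual harmonic mean of SENTS_P and SENTS_R; `f_score` is a hypothetical helper, not a spaCy function.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the SENTS_P and SENTS_R rows of the metrics above.
sents_f = f_score(82.86, 91.58)
print(f"{sents_f:.2f}")  # prints 87.00, matching the reported SENTS_F
```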

FURTHER READINGS:

NOTE:

  • This model was trained on the dataset referenced above, which consists of 318 sentences, together with the FastText embeddings.
  • https://yordepan.streamlit.app/
  • Future development will include a lemmatizer.

USAGE:

  • Install the model into your virtual environment with `pip install https://huggingface.co/Kola9INE/yordep/resolve/main/yo_yordep-0.1.0.tar.gz`.
  • Import spaCy with `import spacy`.
  • Load the model with `yor_nlp = spacy.load("yo_yordep")`.
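
The steps above can be combined into a short script. This is a minimal sketch, assuming the package has already been installed with the `pip` command shown; `token_rows` and `analyze` are hypothetical helper names chosen for illustration.

```python
from typing import Iterable, List, Tuple


def token_rows(doc: Iterable) -> List[Tuple[str, str, str, str, str]]:
    """Collect (text, POS, fine-grained tag, dependency label, morphology) per token."""
    return [(t.text, t.pos_, t.tag_, t.dep_, str(t.morph)) for t in doc]


def analyze(text: str, model_name: str = "yo_yordep"):
    """Load the installed pipeline and return the annotations for `text`."""
    # Imported here so token_rows can be used without spaCy installed.
    import spacy

    yor_nlp = spacy.load(model_name)
    return token_rows(yor_nlp(text))
```

Calling `analyze(...)` on a Yoruba sentence returns one tuple per token. The lemma is left out on purpose, since the pipeline ships without a lemmatizer.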

CAVEAT!!!

  • This model contains only the pipeline components listed under Components and Default Pipeline above. It has no trained lemmatizer component and therefore cannot lemmatize!
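
The caveat can be checked in code before relying on a component. The sketch below compares a list of component names against the ones a script needs; `missing_components` is a hypothetical helper, and the `shipped` list is copied from the Components row above. With the model loaded, spaCy's `nlp.pipe_names` provides the same list.

```python
from typing import List, Sequence


def missing_components(pipe_names: Sequence[str], required: Sequence[str]) -> List[str]:
    """Return the required component names that are absent from the pipeline."""
    return [name for name in required if name not in pipe_names]


# Component names as listed for yo_yordep 0.1.0.
shipped = ["tok2vec", "tagger", "parser", "morphologizer"]
print(missing_components(shipped, ["parser", "lemmatizer"]))  # prints ['lemmatizer']
```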