Discussion: Coverage of `fatihburakkaragoz/ottoman-ner-latin`

#2
by fatihburakkaragoz - opened

Hi everyone – Fatih here. I wanted to share a quick summary of what I’m seeing with the current public checkpoint and outline the plan for improving it.

What’s working

  • Loading the model from the Hugging Face Hub works via both the CLI (ottoman-ner predict) and the Python API.
  • Person entities with Ottoman-era honorifics are picked up reliably. Examples like “Sultan Abdülhamid Han İstanbul’da yaşıyordu.” and “Sait Halim Paşa 1913 yılında Sadrazam olarak göreve başladı.” are tagged as expected (PER).
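For anyone inspecting raw predictions, spans like the PER examples above come out of standard BIO decoding over token labels. A minimal sketch of that decoding step — the token/label pairs below are illustrative, not actual model output:

```python
# Decode BIO-tagged tokens into entity spans.
# The tokens and labels below are illustrative, not real model output.

def bio_to_spans(tokens, labels):
    """Group (token, BIO-label) pairs into (entity_type, text) spans."""
    spans, current = [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if current:
                spans.append(current)
            current = (lab[2:], [tok])
        elif lab.startswith("I-") and current and current[0] == lab[2:]:
            current[1].append(tok)
        else:  # "O" (or a stray I- tag) closes any open span
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(etype, " ".join(toks)) for etype, toks in spans]

tokens = ["Sultan", "Abdülhamid", "Han", "İstanbul’da", "yaşıyordu", "."]
labels = ["B-PER", "I-PER", "I-PER", "O", "O", "O"]
print(bio_to_spans(tokens, labels))  # [('PER', 'Sultan Abdülhamid Han')]
```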

Where it falls short

  • Organization (ORG), location (LOC), and miscellaneous (MISC) labels are rarely emitted.
    For instance, when I run:

    ottoman-ner predict --model-path fatihburakkaragoz/ottoman-ner-latin \
      --text "İttihat ve Terakki Cemiyeti 23 Ocak 1913’te Bâb-ı Âli Baskını'nı düzenledi."
    

    I currently get:

    Text: İttihat ve Terakki Cemiyeti 23 Ocak 1913’te Bâb-ı Âli Baskını'nı düzenledi.
      No entities found
    

    So even canonical ORG/LOC/MISC examples are being left untagged.

Likely reason

In the original training corpus, person annotations were far more abundant than institutional names, locations, or dated events. The model has therefore learned to default to O for most tokens outside of PER.
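The imbalance is easy to verify on any CoNLL-style file by counting labels. A rough sketch — the inline sample is a hypothetical two-line stand-in for a real file such as a training split:

```python
from collections import Counter
from io import StringIO

# Hypothetical CoNLL-style sample: one "token<TAB>label" pair per line,
# blank lines between sentences. A real check would read the actual
# training file instead of this toy StringIO.
sample = StringIO(
    "Sultan\tB-PER\nAbdülhamid\tI-PER\nHan\tI-PER\n"
    "İstanbul’da\tO\nyaşıyordu\tO\n.\tO\n"
)

counts = Counter()
for line in sample:
    line = line.strip()
    if line:
        _, label = line.split("\t")
        # Collapse B-/I- prefixes so we count entity types, not tag variants.
        counts[label.split("-")[-1]] += 1

print(counts)  # PER and O each appear 3 times in this toy sample
```

Running this over the full corpus would make the PER-heavy skew (and the resulting bias toward O) visible at a glance.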

Impact for you

  • If you need ORG/LOC/MISC entities, you’ll see undercounting unless you add extra post-processing.
  • The absence of labels isn’t a CLI bug; it’s a coverage gap in the current checkpoint.
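Until the retrained checkpoint lands, one stopgap for the post-processing mentioned above is a small gazetteer pass over your texts. A rough sketch under the assumption that you maintain your own ORG/LOC lists — the two entries below are just examples, not a shipped resource:

```python
# Stopgap gazetteer pass: surface-match known ORG/LOC names that the
# current checkpoint misses. The gazetteer entries are illustrative.
GAZETTEER = {
    "İttihat ve Terakki Cemiyeti": "ORG",
    "Bâb-ı Âli": "LOC",
}

def gazetteer_entities(text):
    """Return (label, surface, start_offset) for each gazetteer hit."""
    hits = []
    for surface, label in GAZETTEER.items():
        start = text.find(surface)
        if start != -1:
            hits.append((label, surface, start))
    return sorted(hits, key=lambda h: h[2])  # order by position in text

text = "İttihat ve Terakki Cemiyeti 23 Ocak 1913’te Bâb-ı Âli Baskını'nı düzenledi."
for label, surface, start in gazetteer_entities(text):
    print(f"{label}: {surface} (offset {start})")
```

This is obviously no substitute for a retrained model (no disambiguation, exact-match only), but it can plug the worst ORG/LOC gaps in the meantime.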

My plan to address it

  1. Data: Expand the dataset with more labelled sentences featuring organizations, geographic references, historic events, and dates.

  2. Retraining: Retrain the model on the expanded data, aiming for balanced F1 across all four entity types.

  3. Benchmarks: I’ll publish evaluation numbers (precision/recall/F1 per label) using a held-out CoNLL-style test set. Example command:

    ottoman-ner eval \
      --model-path fatihburakkaragoz/ottoman-ner-latin \
      --test-file data/test.conll \
      --output-dir reports/current-model
    
  4. Release & docs: The improved weights will be pushed as a new version (tentatively v2.1.0). I’ll update the toolkit’s defaults, README, and release notes accordingly.
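For reference, the per-label numbers in step 3 boil down to span-level true/false positive counts. A simplified sketch using exact-match spans as (label, start, end) tuples — a real evaluation would use an established sequence-evaluation library rather than this hand-rolled version:

```python
def per_label_prf(gold, pred):
    """Exact-match span precision/recall/F1 per label.

    gold, pred: sets of (label, start, end) spans.
    """
    scores = {}
    for lab in {s[0] for s in gold | pred}:
        g = {s for s in gold if s[0] == lab}
        p = {s for s in pred if s[0] == lab}
        tp = len(g & p)
        prec = tp / len(p) if p else 0.0
        rec = tp / len(g) if g else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[lab] = (prec, rec, f1)
    return scores

# Toy illustration mirroring the current PER-only behaviour:
gold = {("PER", 0, 3), ("ORG", 5, 9), ("LOC", 12, 14)}
pred = {("PER", 0, 3)}
print(per_label_prf(gold, pred))
# PER scores 1.0 across the board; ORG and LOC recall is 0.0
```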

How you can help

  • If you have labelled examples (especially ORG/LOC/MISC), please share them—either via PR or an issue with raw sentences + annotations.
  • Report misclassifications or provide evaluation snippets from your projects; that feedback helps prioritise the next training round.

I’ll keep this thread updated as work progresses and will drop the new metrics + download link once the improved model is live. Thanks for bearing with the current limitations and for any help you can provide!
