---
license: apache-2.0
datasets:
- eriktks/conll2003
language:
- en
base_model:
- stefan-it/ModernBERT-large-tokenizer-fix
tags:
- ner
---

# ✨ ModernBERT Large for NER

This repository hosts a ModernBERT Large model that was fine-tuned on the CoNLL-2003 NER dataset with the awesome Flair library.

Please note the following caveats:

* ⚠️ To work around a tokenizer problem in ModernBERT, this model was fine-tuned on a [forked and modified](https://huggingface.co/stefan-it/ModernBERT-large-tokenizer-fix) ModernBERT Large model.
* ⚠️ For now, don't expect "uber" BERT-like performance; more experiments are needed. (Is RoPE causing this?)

## 📝 Implementation

The model was trained using my [ModernBERT experiments](https://github.com/stefan-it/modern-bert-ner) repo.

## 📊 Performance

A very basic hyper-parameter search was performed with five different seeds, reporting the averaged micro F1-score on the development set of CoNLL-2003:

| Configuration          | Subword Pooling | Run 1 | Run 2     | Run 3 | Run 4 | Run 5 | Avg.         |
|:-----------------------|:----------------|:------|:----------|:------|:------|:------|-------------:|
| `bs16-e10-cs0-lr2e-05` | `first`         | 96.13 | 96.44     | 96.20 | 95.93 | 96.65 | 96.27 ± 0.25 |
| `bs16-e10-cs0-lr2e-05` | `first_last`    | 96.36 | **96.58** | 96.14 | 96.19 | 96.35 | 96.32 ± 0.15 |

The performance of the currently uploaded model is marked in bold.

## 📣 Usage

The following code can be used to test the model and recognize named entities in a given sentence:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the model
tagger = SequenceTagger.load("stefan-it/flair-modernbert-large-ner-conll03")

# Define an example sentence
sentence = Sentence("George Washington went to Washington very fast.")

# Now let's predict named entities...
tagger.predict(sentence)

# Print out the recognized named entities
print("The following named entities are found:")
for entity in sentence.get_spans('ner'):
    print(entity)
```

This outputs:

```text
Span[0:2]: "George Washington" → PER (1.0000)
Span[4:5]: "Washington" → LOC (1.0000)
```
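As a side note on the performance table above: the reported `Avg.` values match the population mean and population standard deviation over the five run scores, which can be reproduced with a small snippet (the scores are copied from the `first_last` row):

```python
from statistics import mean, pstdev

# Dev-set micro F1 scores of the five runs (first_last pooling, from the table above)
runs = [96.36, 96.58, 96.14, 96.19, 96.35]

avg = mean(runs)
std = pstdev(runs)  # population standard deviation

print(f"{avg:.2f} \u00b1 {std:.2f}")  # 96.32 ± 0.15
```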
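The configuration string in the performance table encodes the hyper-parameters of a run. My reading of the naming scheme (an assumption, not documented in this card) is `bs` = batch size, `e` = epochs, `cs` = context size, and `lr` = learning rate. A tiny parser for that assumed scheme:

```python
import re

def parse_config(name: str) -> dict:
    # Assumed naming scheme: bs<batch size>-e<epochs>-cs<context size>-lr<learning rate>
    m = re.fullmatch(r"bs(\d+)-e(\d+)-cs(\d+)-lr([\d.e-]+)", name)
    if m is None:
        raise ValueError(f"unrecognized configuration name: {name}")
    return {
        "batch_size": int(m.group(1)),
        "epochs": int(m.group(2)),
        "context_size": int(m.group(3)),
        "learning_rate": float(m.group(4)),
    }

print(parse_config("bs16-e10-cs0-lr2e-05"))
# {'batch_size': 16, 'epochs': 10, 'context_size': 0, 'learning_rate': 2e-05}
```

Under this reading, the uploaded model was trained with batch size 16 for 10 epochs at a learning rate of 2e-05.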