---
language:
  - nl
license: apache-2.0
base_model: distilbert/distilbert-base-multilingual-cased
tags:
  - token-classification
  - ner
  - pii
  - pii-detection
  - de-identification
  - privacy
  - healthcare
  - medical
  - clinical
  - phi
  - dutch
  - pytorch
  - transformers
  - openmed
pipeline_tag: token-classification
library_name: transformers
metrics:
  - f1
  - precision
  - recall
model-index:
  - name: OpenMed-PII-Dutch-mLiteClinical-135M-v1
    results:
      - task:
          type: token-classification
          name: Named Entity Recognition
        dataset:
          name: AI4Privacy (Dutch subset)
          type: ai4privacy/pii-masking-400k
          split: test
        metrics:
          - type: f1
            value: 0.8794
            name: F1 (micro)
          - type: precision
            value: 0.8775
            name: Precision
          - type: recall
            value: 0.8813
            name: Recall
widget:
  - text: >-
      Dr. Jan de Vries (BSN: 123456789) is bereikbaar via jan.devries@ziekenhuis.nl of +31 6 12345678. Hij woont op Keizersgracht 42, 1015 CS Amsterdam.
    example_title: Clinical Note with PII (Dutch)
---
OpenMed-PII-Dutch-mLiteClinical-135M-v1
Dutch PII Detection Model | 135M Parameters | Open Source
Model Description
OpenMed-PII-Dutch-mLiteClinical-135M-v1 is a transformer-based token classification model fine-tuned for Personally Identifiable Information (PII) detection in Dutch text. This model identifies and classifies 54 types of sensitive information including names, addresses, social security numbers, medical record numbers, and more.
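As a rough illustration of how a token classification model like this one is typically called, the sketch below wraps the Hugging Face `transformers` pipeline. The repository ID `OpenMed/OpenMed-PII-Dutch-mLiteClinical-135M-v1` is an assumption inferred from the model name, and the helper names `load_pii_pipeline` and `summarize` are hypothetical. Adjust them to match the actual Hub location and your own code.

```python
from typing import Any

# Assumption: this Hub repository ID is inferred from the model name and is
# not confirmed by this card.
MODEL_ID = "OpenMed/OpenMed-PII-Dutch-mLiteClinical-135M-v1"


def load_pii_pipeline(model_id: str = MODEL_ID) -> Any:
    """Build a token-classification pipeline (downloads weights on first call)."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        # Merge sub-word tokens into whole entity spans.
        aggregation_strategy="simple",
    )


def summarize(entities: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Reduce pipeline output to (label, matched text) pairs."""
    return [(entity["entity_group"], entity["word"]) for entity in entities]
```

With the model available, `summarize(load_pii_pipeline()(text))` would return one `(label, text)` pair per detected entity. The label names depend on the model's own label set, which this sketch does not assume.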
Key Features
Dutch-Optimized: Fine-tuned on Dutch text, starting from the multilingual distilbert-base-multilingual-cased checkpoint
High Accuracy: Achieves a micro-averaged F1 of 0.8794 (precision 0.8775, recall 0.8813) on the test split of the Dutch AI4Privacy subset
Comprehensive Coverage: Detects 54 entity types spanning personal, financial, medical, and contact information
Privacy-Focused: Designed for de-identification and compliance with GDPR and other privacy regulations
Production-Ready: Optimized for real-world text processing pipelines
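For the de-identification use case mentioned above, one common approach is to replace each detected span with a placeholder tag. The following is a minimal sketch, not the authors' method: it assumes entities arrive as dictionaries with `start`, `end`, and `entity_group` keys (the shape produced by an aggregated `transformers` token-classification pipeline) and that spans do not overlap. The function name `redact` and the labels in the usage note are hypothetical.

```python
from typing import Any


def redact(text: str, entities: list[dict[str, Any]]) -> str:
    """Replace each detected span with a bracketed label such as [EMAIL].

    Assumes spans do not overlap; character offsets refer to the original text.
    """
    redacted = text
    # Work from the last span to the first so earlier offsets stay valid
    # even though the replacements change the string length.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{entity['entity_group']}]"
        redacted = redacted[: entity["start"]] + placeholder + redacted[entity["end"] :]
    return redacted
```

For example, `redact("Jan belt 0612", [{"start": 0, "end": 3, "entity_group": "NAME"}, {"start": 9, "end": 13, "entity_group": "PHONE"}])` yields `"[NAME] belt [PHONE]"`. Placeholder tags keep the entity type visible, which tends to be more useful for downstream analysis than blanking the text entirely.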
Performance
Evaluated on the Dutch subset of the AI4Privacy dataset: