---
title: WimBERT Synth v0
emoji: 🏛️
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Dutch multi-label classifier for signal messages
---

# WimBERT Synth v0: Dutch Multi-Label Signal Classifier

Demo of a dual-head BERT classifier trained on synthetic Dutch government signals. For each input message it predicts relevant topics (**onderwerp**, 64 labels) and sentiment/experience (**beleving**, 33 labels).

## 🚀 Usage

1. Enter Dutch text (e.g., a citizen feedback message about government services)
2. Click **Voorspel** (Predict) to classify
3. Adjust **Drempel** (threshold) to change prediction sensitivity
4. View results in three tabs:
   - **Samenvatting** (Summary): Top-K predictions per head with color-coded probabilities
   - **Alle labels** (All labels): Complete list of all labels sorted by probability
   - **JSON**: Raw predictions in machine-readable format

## 🎯 Features

- **Dual-head classification**: Simultaneously predicts topic (onderwerp) and experience (beleving)
- **Interactive threshold**: Adjust which labels are considered "predicted"
- **Color-coded visualization**: Probability intensity shown via color (darker = higher probability)
- **Accessible**: All probabilities are shown numerically; colors are only an enhancement
- **Fast**: Optimized for CPU inference (~2-5s) with optional GPU acceleration

## 🤖 Model

- **Base model**: `bert-base-multilingual-cased`
- **Architecture**: Dual classification heads with 64 onderwerp + 33 beleving labels
- **Training**: Synthetic data via Argilla + distillation pipeline
- **License**: Apache-2.0
- **Full model card**: [UWV/wimbert-synth-v0](https://huggingface.co/UWV/wimbert-synth-v0)

### Labels

**Onderwerp (64 topics)**: Advies, Algemene veiligheid, Begeleiding, Bijstand, Bouwoverlast, COVID-19, Criminaliteit, Documentaanvraag, Energiekosten, Evenementen, Financiële regelingen, Geluidsoverlast, Gemeentelijke heffingen, Hangjongeren, Huisdierenoverlast, Hulp aan dak- en thuislozen,
Infrastructuur, Kwijtschelding, Migratie, Onderhoud omgeving, Parkeren, Schade en claims, Verkeersmaatregelen, Verkeersveiligheid, Wijkteam, and more...

**Beleving (33 experiences)**: Afspraakmogelijkheden, Algemene ervaring, Behulpzaamheid, Bereikbaarheid, Bezwaar & bewijs, Communicatie, Deskundigheid, Duidelijkheid, Efficiëntie, Faciliteiten, Gebruiksgemak, Informatievoorziening, Integriteit, Kwaliteit klantenservice, Snelheid van afhandeling, Vriendelijkheid, Wachttijd, and more...

## 🔒 Privacy

- Input text is processed **in-memory only**
- No data is logged or stored beyond standard Gradio telemetry
- Model runs entirely within this Space (no external API calls)

## ⚙️ Hardware

- **CPU**: Works on free tier (~3-5s inference)
- **GPU (T4)**: Recommended for production (<1s inference)

Current Space is running on: **CPU** with FP32

## 🛠️ Local Development

```bash
# Clone and setup
git clone https://huggingface.co/spaces/UWV/wimbert-synth-v0
cd wimbert-synth-v0
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Run
python app.py
```

## 📊 Example Use Cases

- **Citizen feedback routing**: Automatically categorize incoming messages
- **Sentiment analysis**: Understand citizen experience with government services
- **Analytics**: Aggregate trends across topics and experiences
- **Triage**: Prioritize urgent or negative feedback

⚠️ **Note**: This is a research/demo tool. Not intended for automated decision-making.

---

**Built with**: Gradio • Transformers • PyTorch

**Developed by**: UWV

**License**: Apache-2.0
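
## 🧮 Sketch: Threshold vs. Top-K

The Usage section distinguishes labels that count as "predicted" (probability at or above **Drempel**) from the Top-K list shown per head. The following is a minimal sketch of how those two views could be derived from one head's raw scores. It is not taken from `app.py`: the function `summarize_head`, the per-label sigmoid activation, the default threshold of 0.5, and the example logits are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def summarize_head(logits, labels, threshold=0.5, top_k=3):
    """Hypothetical helper: turn one head's logits into two views.

    Returns (predicted, top):
      predicted: (label, prob) pairs with prob >= threshold, highest first
      top:       the top_k (label, prob) pairs regardless of threshold
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    # Assumption: multi-label setup, so each label gets an independent sigmoid.
    probs = [sigmoid(z) for z in logits]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    predicted = [(label, p) for label, p in ranked if p >= threshold]
    return predicted, ranked[:top_k]


# Made-up logits for four of the onderwerp labels (illustration only).
onderwerp_labels = ["Parkeren", "Geluidsoverlast", "Bijstand", "Migratie"]
onderwerp_logits = [2.0, 0.3, -1.0, -3.0]

predicted, top = summarize_head(
    onderwerp_logits, onderwerp_labels, threshold=0.5, top_k=2
)
strict, _ = summarize_head(
    onderwerp_logits, onderwerp_labels, threshold=0.6, top_k=2
)

print([label for label, _ in predicted])  # labels at or above 0.5
print([label for label, _ in strict])     # labels at or above 0.6
```

Raising the threshold shrinks the "predicted" set while the Top-K list stays the same, which is why the two are kept as separate views in this sketch.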