---
library_name: transformers
tags:
- contrastive-learning
- Spanish-UMLS
- Hierarchical-enrichment
- entity-linking
- biomedical
- spanish
license: mit
language:
- es
base_model:
- PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
---

# HERBERT: Leveraging UMLS Hierarchical Knowledge to Enhance Clinical Entity Normalization in Spanish

**HERBERT-P** is a contrastive-learning bi-encoder for medical entity normalization in Spanish. It leverages synonym and parent relationships from UMLS to enhance candidate retrieval for entity linking in clinical texts.

**Key features:**

- Base model: [PlanTL-GOB-ES/roberta-base-biomedical-clinical-es](https://huggingface.co/PlanTL-GOB-ES/roberta-base-biomedical-clinical-es)
- Trained with 30 positive pairs per anchor (synonyms + parents)
- Task: normalization of disease, procedure, and symptom mentions to SNOMED-CT/UMLS codes
- Domain: Spanish biomedical/clinical texts
- Corpora: DisTEMIST, MedProcNER, SympTEMIST

---

## Benchmark Results

| Corpus     | Top-1 | Top-5 | Top-25 | Top-200 |
|------------|-------|-------|--------|---------|
| DisTEMIST  | 0.588 | 0.723 | 0.803  | 0.867   |
| SympTEMIST | 0.635 | 0.784 | 0.882  | 0.946   |
| MedProcNER | 0.651 | 0.765 | 0.838  | 0.892   |
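
---

## Usage (sketch)

The sketch below shows how a bi-encoder like this one is typically used for candidate retrieval: encode a clinical mention and a set of SNOMED-CT/UMLS term strings, then rank candidates by cosine similarity. The repository id `MODEL_ID` is a placeholder, and mean pooling over token embeddings is an assumption; the released model may use a different pooling strategy.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "path/to/HERBERT-P"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def embed(texts):
    """Encode mention or terminology strings into unit-normalized vectors.
    Masked mean pooling is an assumption; [CLS] pooling is also common."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # masked mean pooling
    return torch.nn.functional.normalize(pooled, dim=-1)   # cosine-ready

# Toy candidate retrieval for a Spanish clinical mention.
mention = ["insuficiencia renal aguda"]
candidates = ["fallo renal agudo", "nefropatía diabética", "insuficiencia cardíaca"]

sims = embed(mention) @ embed(candidates).T                # cosine similarities
for idx in sims.squeeze(0).argsort(descending=True):
    print(candidates[idx], float(sims[0, idx]))
```

In practice, candidate terms from the target terminology are embedded once and indexed (e.g. with a nearest-neighbor library), so that only the mention needs to be encoded at query time.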