---
license: mit
datasets:
- newmindai/RAGTruth-TR
language:
- tr
- en
metrics:
- precision
- recall
- f1
- roc_auc
base_model:
- EuroBERT/EuroBERT-210m
pipeline_tag: token-classification
---

# lettucedect-210m-eurobert-tr-v1

## Model Description

**lettucedect-210m-eurobert-tr-v1** is a multilingual hallucination detection model based on the EuroBERT architecture and fine-tuned for Turkish hallucination detection. It is part of the Turk-LettuceDetect suite and shows strong cross-lingual generalization when detecting hallucinations in Turkish Retrieval-Augmented Generation (RAG) applications.

## Model Details

- **Model Type:** Token-level binary classifier for hallucination detection
- **Base Architecture:** EuroBERT-210m (`EuroBERT/EuroBERT-210m`)
- **Language:** Turkish (tr) with multilingual capabilities
- **Training Dataset:** Machine-translated RAGTruth dataset (17,790 training instances)
- **Context Length:** Up to 8,192 tokens
- **Model Size:** ~210M parameters

## Intended Use

### Primary Use Cases

- Hallucination detection in Turkish RAG systems
- Cross-lingual hallucination detection applications
- Data-to-text generation verification (strongest performance area)
- Multilingual NLP pipelines requiring Turkish support

### Supported Tasks

- Question Answering (QA) hallucination detection
- Data-to-text generation verification (**strongest performance**)
- Text summarization fact-checking

## Performance

### Overall Performance (F1-Score)

- **Whole Dataset:** 0.7777
- **Question Answering:** 0.7317
- **Data-to-text Generation:** 0.8030 (**best in suite**)
- **Summarization:** 0.6057

### Key Strengths

- **Best data-to-text generation performance in the suite**
- Robust multilingual transfer learning capabilities

## Training Details

### Training Data

- **Dataset:** Machine-translated RAGTruth benchmark
- **Size:** 17,790 training instances, 2,700 test instances
- **Tasks:** Question answering (MS MARCO), data-to-text (Yelp), summarization (CNN/Daily Mail)
- **Translation Model:** Google Gemma-3-27b-it

### Training Configuration

- **Epochs:** 6
- **Learning Rate:** 1e-5
- **Batch Size:** 4
- **Hardware:** NVIDIA A100 40GB GPU
- **Training Time:** ~2 hours
- **Optimization:** Cross-entropy loss with token masking

### Multilingual Foundation

- Built on the EuroBERT architecture, which supports multiple European languages
- Demonstrates effective multilingual transfer learning
- Does not require full in-language retraining, owing to its cross-lingual capabilities

## Technical Specifications

### Architecture Features

- **Base Model:** EuroBERT multilingual encoder
- **Maximum Sequence Length:** 8,192 tokens
- **Classification Head:** Binary token-level classifier
- **Multilingual Support:** European languages with strong Turkish adaptation
- **Parameter Count:** 210M parameters

### Input Format

```
Input: [CONTEXT] [QUESTION] [GENERATED_ANSWER]
Output: Token-level binary labels (0=supported, 1=hallucinated)
```

## Limitations and Biases

### Known Limitations

- Lower effectiveness on summarization (F1 0.6057) than on question answering and data-to-text generation
- Performance depends on the translation quality of the training data
- Optimized primarily for European language patterns

### Potential Biases

- Translation artifacts from the machine-translated training data
- Multilingual transfer bias favoring European linguistic patterns
- May perform differently on Turkish dialects or informal text

## Usage

### Installation

```bash
pip install lettucedetect
```

### Basic Usage

```python
from lettucedetect.models.inference import HallucinationDetector

# Initialize the hallucination detector.
# Note: this model_path names the ModernBERT variant of the suite; to use the
# EuroBERT model described here, pass its Hub repository ID instead.
detector = HallucinationDetector(
    method="transformer",
    model_path="newmindai/modernbert-tr-uncased-stsb-HD"
)

# Turkish context, question, and answer.
# Context: "Istanbul is Turkey's largest city. With a population of 15 million,
# it is the most populous city in Europe."
context = "İstanbul Türkiye'nin en büyük şehridir. Şehir 15 milyonluk nüfusla Avrupa'nın en kalabalık şehridir."
# Question: "What is Istanbul's population? Is it the most populous city in Europe?"
question = "İstanbul'un nüfusu nedir? İstanbul Avrupa'nın en kalabalık şehri midir?"
# Answer: "Istanbul's population is about 16 million and it is the most populous
# city in Europe." (the figure contradicts the context)
answer = "İstanbul'un nüfusu yaklaşık 16 milyondur ve Avrupa'nın en kalabalık şehridir."

# Get span-level predictions (start/end indices, confidence scores)
predictions = detector.predict(
    context=context,
    question=question,
    answer=answer,
    output_format="spans"
)

print("Detected hallucinations:", predictions)

# Example output (illustrative values; actual offsets and confidence will differ):
# [{'start': 34, 'end': 57, 'confidence': 0.92, 'text': 'yaklaşık 16 milyondur'}]
```

## Evaluation

### Benchmark Results

The model was evaluated on the machine-translated Turkish RAGTruth test set. The results indicate that multilingual transfer learning is effective for Turkish hallucination detection, with the strongest results on data-to-text generation.

**Example-level Results**

**Token-level Results**

## Citation

```bibtex
@inproceedings{turklettucedetect2025,
  title={Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications},
  author={NewMind AI Team},
  booktitle={9th International Artificial Intelligence and Data Processing Symposium (IDAP'25)},
  year={2025},
  address={Malatya, Turkey}
}
```

## Original LettuceDetect Framework

This model extends the LettuceDetect methodology:

```bibtex
@article{lettucedetect2025,
  title={LettuceDetect: A Hallucination Detection Framework for RAG Applications},
  author={Kovács, Ádám and Recski, Gábor},
  journal={arXiv preprint arXiv:2502.17125},
  year={2025}
}
```

## License

This model is released under the MIT license to support research and development in Turkish and multilingual NLP applications.

## Contact

For questions about this model or other Turkish hallucination detection models, please refer to the original paper or contact the authors.
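
## Appendix: From Token Labels to Spans

The Input Format section describes the model output as token-level binary labels (0 = supported, 1 = hallucinated), while the usage example requests `output_format="spans"`. The sketch below illustrates one way such labels could be merged into character spans. It is not the LettuceDetect implementation: `labels_to_spans` and `whitespace_offsets` are hypothetical helpers, whitespace splitting stands in for the real subword tokenizer, and the labels are hand-written rather than model predictions.

```python
import re
from typing import Dict, List, Tuple


def whitespace_offsets(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of whitespace-separated tokens.

    Assumption: a stand-in for the offset mapping a subword tokenizer provides.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def labels_to_spans(
    text: str,
    offsets: List[Tuple[int, int]],
    labels: List[int],
) -> List[Dict[str, object]]:
    """Merge runs of consecutive tokens labeled 1 into character spans."""
    if len(offsets) != len(labels):
        raise ValueError("offsets and labels must have the same length")

    spans: List[List[int]] = []
    current = None  # [start, end] of the span currently being built
    for (start, end), label in zip(offsets, labels):
        if label == 1:
            if current is None:
                current = [start, end]  # open a new span
            else:
                current[1] = end  # extend the open span
        elif current is not None:
            spans.append(current)  # a supported token closes the span
            current = None
    if current is not None:
        spans.append(current)  # close a span that runs to the end

    return [{"start": s, "end": e, "text": text[s:e]} for s, e in spans]


answer = "İstanbul'un nüfusu yaklaşık 16 milyondur ve Avrupa'nın en kalabalık şehridir."
offsets = whitespace_offsets(answer)

# Hand-written labels for illustration: the third to fifth tokens
# ("yaklaşık 16 milyondur") are marked as hallucinated.
labels = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]

spans = labels_to_spans(answer, offsets, labels)
print(spans)
```

Merging adjacent positive tokens keeps a multi-token claim such as a number with its unit in a single span. A real pipeline would also need to decide how to aggregate per-token probabilities into a span-level confidence, which this sketch leaves out.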