Ladino Text-to-Speech Model

Model Description

This is a Glow-TTS model for Ladino (Judeo-Spanish) text-to-speech synthesis, fine-tuned from an English pre-trained model. It was trained on a 3.5-hour single-speaker corpus recorded by a native speaker from Istanbul.

Model Details

Model Type: Glow-TTS (Generative Flow for Text-to-Speech)
Language: Ladino (Judeo-Spanish) [lad]
Speaker: Single speaker (native Ladino speaker from Istanbul)
Training Data: 3.5 hours of speech (1,987 segments)
Sampling Rate: 16 kHz
Fine-tuned from: English ljspeech/glow-tts model
Vocoder: Griffin-Lim algorithm

Training Configuration

Iterations: 5,000
Batch Size: 32
Optimizer: Adam with Noam learning rate schedule
Standard Deviation: 1
Mel-spectrogram: Following Prenger et al. (2019) settings
Phonemes: Using English phonemizer from pre-trained model
Training Time: ~4 days on an 8 GB NVIDIA GPU
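
For readers unfamiliar with the Noam learning rate schedule listed above, the following is a minimal sketch of one common parameterisation: the rate grows linearly during warm-up and then decays with the inverse square root of the step. The function name `noam_lr` and the values of `base_lr` and `warmup_steps` are hypothetical placeholders for illustration; they are not taken from this model's configuration file.

```python
def noam_lr(step: int, base_lr: float = 1e-3, warmup_steps: int = 4000) -> float:
    """Noam schedule: linear warm-up followed by inverse-square-root decay.

    base_lr and warmup_steps are illustrative defaults (assumptions),
    not the values used to train this model.
    """
    step = max(step, 1)  # guard against step 0
    scale = min(step * warmup_steps ** -1.5, step ** -0.5)
    return base_lr * warmup_steps ** 0.5 * scale


# Print the learning rate at a few points along the schedule.
for step in (1, 1000, 4000, 16000):
    print(step, noam_lr(step))
```

With this parameterisation the rate peaks at `base_lr` when `step == warmup_steps` and halves each time the step count quadruples afterwards.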

Performance

Human evaluation (Mean Opinion Score) from 12 native Ladino speakers:

Intelligibility: 4.04/5.0 (Good)
Naturalness: 3.61/5.0 (Between Fair and Good)
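
A Mean Opinion Score is the arithmetic mean of listener ratings on a 1 to 5 scale. The sketch below shows that computation; the helper name `mean_opinion_score` and the ratings in `example_ratings` are placeholders for illustration only and are not data from the evaluation reported here.

```python
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (mean, sample standard deviation) of 1-5 listener ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    spread = stdev(ratings) if len(ratings) > 1 else 0.0
    return mean(ratings), spread


# Placeholder ratings for a single utterance (NOT data from this study).
example_ratings = [4, 5, 4, 3, 4]
mos, spread = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} (sd = {spread:.2f})")
```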

This model achieved the best performance among three approaches tested:

✅ Fine-tuning from English (this model) - Best intelligibility and naturalness
Fine-tuning from Spanish - Better naturalness for Ladino phonemes but lower intelligibility
Training from scratch - Excellent naturalness but very poor intelligibility

Usage

These usage instructions were automatically generated by an LLM and have not been tested. Consult the Coqui TTS documentation for authoritative usage.

Requirements

pip install coqui-tts

Basic Usage

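from TTS.api import TTS

# Load the checkpoint and config downloaded from this repository (local file paths)
tts = TTS(
    model_path="eng_lad_v2_checkpoint_770000.pth",
    config_path="eng_lad_v2_config.json",
)

# Generate speech and write it to a WAV file
text = "Shalom! Komo estas?"
tts.tts_to_file(text=text, file_path="output.wav")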

Using the Synthesizer Class Directly

from TTS.utils.synthesizer import Synthesizer

# Load the model
synthesizer = Synthesizer(
    tts_checkpoint="eng_lad_v2_checkpoint_770000.pth",
    tts_config_path="eng_lad_v2_config.json",
    use_cuda=False
)

# Synthesize
wav = synthesizer.tts("Me plaze meldar livros en Ladino.")
synthesizer.save_wav(wav, "output.wav")
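
After synthesis, it can be useful to confirm that the output file has the expected sampling rate. The following is a minimal sketch using only Python's standard `wave` module. It assumes the file is written as PCM WAV and that the output rate matches the 16 kHz listed under Model Details; `wav_info` and `check_output` are hypothetical helper names, not part of Coqui TTS.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        duration = wav_file.getnframes() / sample_rate
    return sample_rate, duration


def check_output(path, expected_rate=16000):
    """Return the duration in seconds, or raise if the sample rate is unexpected."""
    sample_rate, duration = wav_info(path)
    if sample_rate != expected_rate:
        raise ValueError(f"expected {expected_rate} Hz, got {sample_rate} Hz")
    return duration
```

Calling `check_output("output.wav")` after either snippet above would return the clip length in seconds, or raise a `ValueError` if the file was written at a different rate.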

Files in this Repository

eng_lad_v2_checkpoint_770000.pth - Model weights at 770,000 iterations
eng_lad_v2_config.json - Model configuration file

Background

Ladino (also called Judeo-Spanish or Judezmo, ISO 639-3: lad) is the historical language of Sephardic Jews, descended from the Old Castilian Spanish of the 15th century. After the expulsion of Jews from Spain in 1492, many Sephardic Jews were welcomed into the Ottoman Empire, where they retained and developed the language for over 530 years.

Today, Ladino is classified as a severely endangered language by UNESCO. This TTS model is part of efforts to preserve and revitalize the language through modern technology.

Training Data

The model was trained on the Ladino TTS Training Dataset, which contains:

1,987 audio segments
3.5 hours of speech
Single speaker (native Ladino speaker from Istanbul)
Content: 30 articles from El Amaneser newspaper
Topics: Historical issues, current affairs, cultural events, politics, community news
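
The figures above imply a fairly short average segment. As a rough worked calculation (the 3.5-hour total is rounded, so the results are approximate, and the per-segment sample count assumes the audio is processed at the model's 16 kHz rate):

```python
TOTAL_HOURS = 3.5        # from the dataset description
NUM_SEGMENTS = 1987      # from the dataset description
SAMPLE_RATE_HZ = 16_000  # from the model details (assumed to apply to the audio)

total_seconds = TOTAL_HOURS * 3600
avg_segment_seconds = total_seconds / NUM_SEGMENTS
avg_segment_samples = avg_segment_seconds * SAMPLE_RATE_HZ

print(f"total audio: {total_seconds:.0f} s")
print(f"average segment: {avg_segment_seconds:.2f} s")
print(f"average samples per segment: {avg_segment_samples:.0f}")
```

This works out to roughly 6.3 seconds per segment, which is the scale of utterance the model has seen during training.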

Limitations

Single Speaker: The model can only generate speech in the voice of the training speaker
Griffin-Lim Vocoder: Using Griffin-Lim instead of a neural vocoder may result in some "metallic" sound quality, particularly in consonants
Limited Training Data: With only 3.5 hours of data, the model may struggle with:
- Out-of-vocabulary words
- Complex sentence structures not present in training data
- Certain phoneme combinations
Language Specificity: Trained specifically for Ladino; will not work well for other languages

Ethical Considerations

The speaker provided informed consent for the recordings to be used for research and language technology development
Non-Commercial Use Only: This dataset is released under the CC-BY-NC-4.0 license, which means:
- ✓ You may use it for research and educational purposes
- ✓ You may use it to develop non-commercial language preservation tools
- ✗ You may NOT use it for commercial products or services
- ✗ You may NOT use it to create voice impersonations or deepfakes
- ✗ You may NOT use it for any purpose that could harm the speaker or the Ladino community
This resource is intended to support Ladino language revitalization and documentation efforts
Any synthetic speech generated from models trained on this data should be clearly identified as synthetic
Users must respect the cultural significance of this endangered language and use the data responsibly

Citation

If you use this model, please cite:

Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish

Related Resources

Web Demo: translate.sefarad.com.tr
Training Dataset: collectivat/ladino-karen-TTS
Project Data Hub: data.sefarad.com.tr
GitHub Repository: judeo-espanyol-resources
Research Paper: ACL Anthology

Curation

This dataset was curated as part of the project "Judeo-Spanish: Connecting the Two Ends of the Mediterranean" carried out by Col·lectivaT and Sephardic Center of Istanbul within the framework of the "Grant Scheme for Common Cultural Heritage: Preservation and Dialogue between Turkey and the EU–II (CCH-II)" implemented by the Ministry of Culture and Tourism of the Republic of Turkey with the financial support of the European Union.

Contact

Sephardic Center of Istanbul: [email protected]

Acknowledgments

The content of this dataset is the sole responsibility of Col·lectivaT and does not necessarily reflect the views of the European Union.

License

This model is released under the Creative Commons Attribution-NonCommercial 4.0 International License (CC-BY-NC-4.0). You are free to share and adapt the material for non-commercial purposes only, with appropriate attribution.

This repository was developed as part of the project "Judeo-Spanish: Connecting the Two Ends of the Mediterranean" carried out by Col·lectivaT and Sephardic Center of Istanbul within the framework of the "Grant Scheme for Common Cultural Heritage: Preservation and Dialogue between Turkey and the EU–II (CCH-II)" implemented by the Ministry of Culture and Tourism of the Republic of Turkey with the financial support of the European Union. The content of this repository is the sole responsibility of Col·lectivaT and does not necessarily reflect the views of the European Union.
