---
language:
- yue
tags:
- text-to-speech
- tts
- cantonese
- gpt-sovits
- audio
- speech-synthesis
license: mit
pipeline_tag: text-to-speech
spaces:
- laubonghaudoi/zoengjyutgaai_tts
datasets:
- CanCLID/zoengjyutgaai
base_model:
- lj1995/GPT-SoVITS
---

# 張悦楷 GPT-SoVITS

This model is [GPT-SoVITS v2ProPlus](https://github.com/RVC-Boss/GPT-SoVITS) fine-tuned on the entire 張悦楷 storytelling (講古) speech dataset, [CanCLID/zoengjyutgaai](https://huggingface.co/datasets/CanCLID/zoengjyutgaai), which totals 188.67 hours of audio. To hear the synthesis results, see the [laubonghaudoi/zoengjyutgaai_tts](https://huggingface.co/spaces/laubonghaudoi/zoengjyutgaai_tts) Space.

## Model files

The model uses the v2ProPlus version. For details, see [GPT‐SoVITS‐features (各版本特性)](https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90features-(%E5%90%84%E7%89%88%E6%9C%AC%E7%89%B9%E6%80%A7)).

### SoVITS

- **`sovits/e1_e50_s5950.pth`**
  - Epoch: 50
  - Steps: 5950

### GPT

- **`gpt/dpo1-e200.ckpt`**
  - Trained with DPO
  - Epoch: 200
  - top_3_acc_epoch: about 0.8038
  - total_loss_epoch: about 3214
- **`gpt/dpo1-e600.ckpt`**
  - Trained with DPO
  - Epoch: 600
  - top_3_acc_epoch: about 0.8619
  - total_loss_epoch: about 4671 (higher than the epoch-200 checkpoint, for reasons that are unclear)
- **`gpt/dpo1-e1000.ckpt`**
  - Trained with DPO
  - Epoch: 1000
  - top_3_acc_epoch: about 0.8975
  - total_loss_epoch: about 1774

## Usage

```python
from huggingface_hub import hf_hub_download

# Download GPT model
gpt_model = hf_hub_download(
    repo_id="laubonghaudoi/zoengjyutgaai_tts",
    filename="gpt/dpo1-e1000.ckpt",
)

# Download SoVITS model
sovits_model = hf_hub_download(
    repo_id="laubonghaudoi/zoengjyutgaai_tts",
    filename="sovits/e1_e50_s5950.pth",
)
```
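
If it is not obvious which of the three GPT checkpoints to download, the approximate metrics listed above can be compared in code. The following is only a minimal sketch: the `GptCheckpoint` class and the `pick_checkpoint` helper are hypothetical names introduced for illustration and are not part of GPT-SoVITS or `huggingface_hub`. The numbers are copied from the checkpoint list above. Ranking by logged training metrics is an assumption about what "best" means, and it may not match perceived audio quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GptCheckpoint:
    """Hypothetical record for one GPT checkpoint listed in this card."""
    filename: str
    epoch: int
    top_3_acc: float   # approximate top_3_acc_epoch from the card
    total_loss: float  # approximate total_loss_epoch from the card


# Values copied from the checkpoint list in this model card.
CHECKPOINTS = [
    GptCheckpoint("gpt/dpo1-e200.ckpt", 200, 0.8038, 3214.0),
    GptCheckpoint("gpt/dpo1-e600.ckpt", 600, 0.8619, 4671.0),
    GptCheckpoint("gpt/dpo1-e1000.ckpt", 1000, 0.8975, 1774.0),
]


def pick_checkpoint(checkpoints, metric="top_3_acc"):
    """Return the checkpoint with the highest accuracy or the lowest loss."""
    if metric == "top_3_acc":
        return max(checkpoints, key=lambda c: c.top_3_acc)
    if metric == "total_loss":
        return min(checkpoints, key=lambda c: c.total_loss)
    raise ValueError(f"unknown metric: {metric!r}")


best = pick_checkpoint(CHECKPOINTS)
print(best.filename)  # prints "gpt/dpo1-e1000.ckpt"
```

Both criteria select `gpt/dpo1-e1000.ckpt`, which is the file the download snippet in this card requests. The earlier checkpoints are still worth trying if the epoch-1000 output sounds overfitted on your text.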