[Question] About fine-tuning method for specific languages (Full fine-tuning vs LoRA)
opened by hwanython
Hi! I have a question regarding the fine-tuning strategy used for Whisper models on specific languages.
What I observed
- For languages like Igbo and Arabic, there are separate fine-tuned Whisper checkpoints available depending on the model size.
- Based on the training parameters, it seems like these were done using full fine-tuning, not LoRA.
(This is just my assumption, not fully confirmed; one rough way to check is sketched after this list.)
- In many multilingual scenarios, parameter-efficient methods like LoRA are preferred over updating all of the weights.
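For context, this is the kind of check I have in mind: a PEFT/LoRA checkpoint is usually published as an `adapter_config.json` plus adapter weights, while a fully fine-tuned model ships the complete model weights. A minimal sketch (the repo id is just a placeholder, not one of the checkpoints I am asking about):

```python
from huggingface_hub import list_repo_files

# Placeholder repo id - substitute the fine-tuned checkpoint you are inspecting.
repo_id = "openai/whisper-small"

files = list_repo_files(repo_id)

# PEFT/LoRA adapters are normally shipped as adapter_config.json + adapter weights,
# whereas a full fine-tune ships the complete model weights (e.g. model.safetensors).
if any(f.endswith("adapter_config.json") for f in files):
    print("Repo looks like a PEFT/LoRA adapter checkpoint")
else:
    print("Repo looks like full model weights (full fine-tuning or merged LoRA)")
```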
Questions
- Does full fine-tuning offer a clear performance advantage over LoRA, especially for low-resource languages?
- In terms of hallucination or overfitting, could full fine-tuning cause the model to lose its general multilingual capability or become too specialized in a single language?
- Are there any known examples or references where LoRA was specifically used for single-language Whisper fine-tuning instead of full fine-tuning?
- If so, I would love to check them out.
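For concreteness, this is the kind of LoRA setup I have in mind for Whisper (a minimal sketch using the Hugging Face PEFT library; the rank, alpha, and target modules are illustrative assumptions, not values taken from any released checkpoint):

```python
from transformers import WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Base multilingual checkpoint; target-language training data would be prepared separately.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# LoRA hyperparameters here are purely illustrative.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in the Whisper encoder/decoder
    bias="none",
)

# Wrap the base model so that only the low-rank adapter weights are trainable.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically only a small fraction of the full parameter count
```

After training, the adapter can be kept separate or merged back into the base weights, which is part of why I am curious whether the single-language checkpoints were produced with full fine-tuning or with an approach like this.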