Update README.md
README.md CHANGED

@@ -8,10 +8,11 @@ library_name: transformers
 pipeline_tag: automatic-speech-recognition
 tags:
 - spanish
+- español
 - speech
 - recognition
 - whisper
--
+- distil-whisper
 ---
 
 # distil-whisper-large-v3-es

@@ -155,7 +156,7 @@ print(result["text"])
 ```
 ## Training
 
-The model was trained for 40,000 optimisation steps (or 0.98 epochs), on a single …
+The model was trained for 60,000 optimisation steps (or around 1.47 epochs), on a single RTX3090 for ~60 hours, using the following training parameters:
 ```
 --teacher_model_name_or_path "openai/whisper-large-v3"
 --train_dataset_name "mozilla-foundation/common_voice_16_1"

@@ -166,14 +167,14 @@ The model was trained for 40,000 optimisation steps (or 0.98 epochs), on a singl
 --eval_dataset_config_name "es"
 --eval_split_name "validation"
 --eval_text_column_name "sentence"
---eval_steps
---save_steps
+--eval_steps 10000
+--save_steps 10000
 --warmup_steps 500
 --learning_rate 1e-4
 --lr_scheduler_type "linear"
 --logging_steps 25
 --save_total_limit 1
---max_steps
+--max_steps 60000
 --wer_threshold 10
 --per_device_train_batch_size 8
 --per_device_eval_batch_size 8

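These flags follow the distil-whisper distillation recipe: the student is trained against `openai/whisper-large-v3` as a frozen teacher, and in that recipe `--wer_threshold 10` drops training examples whose pseudo-labels disagree with the reference transcript by more than 10% WER. As a rough sketch of the objective such a run optimises (the loss weights, temperature, and tensor shapes below are illustrative assumptions, not values taken from this run):

```python
# Illustrative sketch of a Whisper-style distillation loss: cross-entropy on the
# transcript labels plus a KL term pulling the student's output distribution
# towards the teacher's. Weights and temperature are assumptions for this sketch.
import torch
import torch.nn.functional as F

def distillation_loss(
    student_logits: torch.Tensor,  # (batch, seq_len, vocab)
    teacher_logits: torch.Tensor,  # (batch, seq_len, vocab)
    labels: torch.Tensor,          # (batch, seq_len), -100 marks padding
    temperature: float = 2.0,
    ce_weight: float = 0.8,
    kl_weight: float = 1.0,
) -> torch.Tensor:
    # Standard cross-entropy against the (pseudo-)labels.
    ce = F.cross_entropy(student_logits.transpose(1, 2), labels, ignore_index=-100)
    # KL divergence between temperature-softened teacher and student distributions;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    return ce_weight * ce + kl_weight * kl
```

As a sanity check on the card's own numbers: assuming no gradient accumulation, 60,000 steps at batch size 8 covers about 480,000 examples, and 480,000 / 1.47 epochs ≈ 326,500 training examples. The previous figures imply the same size (40,000 × 8 / 0.98 ≈ 326,500), so the two revisions are arithmetically consistent.
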
@@ -192,7 +193,7 @@ The model was trained for 40,000 optimisation steps (or 0.98 epochs), on a singl
 
 ## Results
 
-The distilled model performs with a 5.
+The distilled model performs with a 5.11% WER (10.15% orthographic WER).
 
 ## License
 
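On the Results figures: for Whisper-family models the headline WER is conventionally computed on normalised text (lower-cased, punctuation stripped), while the orthographic WER scores the raw transcription, which is why the second number is higher. A minimal sketch of that computation, assuming the `evaluate` library and the `BasicTextNormalizer` shipped with transformers (the sentence pair is made up):

```python
# Minimal sketch: normalised vs orthographic WER, as commonly reported for
# Whisper/distil-whisper models. The reference/prediction pair is hypothetical.
import evaluate
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

wer_metric = evaluate.load("wer")
normalizer = BasicTextNormalizer()  # language-agnostic despite the module name

references = ["Hola, ¿cómo estás?"]  # hypothetical ground-truth transcript
predictions = ["hola como estas"]    # hypothetical model output

# Orthographic WER: raw text, so casing and punctuation count as errors.
ortho_wer = 100 * wer_metric.compute(references=references, predictions=predictions)

# Normalised WER: both sides are normalised before scoring.
norm_wer = 100 * wer_metric.compute(
    references=[normalizer(r) for r in references],
    predictions=[normalizer(p) for p in predictions],
)

print(f"WER: {norm_wer:.2f}%  orthographic WER: {ortho_wer:.2f}%")
```

Run against the model's transcriptions of the Common Voice 16.1 Spanish validation split, this is the standard procedure behind figures like the ones reported above.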