federicocosta1989 committed on
Commit 8e28083 · verified · 1 Parent(s): 69560b6

Update README.md

Files changed (1)
  1. README.md +19 -8
README.md CHANGED
@@ -40,13 +40,13 @@ base_model:
40
 
41
  This is a HuBERT Base model pre-trained using 1,778 hours of Catalan speech data.
42
  The model architecture is the same as the [original HuBERT Base model](https://huggingface.co/facebook/hubert-base-ls960), which contains 12 transformer layers.
43
- Pre-training was done by [Barcelona Supercomputing Center](https://bsc.es/)
44
 
45
  # 2-Intended Uses and Limitations
46
 
47
- This pre-trained model generates rich Speech Representations that can be used for any Catalan speech-related task.
48
  This model does not have a tokenizer as it was pretrained on audio alone.
49
- In order to use this model for Speech Recognition, a tokenizer should be created and the model should be fine-tuned on labeled text data.
50
  Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for a more detailed explanation of how to fine-tune the model for Speech Recognition.
51
  For an explanation of how to fine-tune the model for Audio Classification, check out [this tutorial](https://huggingface.co/docs/transformers/main/en/tasks/audio_classification).
52
 
@@ -64,13 +64,13 @@ For pre-training, a 1,778 hours dataset was created using subsets from training
64
 
65
  # 4-Indirect evaluation results
66
 
67
- To assess the pre-trained Catalan Speech Representations' quality, we evaluated them using two indirect tasks: Catalan Speech Recognition and Catalan Accent Classification.
68
 
69
- ## 4.1 - Catalan Speech Recognition
70
 
71
  TO BE COMPLETED
72
 
73
- ## 4.2 - Catalan Accent Classification
74
 
75
  TO BE COMPLETED
76
 
@@ -78,6 +78,7 @@ COMPLETAR
78
  # 5-How to use the model
79
 
80
  ## 5.1-Speech Representations
 
81
  To obtain Speech Representations (HuBERT outputs) from audio in Catalan using this model, you can follow this example:
82
 
83
  ```python
@@ -130,8 +131,9 @@ def map_to_speech_representations(batch):
130
  speech_representations = dataset.map(map_to_speech_representations)
131
  ```
132
 
133
- ## 5.1-Discrete Speech Representations
134
- To obtain Discrete Speech Representations (HuBERT's k-means outputs) from audio in Catalan using this model, you can follow this example:
 
135
 
136
  ```python
137
 
@@ -197,6 +199,15 @@ discrete_units = dataset.map(map_to_discrete_units)
197
 
198
  ```
199
 
200
  # 6-Citation
201
 
202
  #TODO fix this
 
40
 
41
  This is a HuBERT Base model pre-trained using 1,778 hours of Catalan speech data.
42
  The model architecture is the same as the [original HuBERT Base model](https://huggingface.co/facebook/hubert-base-ls960), which contains 12 transformer layers.
43
+ Pre-training was done by [Barcelona Supercomputing Center](https://bsc.es/).
44
 
45
  # 2-Intended Uses and Limitations
46
 
47
+ This pre-trained model generates Speech Representations that can be used for any Catalan speech-related task.
48
  This model does not have a tokenizer as it was pretrained on audio alone.
49
+ To use this model for Automatic Speech Recognition, a tokenizer should be created and the model should be fine-tuned on labeled speech data (audio paired with transcriptions).
50
  Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for a more detailed explanation of how to fine-tune the model for Speech Recognition.
51
  For an explanation of how to fine-tune the model for Audio Classification, check out [this tutorial](https://huggingface.co/docs/transformers/main/en/tasks/audio_classification).
52
 
 
64
 
65
  # 4-Indirect evaluation results
66
 
67
+ To assess the pre-trained Catalan Speech Representations' quality, we evaluated them using two indirect tasks: Catalan Automatic Speech Recognition (ASR) and Catalan Accent Classification.
68
 
69
+ ## 4.1-Catalan Automatic Speech Recognition
70
 
71
  TO BE COMPLETED
72
 
73
+ ## 4.2-Catalan Accent Classification
74
 
75
  TO BE COMPLETED
76
 
 
78
  # 5-How to use the model
79
 
80
  ## 5.1-Speech Representations
81
+
82
  To obtain Speech Representations (HuBERT outputs) from audio in Catalan using this model, you can follow this example:
83
 
84
  ```python
 
131
  speech_representations = dataset.map(map_to_speech_representations)
132
  ```
133
 
134
+ ## 5.2-Discrete Speech Representations
135
+
136
+ To obtain Discrete Speech Representations (HuBERT's k-means centroids) from audio in Catalan using this model, you can follow this example:
137
 
138
  ```python
139
 
 
199
 
200
  ```
201
 
202
+ ## 5.3-Automatic Speech Recognition
203
+
204
+ To use this model for Speech Recognition, a tokenizer should be created and the model should be fine-tuned on labeled speech data (audio paired with transcriptions).
205
+ Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for a more detailed explanation of how to fine-tune the model for Speech Recognition.
206
+
207
+ ## 5.4-Audio Classification
208
+
209
+ For an explanation of how to fine-tune the model for Audio Classification, check out [this tutorial](https://huggingface.co/docs/transformers/main/en/tasks/audio_classification).
210
+
211
  # 6-Citation
212
 
213
  #TODO fix this
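
Section 5.3 of the updated README says a tokenizer has to be created before fine-tuning for Speech Recognition, but gives no example. The following is a minimal sketch of that step, not the authors' procedure: it builds a character-level vocabulary from transcripts, using the `|` word delimiter and the `[UNK]`/`[PAD]` tokens from the linked blog. The function name `build_ctc_vocab`, the example transcripts, and the `vocab.json` filename are assumptions made for illustration.

```python
import json
from collections import Counter


def build_ctc_vocab(transcripts):
    """Build a character-level vocabulary for CTC fine-tuning (hypothetical helper)."""
    # Collect every character that occurs in the lower-cased transcripts.
    counts = Counter(ch for text in transcripts for ch in text.lower())
    # Sort the characters so the id assignment is reproducible.
    vocab = {ch: idx for idx, ch in enumerate(sorted(counts))}
    # Replace the space with a visible word delimiter, as the linked blog does.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")
    # Append the special tokens for unknown characters and padding.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Hypothetical transcripts; in practice these come from the labeled dataset.
example_transcripts = ["bon dia", "com estàs"]
vocab = build_ctc_vocab(example_transcripts)

# Save the vocabulary so a tokenizer can be created from it later.
with open("vocab.json", "w", encoding="utf-8") as f:
    json.dump(vocab, f, ensure_ascii=False)
```

Following the linked blog, the resulting `vocab.json` could then be passed to `Wav2Vec2CTCTokenizer` with `unk_token="[UNK]"`, `pad_token="[PAD]"`, and `word_delimiter_token="|"`. Whether lower-casing and this character set suit a given Catalan corpus depends on its transcription conventions.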