bakrianoo/sinai-voice-ar-stt

@@ -41,6 +41,8 @@ Please install:
 We evaluated the model against different Arabic-STT Wav2Vec models.
 |    | Model                                 | [using transliteration](https://pypi.org/project/lang-trans/)   |      WER |      Training Datasets |
 |---:|:--------------------------------------|:---------------------|---------:|---------:|
 |  1 | bakrianoo/sinai-voice-ar-stt          | True                 | 0.238001 |Common Voice 6|
@@ -80,8 +82,8 @@ resamplers = {  # all three sampling rates exist in test split
 transformation = jiwer.Compose([
     # normalize some diacritics, remove punctuation, and replace Persian letters with Arabic ones
     jiwer.SubstituteRegexes({
-        r'[auiFNKo\\\\\\\\\\\\\\\\~_،؟»\\\\\\\\\\\\\\\\?;:\\\\\\\\\\\\\\\\-,\\\\\\\\\\\\\\\\.؛«!"]': "", "\\\\\\\\\\\\\\\\u06D6": "",
-        r"[\\\\\\\\\\\\\\\\|\\\\\\\\\\\\\\\\{]": "A", "p": "h", "ک": "k", "ی": "y"}),
     # default transformation below
     jiwer.RemoveMultipleSpaces(),
     jiwer.Strip(),
@@ -274,8 +276,8 @@ test_split = test_split.map(predict, batched=True, batch_size=16, remove_columns
 transformation = jiwer.Compose([
     # normalize some diacritics, remove punctuation, and replace Persian letters with Arabic ones
     jiwer.SubstituteRegexes({
-        r'[auiFNKo\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\~_،؟»\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\?;:\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-,\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\.؛«!"]': "", "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\u06D6": "",
-        r"[\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\|\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\{]": "A", "p": "h", "ک": "k", "ی": "y"}),
     # default transformation below
     jiwer.RemoveMultipleSpaces(),
     jiwer.Strip(),
@@ -293,6 +295,8 @@ print(f"WER: {metrics['wer']:.2%}")
 ```
 **Test Result**: 23.80%
 ## Other Arabic Voice recognition Models

 We evaluated the model against different Arabic-STT Wav2Vec models.
+[**WER**: Word Error Rate] The lower the score, the better the model.
 |    | Model                                 | [using transliteration](https://pypi.org/project/lang-trans/)   |      WER |      Training Datasets |
 |---:|:--------------------------------------|:---------------------|---------:|---------:|
 |  1 | bakrianoo/sinai-voice-ar-stt          | True                 | 0.238001 |Common Voice 6|
 transformation = jiwer.Compose([
     # normalize some diacritics, remove punctuation, and replace Persian letters with Arabic ones
     jiwer.SubstituteRegexes({
+        r'[auiFNKo\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\~_،؟»\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\?;:\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-,\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\.؛«!"]': "", "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\u06D6": "",
+        r"[\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\|\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\{]": "A", "p": "h", "ک": "k", "ی": "y"}),
     # default transformation below
     jiwer.RemoveMultipleSpaces(),
     jiwer.Strip(),
 transformation = jiwer.Compose([
     # normalize some diacritics, remove punctuation, and replace Persian letters with Arabic ones
     jiwer.SubstituteRegexes({
+        r'[auiFNKo\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\~_،؟»\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\?;:\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-,\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\.؛«!"]': "", "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\u06D6": "",
+        r"[\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\|\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\{]": "A", "p": "h", "ک": "k", "ی": "y"}),
     # default transformation below
     jiwer.RemoveMultipleSpaces(),
     jiwer.Strip(),
 ```
 **Test Result**: 23.80%
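
The substitution patterns in the hunks above carry long runs of backslashes, which makes the intended character classes hard to read; with an even run of backslashes in front of `-,`, Python's `re` may also read the class as a range that starts at a literal backslash. The following is a minimal sketch of what the normalization appears to be meant to do, assuming single escapes and Buckwalter-style transliterated text. It uses only the standard `re` module rather than `jiwer`, and the names `SUBSTITUTIONS` and `normalize` are hypothetical, not part of the model card.

```python
import re

# Assumed intent of the substitutions shown in the diff, written with single escapes.
# The input is assumed to be Buckwalter-style transliterated Arabic.
SUBSTITUTIONS = [
    # short vowels, tanween, sukun, shadda, tatweel, and punctuation -> removed
    (r'[auiFNKo~_،؟»?;:\-,.؛«!"]', ""),
    # U+06D6, a small high Quranic annotation mark -> removed
    ("\u06D6", ""),
    # alef with madda (|) and alef wasla ({) -> bare alef (A)
    (r"[|{]", "A"),
    # taa marbuta (p) -> haa (h)
    ("p", "h"),
    # Persian kaf and yeh that survived transliteration -> k and y
    ("ک", "k"),
    ("ی", "y"),
]


def normalize(text: str) -> str:
    """Apply each substitution in order, then collapse and trim whitespace."""
    for pattern, replacement in SUBSTITUTIONS:
        text = re.sub(pattern, replacement, text)
    # Roughly mirrors jiwer.RemoveMultipleSpaces() followed by jiwer.Strip()
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize("kitaAbN،  jadiydN!"))
```

Applying the patterns as an ordered list keeps the removal step ahead of the letter replacements, which matches the order in which the dictionary in the diff lists them.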
+[**WER**: Word Error Rate] The lower the score, the better the model.
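
For readers unfamiliar with the metric, WER is the word-level edit distance between the reference and the hypothesis (substitutions + deletions + insertions) divided by the number of reference words. The table reports it as a fraction (0.238001), while the evaluation script prints it as a percentage (23.80%). Below is a minimal standard-library sketch of that definition; the function name `wer` is illustrative, and the model card itself computes the score with `jiwer`.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) over four reference words
    print(f"WER: {wer('a b c d', 'a x c'):.2%}")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.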
 ## Other Arabic Voice recognition Models