kaysrubio committed
Commit 0f22783 · verified · 1 Parent(s): 3662cdf

Update README.md

Files changed (1)
  1. README.md +2 -7
README.md CHANGED
@@ -24,30 +24,25 @@ The goal of this project is to create an accent classifier for people who learne
 
 ## How to use this model on an audio file
 
+```
 from huggingface_hub import notebook_login
-
 notebook_login()
 
 from transformers import pipeline
-
 pipe = pipeline("audio-classification", model="kaysrubio/accent-id-distilhubert-finetuned-l2-arctic2")
 
 import torch
-
 import torchaudio
 
 audio, sr = torchaudio.load('path_to_file/audio.wav') # Load audio, make sure it is mono, not stereo
-
 audio = torchaudio.transforms.Resample(orig_freq=sr, new_freq=16000)(audio)
-
 audio = audio.squeeze().numpy()
 
 result = pipe(audio, top_k=6)
 
 print(result)
-
 print('First language of this speaker is predicted to be ' + result[0]['label'] + ' with ' + str(result[0]['score']*100) + '% confidence')
-
+```
 ## Intended uses & limitations
 
 The model is very accurate for novel recordings from the original dataset that were not used for train/test. However, the model is not accurate for voices from outside the dataset. Unfortunately, with only 24 speakers represented, the model appears to have memorized other characteristics of these voices besides accent, so it does not generalize well to real-world audio.
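
For reference, here is a self-contained sketch of the usage snippet added in the diff above. The explicit stereo-to-mono downmix (mean over channels) is an assumption added for illustration; the README's comment only says the audio must already be mono. The `notebook_login()` step is omitted on the assumption that the checkpoint can be downloaded without authentication.

```
# Minimal sketch (assumptions noted above): load a WAV file, downmix to mono,
# resample to 16 kHz, and classify the accent with the fine-tuned checkpoint.
import torchaudio
from transformers import pipeline

pipe = pipeline(
    "audio-classification",
    model="kaysrubio/accent-id-distilhubert-finetuned-l2-arctic2",
)

# torchaudio.load returns a (channels, samples) float tensor and the sample rate.
audio, sr = torchaudio.load("path_to_file/audio.wav")

# If the recording is stereo (or multi-channel), average the channels to get mono.
if audio.shape[0] > 1:
    audio = audio.mean(dim=0, keepdim=True)

# Resample to the 16 kHz rate the model expects, then drop the channel dimension.
audio = torchaudio.transforms.Resample(orig_freq=sr, new_freq=16000)(audio)
audio = audio.squeeze().numpy()

# top_k=6 mirrors the README's example and returns the six highest-scoring labels,
# each as a dict with 'label' and 'score' keys.
result = pipe(audio, top_k=6)
print(result)
print(f"Predicted first language: {result[0]['label']} "
      f"({result[0]['score'] * 100:.1f}% confidence)")
```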