bpe.model mismatch with the released model
The released model's tokens should include CN-char and EN-bpe, totaling 6254, yet the bpe model attached to this repo is still the pure English one (bpe500), so I can't decode. Could you please kindly release the matching bpe model?
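A quick way to confirm the mismatch is to compare the vocabulary size of the bundled bpe.model with the row count of the checkpoint's decoder embedding. Below is a minimal sketch; the helper name is mine, and the commented sentencepiece/torch snippet and file paths are only illustrative:

```python
def vocab_matches(bpe_vocab_size: int, ckpt_embedding_rows: int) -> bool:
    """Return True when the BPE vocabulary size equals the
    checkpoint's decoder embedding row count."""
    return bpe_vocab_size == ckpt_embedding_rows


# With sentencepiece and torch installed, the real sizes could be read
# roughly like this (paths are hypothetical):
#   import sentencepiece as spm
#   sp = spm.SentencePieceProcessor(model_file="data/lang_char_bpe/bpe.model")
#   bpe_size = sp.get_piece_size()
#   import torch
#   ckpt = torch.load("pretrained.pt", map_location="cpu")
#   ckpt_rows = ckpt["model"]["decoder.embedding.weight"].shape[0]

# The bundled bpe500 model has 500 pieces, while the released
# checkpoint's embedding has 6254 rows:
print(vocab_matches(500, 6254))   # mismatch
print(vocab_matches(6254, 6254))  # match
```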
Thanks
- the bpe model has been released at https://huggingface.co/pfluo/k2fsa-zipformer-chinese-english-mixed/blob/main/data/lang_char_bpe/bpe.model
- you are free to use this model with sherpa-onnx: https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english
@monkey369
The released model's tokens should include CN-char and EN-bpe, totaling 6254, yet the bpe model attached to this repo is still the pure English one
RuntimeError: Error(s) in loading state_dict for Transducer:
size mismatch for encoder.encoders.2.encoder.layers.0.feed_forward1.in_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([2048, 384]).
...
size mismatch for decoder.embedding.weight: copying a param with shape torch.Size([6254, 512]) from checkpoint, the shape in current model is torch.Size([500, 512]).
...
Please check that the key parameters of the uploaded checkpoint["model"], such as encoder_dim and vocab_size, match the current model, especially the bpe.model.
Suggestion: add validation for the checkpoint architecture (encoder_dim, vocab_size, etc.), especially for bpe.model.
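Such a validation could run just before load_state_dict and turn the raw size-mismatch traceback into a clear report. A minimal sketch, assuming the shapes have been extracted into plain dicts; the function name and dict layout are illustrative, not icefall's actual API:

```python
def find_shape_mismatches(model_shapes: dict, ckpt_shapes: dict) -> list:
    """Compare parameter shapes and collect human-readable mismatch reports.

    Both arguments map parameter names to shape tuples, as produced by
    {k: tuple(v.shape) for k, v in state_dict.items()}.
    """
    problems = []
    for name, ckpt_shape in ckpt_shapes.items():
        if name not in model_shapes:
            problems.append(f"unexpected key in checkpoint: {name}")
        elif model_shapes[name] != ckpt_shape:
            problems.append(
                f"size mismatch for {name}: checkpoint {ckpt_shape} "
                f"vs current model {model_shapes[name]}"
            )
    for name in model_shapes:
        if name not in ckpt_shapes:
            problems.append(f"missing key in checkpoint: {name}")
    return problems


# Shapes taken from the error above: the checkpoint was trained with
# vocab_size=6254, but the current model was built with vocab_size=500.
model = {"decoder.embedding.weight": (500, 512)}
ckpt = {"decoder.embedding.weight": (6254, 512)}
for msg in find_shape_mismatches(model, ckpt):
    print(msg)
```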
Thanks
@pfluo How can I fine-tune this model? Are there recipes in https://github.com/k2-fsa/icefall?
Hi
I see! The author decodes wavs with sherpa-onnx as described at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english, and that path uses tokens.txt instead of bpe.model.
However, when I follow the instructions at https://k2-fsa.github.io/icefall/recipes/Streaming-ASR/librispeech/zipformer_transducer.html and decode wavs with decode.py, that path requires bpe.model.
The bpe.model at https://huggingface.co/pfluo/k2fsa-zipformer-chinese-english-mixed/blob/main/data/lang_char_bpe/bpe.model has vocab_size=500, but the released model's vocab_size is 6254. Could you please kindly check and upload the vocab_size=6254 one?
Thanks