bpe.model mismatch with the released model
The released model's tokens should include CN-char and EN-bpe, totaling 6254, yet the bpe model attached to this repo is still the pure English one (bpe500), so I can't decode. Could you please kindly release the matching bpe model?
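A quick way to confirm the mismatch is to compare the vocabulary size of the bundled bpe.model with the row count of the checkpoint's decoder embedding. Below is a minimal sketch; the helper name is mine, and the commented sentencepiece/torch snippet and file paths are only illustrative:

```python
def vocab_matches(bpe_vocab_size: int, ckpt_embedding_rows: int) -> bool:
    """Return True when the BPE vocabulary size equals the
    checkpoint's decoder embedding row count."""
    return bpe_vocab_size == ckpt_embedding_rows


# With sentencepiece and torch installed, the real sizes could be read
# roughly like this (paths are hypothetical):
#   import sentencepiece as spm
#   sp = spm.SentencePieceProcessor(model_file="data/lang_char_bpe/bpe.model")
#   bpe_size = sp.get_piece_size()
#   import torch
#   ckpt = torch.load("pretrained.pt", map_location="cpu")
#   ckpt_rows = ckpt["model"]["decoder.embedding.weight"].shape[0]

# The bundled bpe500 model has 500 pieces, while the released
# checkpoint's embedding has 6254 rows:
print(vocab_matches(500, 6254))   # mismatch
print(vocab_matches(6254, 6254))  # match
```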
Thanks
- the bpe model has been released at https://huggingface.co/pfluo/k2fsa-zipformer-chinese-english-mixed/blob/main/data/lang_char_bpe/bpe.model
- you are free to use this model with sherpa-onnx: https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english
@monkey369
The released model's tokens should include CN-char and EN-bpe, totaling 6254, yet the bpe model attached to this repo is still the pure English one
RuntimeError: Error(s) in loading state_dict for Transducer:
size mismatch for encoder.encoders.2.encoder.layers.0.feed_forward1.in_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([2048, 384]).
...
size mismatch for decoder.embedding.weight: copying a param with shape torch.Size([6254, 512]) from checkpoint, the shape in current model is torch.Size([500, 512]).
...
Please check that the key parameters of the uploaded checkpoint["model"], such as encoder_dim and vocab_size, match the current model, especially the bpe.model.
Suggestion: add validation for the checkpoint architecture (encoder_dim, vocab_size, etc.), especially for bpe.model.
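Such a validation could run just before load_state_dict and turn the raw size-mismatch traceback into a clear report. A minimal sketch, assuming the shapes have been extracted into plain dicts; the function name and dict layout are illustrative, not icefall's actual API:

```python
def find_shape_mismatches(model_shapes: dict, ckpt_shapes: dict) -> list:
    """Compare parameter shapes and collect human-readable mismatch reports.

    Both arguments map parameter names to shape tuples, as produced by
    {k: tuple(v.shape) for k, v in state_dict.items()}.
    """
    problems = []
    for name, ckpt_shape in ckpt_shapes.items():
        if name not in model_shapes:
            problems.append(f"unexpected key in checkpoint: {name}")
        elif model_shapes[name] != ckpt_shape:
            problems.append(
                f"size mismatch for {name}: checkpoint {ckpt_shape} "
                f"vs current model {model_shapes[name]}"
            )
    for name in model_shapes:
        if name not in ckpt_shapes:
            problems.append(f"missing key in checkpoint: {name}")
    return problems


# Shapes taken from the error above: the checkpoint was trained with
# vocab_size=6254, but the current model was built with vocab_size=500.
model = {"decoder.embedding.weight": (500, 512)}
ckpt = {"decoder.embedding.weight": (6254, 512)}
for msg in find_shape_mismatches(model, ckpt):
    print(msg)
```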
Thanks
@pfluo How can I fine-tune this model? Are there recipes in https://github.com/k2-fsa/icefall?
Hi
I see! The author decodes wavs with sherpa-onnx as described at https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english, and that path uses tokens.txt instead of bpe.model.
However, when I follow the instructions at https://k2-fsa.github.io/icefall/recipes/Streaming-ASR/librispeech/zipformer_transducer.html and decode wavs with decode.py, that path requires bpe.model.
The bpe.model at https://huggingface.co/pfluo/k2fsa-zipformer-chinese-english-mixed/blob/main/data/lang_char_bpe/bpe.model has vocab_size=500, but the released model's vocab_size is 6254. Could you please kindly check and upload the vocab_size=6254 one?
Thanks