When using xinference to launch I hit the error "Cannot handle batch sizes > 1 if no padding token is defined."

#4
by langbj - opened

Xinference supports jinaai/jina-reranker-v2; see https://inference.readthedocs.io/en/latest/models/builtin/rerank/jina-reranker-v2.html

But when using v3, I hit the error "Cannot handle batch sizes > 1 if no padding token is defined." I guess this is an Xinference problem, and I have opened an issue against them: https://github.com/xorbitsai/inference/issues/4144

But I wonder if there is any quick fix on the v3 side, e.g. some special model configuration, so I can make it work with Xinference?
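For context, this error comes from batching: with more than one variable-length sequence and no pad token defined, the tokenized inputs cannot be filled out to a rectangular tensor. A minimal pure-Python sketch of that failure mode (the function name `pad_batch` is hypothetical, not part of any library):

```python
def pad_batch(sequences, pad_token_id=None):
    """Pad a batch of token-id sequences to equal length.

    Mirrors, in simplified form, why Transformers raises
    "Cannot handle batch sizes > 1 if no padding token is defined":
    with more than one sequence and no pad token, shorter sequences
    cannot be extended to match the longest one.
    """
    if len(sequences) > 1 and pad_token_id is None:
        raise ValueError(
            "Cannot handle batch sizes > 1 if no padding token is defined."
        )
    max_len = max(len(seq) for seq in sequences)
    # Right-pad every sequence with the pad token up to max_len.
    return [seq + [pad_token_id] * (max_len - len(seq)) for seq in sequences]
```

In practice, a common workaround (assuming you can reach the tokenizer/model config, which may not be exposed through Xinference) is to assign an existing special token as the pad token before batching, e.g. `tokenizer.pad_token = tokenizer.eos_token` and `model.config.pad_token_id = tokenizer.pad_token_id` in Transformers; whether Xinference lets you inject that is exactly the open question here.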

Thanks

Jina AI org

Thanks for reaching out. It seems Xinference has fixed this issue in https://github.com/xorbitsai/inference/pull/4156. Thank you again for your effort. Xinference users can now use jina-reranker-v3 😊.

Hi, thank you for your reply. I tried their latest update, which claims to support jina-reranker-v3, but I still hit the error "Model jina-reranker-v3 cannot be run on engine sentence_transformers."

I checked their implementation and found that they did implement it on sentence_transformers. So is this a bug?

Jina AI org

I have no idea. You could raise an issue with Xinference.

numb3r3 changed discussion status to closed
