When using xinference to launch I hit the error "Cannot handle batch sizes > 1 if no padding token is defined."

#4
by langbj - opened

Xinference supports jinaai/jina-reranker-v2; see https://inference.readthedocs.io/en/latest/models/builtin/rerank/jina-reranker-v2.html

But when using v3, I hit the error "Cannot handle batch sizes > 1 if no padding token is defined." I guess this is an Xinference problem, and I have opened an issue against them: https://github.com/xorbitsai/inference/issues/4144

But I wonder if there is any quick fix on the v3 side, e.g. some special model configuration, so I can make it work with Xinference?
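For context, this error comes from batching: with more than one variable-length sequence and no pad token defined, the tokenized inputs cannot be filled out to a rectangular tensor. A minimal pure-Python sketch of that failure mode (the function name `pad_batch` is hypothetical, not part of any library):

```python
def pad_batch(sequences, pad_token_id=None):
    """Pad a batch of token-id sequences to equal length.

    Mirrors, in simplified form, why Transformers raises
    "Cannot handle batch sizes > 1 if no padding token is defined":
    with more than one sequence and no pad token, shorter sequences
    cannot be extended to match the longest one.
    """
    if len(sequences) > 1 and pad_token_id is None:
        raise ValueError(
            "Cannot handle batch sizes > 1 if no padding token is defined."
        )
    max_len = max(len(seq) for seq in sequences)
    # Right-pad every sequence with the pad token up to max_len.
    return [seq + [pad_token_id] * (max_len - len(seq)) for seq in sequences]
```

In practice, a common workaround (assuming you can reach the tokenizer/model config, which may not be exposed through Xinference) is to assign an existing special token as the pad token before batching, e.g. `tokenizer.pad_token = tokenizer.eos_token` and `model.config.pad_token_id = tokenizer.pad_token_id` in Transformers; whether Xinference lets you inject that is exactly the open question here.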

Thanks

Jina AI org

Thanks for reaching out. It seems Xinference has fixed this issue in https://github.com/xorbitsai/inference/pull/4156. Thank you again for your effort. Xinference users can now use jina-reranker-v3 😊.

Hi, thank you for your reply. I tried their latest update, which claims to support jina-reranker-v3, but I still hit the error "Model jina-reranker-v3 cannot be run on engine sentence_transformers."

I checked their implementation and found that they did implement it on sentence_transformers. So is this a bug?

Jina AI org

I have no idea. You could raise an issue with Xinference.

numb3r3 changed discussion status to closed
