Deployment support

#4
by Arkduke - opened

I’m trying to run the model on GPU but am facing version compatibility issues. Could you please provide the compatible versions of the following components: cuDNN, NumPy, PyTorch, ONNX Runtime (GPU), CUDA, Transformers, and Accelerate?

This is with the latest versions of the libraries. I am running in a Docker container with the image nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04:
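Not part of the original post, but a minimal sketch that may help pin down the incompatibility: it reports the exact installed version of each library in question from inside the container (stdlib only, so it runs even if some packages are missing).

```python
import importlib.metadata as md

def report_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            versions[pkg] = None
    return versions

# Print the versions relevant to this compatibility question.
for name, version in report_versions(
    ["torch", "onnxruntime-gpu", "numpy", "transformers", "accelerate"]
).items():
    print(name, version or "not installed")
```

Posting this output alongside the error screenshot usually makes version conflicts much easier to diagnose.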
(error screenshot attached)

This happens when I pin the onnxruntime-gpu version to 1.17.0:
(error screenshot attached)

I converted the model to ONNX here: https://github.com/deepanshu-yadav/Quantize_speech_Recognition_For_Hindi
That repo is for Hindi, but performing inference for the other languages is easy too: just work out the language's offset into the 5632 tokens in total, with each language taking 256 tokens.
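The offset arithmetic described above can be sketched as follows. Note the function name and the assumption that language blocks are laid out contiguously by index are mine; the actual index for a given language depends on the model's vocabulary layout.

```python
TOKENS_PER_LANGUAGE = 256
TOTAL_TOKENS = 5632  # 5632 / 256 = 22 contiguous language blocks

def language_offset(language_index):
    """Start of a language's 256-token block in the flattened vocabulary."""
    if not 0 <= language_index < TOTAL_TOKENS // TOKENS_PER_LANGUAGE:
        raise ValueError("language_index out of range")
    return language_index * TOKENS_PER_LANGUAGE

# e.g. the third language block (index 2) starts at token 512
print(language_offset(2))  # -> 512
```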

I want to deploy the multilingual model and am facing issues with GPU inference. Please help me with this.
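As a starting point for the GPU-inference side, here is a minimal session sketch (not from the original thread; the model path and helper name are placeholders). ONNX Runtime accepts an ordered provider list, so it falls back to CPU if the CUDA provider fails to load:

```python
PREFERRED_PROVIDERS = ["CUDAExecutionProvider", "CPUExecutionProvider"]

def make_session(model_path, providers=None):
    """Create an ONNX Runtime session, preferring CUDA and falling back to CPU."""
    import onnxruntime as ort  # needs the onnxruntime-gpu package for CUDA
    return ort.InferenceSession(
        model_path, providers=providers or PREFERRED_PROVIDERS
    )
```

After creating the session, `session.get_providers()` shows which provider was actually loaded, which is a quick way to confirm whether inference is really running on the GPU.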
