CUDA requirement?

by stqc - opened 19 days ago

Is it possible to run this model without the CUDA requirement, say on Apple Silicon? Would love to get some feedback on this.

szuweifu

NVIDIA org 16 days ago

To my knowledge, Mamba uses custom CUDA/Triton kernels for acceleration, so NVIDIA GPUs provide the best performance. On Apple Silicon the model merely runs, without that acceleration. For setup and execution, you can refer to this repository: https://github.com/RoyChao19477/RE-USE-MPS
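For readers who want to check which backend their machine exposes before trying the repository above, here is a minimal sketch in Python. It assumes PyTorch is the framework in use. The helper names `pick_device` and `detect_device` are hypothetical and come from neither the model code nor the linked repository. It only reports which device PyTorch can see; it does not guarantee that the Mamba kernels work on that device.

```python
import importlib.util


def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's Metal backend (MPS), then plain CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch what is available; fall back to CPU if PyTorch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # torch.backends.mps exists only in newer PyTorch releases, so guard it.
    mps = getattr(torch.backends, "mps", None)
    mps_available = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_available)


if __name__ == "__main__":
    print(detect_device())
```

On an Apple Silicon Mac with a recent PyTorch build this would typically report `mps`, but any layer that depends on CUDA-only kernels would still need a fallback implementation such as the one in the repository linked above.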

altunenes

13 days ago • edited 13 days ago

Is it possible to run this model without CUDA requirements, say on Apple Silicon? Would love to get some feedback on this.

I exported it to ONNX to see how it would run on CPU and on Metal (via WebGPU) from Rust, on a MacBook Air M3 with 16 GB. It's slow, very slow. It's not practical to use without a decent GPU.
That said:
https://github.com/microsoft/onnxruntime/issues/27796
It seems they merged that (I haven't tested the new version yet), so I believe future versions could run a little faster.
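
For anyone repeating this experiment from Python rather than Rust, here is a rough sketch of choosing an ONNX Runtime execution provider with a CPU fallback. Several things in it are assumptions: the preference order, the helper names `choose_providers` and `make_session`, and the use of the CoreML provider as the Apple path. That is a different route from the WebGPU setup described above. The model path is whatever file your own export produced.

```python
from typing import List, Sequence

# Assumed preference order, for illustration only.
PREFERRED = (
    "CUDAExecutionProvider",
    "CoreMLExecutionProvider",
    "CPUExecutionProvider",
)


def choose_providers(
    available: Sequence[str], preferred: Sequence[str] = PREFERRED
) -> List[str]:
    """Keep the preferred providers that are actually available, in order."""
    chosen = [p for p in preferred if p in available]
    # Always end up with something runnable.
    return chosen or ["CPUExecutionProvider"]


def make_session(model_path: str):
    """Create an inference session; onnxruntime is imported lazily."""
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Timing a few runs with each provider in turn is a simple way to confirm whether a newer ONNX Runtime release actually improves things on a given machine.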

szuweifu changed discussion status to closed 2 days ago
