CUDA requirement?

#1
by stqc - opened

Is it possible to run this model without CUDA, say on Apple Silicon? I would love to get some feedback on this.

NVIDIA org

To my knowledge, Mamba uses custom Triton/CUDA kernels for acceleration, so NVIDIA GPUs provide the best performance. On Apple Silicon the model is runnable, but without that acceleration. You can refer to this repository for setup and execution: https://github.com/RoyChao19477/RE-USE-MPS
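For anyone trying this on a Mac, here is a minimal sketch of picking a device in a PyTorch-based setup, preferring CUDA, then Apple's MPS backend, then CPU. The helper names `pick_device` and `detect_device` are hypothetical and not part of this model's code, and selecting "mps" only decides where tensors live; it does not make the CUDA kernels mentioned above available.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name given which backends are usable."""
    if cuda_available:
        return "cuda"  # NVIDIA GPU: the accelerated path
    if mps_available:
        return "mps"   # Apple Silicon GPU via Metal Performance Shaders
    return "cpu"       # fallback: works everywhere, slowest


def detect_device() -> str:
    """Probe PyTorch for available backends; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps only exists in newer PyTorch releases, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(f"Selected device: {detect_device()}")
```

The returned string can be passed to `torch.device(...)` or `.to(...)`. Whether the model itself runs correctly on that device still depends on the setup described in the repository linked above.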

is it possible to run this model without CUDA requirements, say on Apple Silicon? would love to get some feedback on this

I exported it to ONNX to see how it would run on CPU and on Metal (via WebGPU) from Rust, on a MacBook Air M3 with 16 GB. It is slow, very slow. It is not practical to use without a decent GPU.
That said:
https://github.com/microsoft/onnxruntime/issues/27796
It seems they merged that (I haven't tested it with the new version yet), so I believe it could run a little faster in future versions.

szuweifu changed discussion status to closed
