This should fit on two RTX 3090s on Windows with an 18-19/24 GB GPU split and 6k-8k context. Uses the new ExLlamaV2 (EXL2) quant format.
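
For reference, here is a minimal loading sketch using the exllamav2 Python library. The model path is a placeholder, and the split values (GB reserved per GPU) follow the 18-19/24 split mentioned above; adjust both for your setup. This is a sketch of one way to load the quant, not an official recipe.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer

# Placeholder path: point this at the downloaded EXL2 quant directory
config = ExLlamaV2Config("/path/to/model")
config.max_seq_len = 8192  # 6k-8k context should fit with this split

model = ExLlamaV2(config)
# gpu_split is a list of GB to allocate per GPU: ~18-19 on GPU 0, 24 on GPU 1
model.load(gpu_split=[18, 24])

cache = ExLlamaV2Cache(model)
tokenizer = ExLlamaV2Tokenizer(config)
```

Lowering `max_seq_len` or the first split value frees VRAM on GPU 0 if you hit out-of-memory errors during loading.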
