Please add Q3_K_M or Q3_K_S quants
#1 opened by ThiloteE
I only have 12 GB of VRAM (NVIDIA GeForce RTX 3060) available right now. If you upload a Q3 quant, I will try this model tomorrow with your pull request in the server version of llama.cpp. (But don't expect much coding feedback from me. C++ and Python ain't my forte at present.)
Thank you!
ThiloteE changed discussion status to closed