Please add Q3_K_M or Q3_K_S quants

#1
by ThiloteE - opened

I only have 12 GB of VRAM (Nvidia GeForce RTX 3060) available right now. If you upload a Q3 quant, I will try this model tomorrow with your pull request in the server version of llama.cpp. (But don't expect much coding feedback from me. C++ and Python ain't my forte at the moment.)

Thank you!

ThiloteE changed discussion status to closed
