Please add Q3_K_M or Q3_K_S quants

by ThiloteE - opened Jul 27

Jul 27


I only have 12 GB of VRAM (Nvidia GeForce RTX 3060) available right now. If you upload a Q3, I will try this model tomorrow with your pull request in the server version of llama.cpp. (But don't expect much coding feedback from me. C++ and Python ain't my forte at present.)

ThiloteE

Jul 28

Thank you!

ThiloteE changed discussion status to closed Jul 28
