Compatible small models for speculative decoding?
#9 opened 12 months ago
by
treehugg3
How many GPU ram needed?
1
#8 opened about 1 year ago
by
RaidXD
q8 with 8 part
#7 opened about 1 year ago
by
sdyy
Q6_K vs. Q5_K_L
3
#6 opened about 1 year ago
by
AIGUYCONTENT
Unable to pull in from Ollama
5
#3 opened about 1 year ago
by
AIGUYCONTENT
Observation: 4-bit quantization can't answer the Strawberry prompt
12
#2 opened about 1 year ago
by
ThePabli
Nemotron 51B too please
4
#1 opened about 1 year ago
by
nacs