ZeroWw/gemma-2b-it-GGUF

ZeroWw committed on Jul 31, 2024

Commit 8e6f2b7 (verified)

1 Parent(s): 1f9d0aa

Upload folder using huggingface_hub

Files changed (1)

README.md ADDED

+---
+license: mit
+language:
+- en
+pipeline_tag: text-generation
+---
+My own (ZeroWw) quantizations.
+The output and embedding tensors are quantized to f16.
+All other tensors are quantized to q5_k or q6_k.
+Result:
+Both the f16.q6 and f16.q5 files are smaller than the standard q8_0 quantization,
+and they perform as well as pure f16.
+Updated on: Wed Jul 31, 21:04:14
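
The README does not say how the files were produced. As a rough sketch, a mix like the one it describes (f16 output and embedding tensors, q5_k or q6_k for everything else) could be requested from llama.cpp's `llama-quantize` tool through its `--output-tensor-type` and `--token-embedding-type` options. The helper name `build_quantize_command` and the file names below are hypothetical, and whether the author used this exact invocation is an assumption.

```python
import shlex
from typing import List

# Quantization types the README names for the non-embedding, non-output tensors.
ALLOWED_BODY_TYPES = ("q5_k", "q6_k")


def build_quantize_command(
    src_gguf: str,
    dst_gguf: str,
    body_type: str = "q6_k",
    quantize_bin: str = "llama-quantize",  # assumed to be on PATH
) -> List[str]:
    """Build an argv list for llama-quantize (hypothetical helper).

    The output and token-embedding tensors are kept at f16, and every
    other tensor is quantized to `body_type`.
    """
    if body_type.lower() not in ALLOWED_BODY_TYPES:
        raise ValueError(f"body_type must be one of {ALLOWED_BODY_TYPES}")
    return [
        quantize_bin,
        "--output-tensor-type", "f16",
        "--token-embedding-type", "f16",
        src_gguf,
        dst_gguf,
        body_type.upper(),  # e.g. Q6_K
    ]


if __name__ == "__main__":
    # File names are placeholders, not the names used in the repository.
    cmd = build_quantize_command(
        "gemma-2b-it.f16.gguf", "gemma-2b-it.f16.q6.gguf", "q6_k"
    )
    # Print the command for review; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps paths with spaces intact when it is later passed to `subprocess.run`.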