Update README.md
README.md CHANGED

@@ -34,15 +34,6 @@ inference: false
 **Description:**
 The motivation behind these quantizations was that latestissue's quants were missing the 0.1B and 0.4B models. The rest of the models can be found here: [latestissue/rwkv-4-world-ggml-quantized](https://huggingface.co/latestissue/rwkv-4-world-ggml-quantized)

- **RAM usage (WIP):**
- Model | Startup RAM usage (KoboldCpp)
- :--:|:--:
- pygmalion-6b-dev.q4_0.bin | 3.7 GiB
- pygmalion-6b-dev.q4_1.bin | 4.1 GiB
- pygmalion-6b-dev.q5_0.bin | 4.4 GiB
- pygmalion-6b-dev.q5_1.bin | 4.8 GiB
- pygmalion-6b-dev.q8_0.bin | 6.5 GiB
-
 **Notes:**
 - rwkv.cpp [[0df970a]](https://github.com/saharNooby/rwkv.cpp/tree/0df970a6adddd4b938795f92e660766d1e2c1c1f) was used for conversion and quantization. First they were converted to f16 ggml files, then quantized.
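As context for the "converted to f16 ggml files, then quantized" note above, the two steps map onto the helper scripts shipped with rwkv.cpp. The sketch below assumes the script names and argument order of the current rwkv.cpp repo (`rwkv/convert_pytorch_to_ggml.py` and `rwkv/quantize.py`); the exact format names and flags may differ at the pinned commit 0df970a, and the file names are placeholders, not the actual checkpoints used here.

```bash
# Minimal sketch of the two-step pipeline (names/flags are assumptions;
# check the rwkv.cpp README at the pinned commit for the exact invocation).

# 1) Convert the PyTorch .pth checkpoint to an f16 ggml file.
python rwkv/convert_pytorch_to_ggml.py rwkv-4-world-0.1b.pth rwkv-4-world-0.1b-f16.bin FP16

# 2) Quantize the f16 ggml file to the desired format (here Q5_1).
python rwkv/quantize.py rwkv-4-world-0.1b-f16.bin rwkv-4-world-0.1b-q5_1.bin Q5_1
```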