Update README.md
README.md CHANGED

@@ -34,15 +34,6 @@ inference: false
 **Description:**
 The motivation behind these quantizations was that latestissue's quants were missing the 0.1B and 0.4B models. The rest of the models can be found here: [latestissue/rwkv-4-world-ggml-quantized](https://huggingface.co/latestissue/rwkv-4-world-ggml-quantized)

- **RAM usage (WIP):**
- Model | Startup RAM usage (KoboldCpp)
- :--:|:--:
- pygmalion-6b-dev.q4_0.bin | 3.7 GiB
- pygmalion-6b-dev.q4_1.bin | 4.1 GiB
- pygmalion-6b-dev.q5_0.bin | 4.4 GiB
- pygmalion-6b-dev.q5_1.bin | 4.8 GiB
- pygmalion-6b-dev.q8_0.bin | 6.5 GiB
-
 **Notes:**
 - rwkv.cpp [[0df970a]](https://github.com/saharNooby/rwkv.cpp/tree/0df970a6adddd4b938795f92e660766d1e2c1c1f) was used for conversion and quantization. First they were converted to f16 ggml files, then quantized.
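As context for the "converted to f16 ggml files, then quantized" note above, the two steps map onto the helper scripts shipped with rwkv.cpp. The sketch below assumes the script names and argument order of the current rwkv.cpp repo (`rwkv/convert_pytorch_to_ggml.py` and `rwkv/quantize.py`); the exact format names and flags may differ at the pinned commit 0df970a, and the file names are placeholders, not the actual checkpoints used here.

```bash
# Minimal sketch of the two-step pipeline (names/flags are assumptions;
# check the rwkv.cpp README at the pinned commit for the exact invocation).

# 1) Convert the PyTorch .pth checkpoint to an f16 ggml file.
python rwkv/convert_pytorch_to_ggml.py rwkv-4-world-0.1b.pth rwkv-4-world-0.1b-f16.bin FP16

# 2) Quantize the f16 ggml file to the desired format (here Q5_1).
python rwkv/quantize.py rwkv-4-world-0.1b-f16.bin rwkv-4-world-0.1b-q5_1.bin Q5_1
```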