Commit cf6765b (verified) by Melvin56 · 1 Parent(s): 36db42d

Update README.md

Files changed (1)
  1. README.md +29 -3
README.md CHANGED
@@ -13,16 +13,42 @@ tags:
  ---
  # Melvin56/Qwen3-4B-ik_GGUF

- Quant for [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)

- Original Model : [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B)

- Llama.cpp build: 36e6e888 (3681)

  I used imatrix to create all these quants using this [Dataset](https://gist.github.com/tristandruyen/9e207a95c7d75ddf37525d353e00659c/#file-calibration_data_v5_rc-txt).

  <img src="Qwen3-4B.png" alt="Perplexity" width="600">

  ---

  | | CPU (AVX2) | CPU (ARM NEON) | Metal | cuBLAS | rocBLAS | SYCL | CLBlast | Vulkan | Kompute |
 
  ---
  # Melvin56/Qwen3-4B-ik_GGUF

+ # Quant for [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp)

+ Build: 3680 (a2d24c97)

+ Original Model : [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B)

  I used imatrix to create all these quants using this [Dataset](https://gist.github.com/tristandruyen/9e207a95c7d75ddf37525d353e00659c/#file-calibration_data_v5_rc-txt).

  <img src="Qwen3-4B.png" alt="Perplexity" width="600">

+ <details>
+ <summary>Perplexity measurement methodology.</summary>
+
+ I tested all quants using ik_llama.cpp build 3680 (a2d24c97)
+ ```
+ ik_llama.cpp/build/bin/llama-perplexity \
+ -m .gguf \
+ --ctx-size 512 \
+ --ubatch-size 512 \
+ -f wikitext-2-raw/wiki.test.raw \
+ -fa \
+ -ngl 999
+ ```
+
+ # Raw data
+
+ | Quant | Size (GB) | PPL |
+ |---------|-----------|---------------------|
+ | BF16 | 8.05 | 14.3308 +/- 0.13259 |
+ | IQ6_K | 3.34 | 14.2810 +/- 0.13159 |
+ | IQ5_K | 2.82 | 14.5004 +/- 0.13465 |
+ | IQ4_K | 2.38 | 14.5280 +/- 0.13414 |
+ | IQ4_KS | 2.22 | 15.2121 +/- 0.14294 |
+
+ </details>
+
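
As a reading aid for the raw data table added in this commit, here is a minimal Python sketch that compares each quant against the BF16 row. The numbers are copied from the table; the function name `compare_to_baseline` and the "within error" rule (gap no larger than the sum of the two reported +/- values) are assumptions made for illustration and are not part of the commit or of ik_llama.cpp.

```python
# Values copied from the "Raw data" table:
# (quant, size in GB, PPL, reported +/- value).
RAW = [
    ("BF16",   8.05, 14.3308, 0.13259),
    ("IQ6_K",  3.34, 14.2810, 0.13159),
    ("IQ5_K",  2.82, 14.5004, 0.13465),
    ("IQ4_K",  2.38, 14.5280, 0.13414),
    ("IQ4_KS", 2.22, 15.2121, 0.14294),
]


def compare_to_baseline(rows, baseline="BF16"):
    """Return size ratio, PPL change in percent, and a rough overlap flag per quant."""
    _, base_size, base_ppl, base_err = next(r for r in rows if r[0] == baseline)
    results = []
    for name, size, ppl, err in rows:
        delta = ppl - base_ppl
        results.append({
            "quant": name,
            "size_ratio": size / base_size,
            "ppl_change_pct": 100.0 * delta / base_ppl,
            # Assumption: compare the gap against the sum of both +/- values.
            # This is a crude check, not a proper significance test.
            "within_error": abs(delta) <= err + base_err,
        })
    return results


if __name__ == "__main__":
    for row in compare_to_baseline(RAW):
        print(
            f"{row['quant']:<7} size x{row['size_ratio']:.2f}  "
            f"PPL {row['ppl_change_pct']:+.2f}%  "
            f"within error: {row['within_error']}"
        )
```

Under that rough rule, IQ6_K, IQ5_K, and IQ4_K stay within the combined error of BF16, while IQ4_KS (about 6% higher PPL at roughly 28% of the BF16 size) does not. The slightly lower PPL of IQ6_K relative to BF16 falls inside the reported uncertainty, so it should probably not be read as a real improvement.
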
  ---

  | | CPU (AVX2) | CPU (ARM NEON) | Metal | cuBLAS | rocBLAS | SYCL | CLBlast | Vulkan | Kompute |