Experimental global target bits-per-weight quantization of coder3101/Qwen3.5-0.8B-heretic

  • Quantized with a non-standard (forked) llama.cpp branch.
  • KLD evaluation and imatrix calibration datasets for the GGUF models were built with a CLI tool, sourced from eaddario/imatrix-calibration.
  • Dataset sources: tools, text_en, text_ru.
  • Dataset chunks: 250.
  • A small set of patches is included.
  • Tensor quantization uses F16 instead of BF16, making the files friendly to NVIDIA Pascal architecture GPUs such as the P100.
  • Multimodal works perfectly.
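The pipeline above can be sketched roughly as follows. This is a hypothetical sketch, not the fork's exact invocation: the tool names come from upstream llama.cpp (`llama-imatrix`, `llama-quantize`), the fork's flags for a global bits-per-weight target may differ, and the parameter count and target BPW are illustrative numbers only.

```shell
PARAMS=800000000   # ~0.8B parameters (approximate, for illustration)
TARGET_BPW=6       # example global bits-per-weight target

# Back-of-the-envelope expected file size for that target, ignoring
# GGUF metadata overhead: params * bpw / 8 bits-per-byte, in MiB.
SIZE_MIB=$(( PARAMS * TARGET_BPW / 8 / 1024 / 1024 ))
echo "expected size: ~${SIZE_MIB} MiB"

# 1. Build the importance matrix from the calibration set
#    (tools + text_en + text_ru chunks concatenated into calibration.txt):
# llama-imatrix -m model-F16.gguf -f calibration.txt -o imatrix.gguf

# 2. Quantize using the imatrix (quant type shown is an example):
# llama-quantize --imatrix imatrix.gguf model-F16.gguf model-quant.gguf Q6_K
```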

Many thanks to Ed Addario for his impressive work.

Quantization comparison

| BPW | PPL correlation | PPL mean ratio | ΔPPL | Mean KLD | Maximum KLD | 99.9% KLD | Mean Δp | RMS Δp |
|-----|-----------------|----------------|------|----------|-------------|-----------|---------|--------|
| 5.00 | 99.29% | 1.059526 ± 0.001643 | 1.756728 ± 0.051467 | 0.058415 ± 0.000403 | 4.997534 | 1.319774 | -1.158 ± 0.023 % | 5.922 ± 0.069 % |
| 5.25 | 99.45% | 1.046483 ± 0.001431 | 1.371810 ± 0.044801 | 0.045465 ± 0.000307 | 8.604090 | 0.824039 | -0.816 ± 0.020 % | 5.073 ± 0.058 % |
| 5.30 | 99.48% | 1.050294 ± 0.001394 | 1.484259 ± 0.045502 | 0.042079 ± 0.000276 | 4.073319 | 0.865040 | -0.599 ± 0.019 % | 4.851 ± 0.055 % |
| 5.50 | 99.69% | 1.031403 ± 0.001050 | 0.926770 ± 0.033149 | 0.025594 ± 0.000166 | 2.991449 | 0.464138 | -0.443 ± 0.015 % | 3.851 ± 0.043 % |
| 5.75 | 99.76% | 1.029774 ± 0.000931 | 0.878684 ± 0.029887 | 0.020243 ± 0.000123 | 1.438466 | 0.375554 | -0.376 ± 0.014 % | 3.398 ± 0.037 % |
| 5.80 | 99.76% | 1.028585 ± 0.000921 | 0.843599 ± 0.029546 | 0.019801 ± 0.000114 | 1.265909 | 0.345507 | -0.351 ± 0.013 % | 3.350 ± 0.035 % |
| 6.00 | 99.77% | 1.027145 ± 0.000907 | 0.801091 ± 0.028947 | 0.018648 ± 0.000133 | 2.043392 | 0.375141 | -0.285 ± 0.013 % | 3.258 ± 0.041 % |
| 6.25 | 99.82% | 1.024318 ± 0.000804 | 0.717663 ± 0.025706 | 0.014839 ± 0.000099 | 1.785646 | 0.298516 | -0.284 ± 0.011 % | 2.874 ± 0.036 % |
| 6.30 | 99.82% | 1.023706 ± 0.000792 | 0.699616 ± 0.025285 | 0.014302 ± 0.000095 | 1.709756 | 0.276922 | -0.259 ± 0.011 % | 2.805 ± 0.035 % |
| 6.50 | 99.89% | 1.016242 ± 0.000610 | 0.479325 ± 0.019372 | 0.008103 ± 0.000062 | 1.610427 | 0.162712 | -0.127 ± 0.009 % | 2.133 ± 0.029 % |
| 6.75 | 99.92% | 1.013618 ± 0.000516 | 0.401886 ± 0.016359 | 0.005670 ± 0.000041 | 1.083693 | 0.092621 | -0.094 ± 0.007 % | 1.774 ± 0.023 % |
| 6.80 | 99.93% | 1.012755 ± 0.000512 | 0.376414 ± 0.016089 | 0.005588 ± 0.000041 | 1.147180 | 0.094581 | -0.095 ± 0.007 % | 1.770 ± 0.023 % |
| 7.00 | 99.93% | 1.011247 ± 0.000477 | 0.331926 ± 0.014920 | 0.004759 ± 0.000029 | 0.523031 | 0.078734 | -0.061 ± 0.007 % | 1.635 ± 0.018 % |
| 7.25 | 99.94% | 1.008928 ± 0.000447 | 0.263474 ± 0.013783 | 0.004105 ± 0.000026 | 0.582664 | 0.065920 | -0.055 ± 0.006 % | 1.515 ± 0.020 % |
| 7.30 | 99.94% | 1.008643 ± 0.000448 | 0.255083 ± 0.013766 | 0.004053 ± 0.000027 | 0.765216 | 0.060921 | -0.062 ± 0.006 % | 1.500 ± 0.022 % |
| 7.50 | 99.94% | 1.010851 ± 0.000454 | 0.320225 ± 0.014246 | 0.004111 ± 0.000026 | 0.404290 | 0.067985 | -0.076 ± 0.006 % | 1.461 ± 0.017 % |
| 7.75 | 99.95% | 1.008024 ± 0.000422 | 0.236818 ± 0.012907 | 0.003528 ± 0.000022 | 0.408612 | 0.063072 | -0.082 ± 0.005 % | 1.355 ± 0.017 % |
| 7.80 | 99.95% | 1.008085 ± 0.000419 | 0.238604 ± 0.012823 | 0.003465 ± 0.000023 | 0.528191 | 0.062764 | -0.087 ± 0.005 % | 1.353 ± 0.021 % |
| 8.00 | 99.96% | 1.005559 ± 0.000390 | 0.164064 ± 0.011777 | 0.002856 ± 0.000016 | 0.220663 | 0.044699 | -0.052 ± 0.005 % | 1.206 ± 0.014 % |
| 8.25 | 99.96% | 1.004695 ± 0.000363 | 0.138546 ± 0.010916 | 0.002411 ± 0.000016 | 0.373060 | 0.040841 | -0.052 ± 0.004 % | 1.109 ± 0.013 % |
| 8.30 | 99.96% | 1.003806 ± 0.000350 | 0.112314 ± 0.010490 | 0.002171 ± 0.000015 | 0.265114 | 0.039519 | -0.050 ± 0.004 % | 1.068 ± 0.015 % |
| 8.50 | 99.97% | 1.002709 ± 0.000320 | 0.079948 ± 0.009532 | 0.001665 ± 0.000012 | 0.213020 | 0.031954 | -0.047 ± 0.004 % | 0.949 ± 0.014 % |
| 8.75 | 99.98% | 1.002058 ± 0.000278 | 0.060726 ± 0.008256 | 0.001066 ± 0.000006 | 0.129191 | 0.015715 | -0.026 ± 0.003 % | 0.759 ± 0.008 % |
| 8.80 | 99.98% | 1.001981 ± 0.000279 | 0.058457 ± 0.008289 | 0.001061 ± 0.000006 | 0.084854 | 0.016091 | -0.024 ± 0.003 % | 0.757 ± 0.008 % |
| 9.00 | 99.98% | 1.001800 ± 0.000278 | 0.053131 ± 0.008258 | 0.001039 ± 0.000006 | 0.098692 | 0.014809 | -0.016 ± 0.003 % | 0.748 ± 0.008 % |
| 9.25 | 99.98% | 1.002021 ± 0.000273 | 0.059643 ± 0.008124 | 0.001001 ± 0.000005 | 0.113834 | 0.013527 | -0.017 ± 0.003 % | 0.729 ± 0.007 % |
| 9.30 | 99.98% | 1.001910 ± 0.000272 | 0.056366 ± 0.008083 | 0.000991 ± 0.000005 | 0.065356 | 0.013138 | -0.020 ± 0.003 % | 0.723 ± 0.007 % |
| 9.50 | 99.98% | 1.001854 ± 0.000270 | 0.054706 ± 0.008044 | 0.000972 ± 0.000005 | 0.079184 | 0.013451 | -0.015 ± 0.003 % | 0.712 ± 0.007 % |
| 9.75 | 99.98% | 1.001944 ± 0.000265 | 0.057374 ± 0.007900 | 0.000890 ± 0.000005 | 0.050911 | 0.012873 | -0.008 ± 0.003 % | 0.682 ± 0.007 % |
| 9.80 | 99.98% | 1.001689 ± 0.000266 | 0.049859 ± 0.007921 | 0.000893 ± 0.000005 | 0.102190 | 0.012910 | -0.005 ± 0.003 % | 0.690 ± 0.008 % |
| 10.00 | 99.98% | 1.001807 ± 0.000263 | 0.053316 ± 0.007831 | 0.000857 ± 0.000005 | 0.090375 | 0.011359 | -0.008 ± 0.003 % | 0.667 ± 0.007 % |
| 10.25 | 99.98% | 1.001932 ± 0.000260 | 0.057012 ± 0.007755 | 0.000826 ± 0.000004 | 0.057231 | 0.011596 | -0.008 ± 0.003 % | 0.658 ± 0.007 % |
| 10.30 | 99.98% | 1.001970 ± 0.000260 | 0.058130 ± 0.007752 | 0.000809 ± 0.000004 | 0.095869 | 0.010581 | -0.004 ± 0.003 % | 0.649 ± 0.006 % |
| 10.50 | 99.98% | 1.001825 ± 0.000257 | 0.053865 ± 0.007652 | 0.000778 ± 0.000004 | 0.089146 | 0.010600 | -0.008 ± 0.003 % | 0.640 ± 0.006 % |
| 10.75 | 99.98% | 1.001755 ± 0.000253 | 0.051794 ± 0.007546 | 0.000735 ± 0.000004 | 0.082066 | 0.009425 | -0.010 ± 0.003 % | 0.625 ± 0.007 % |
| 10.80 | 99.98% | 1.001747 ± 0.000253 | 0.051550 ± 0.007534 | 0.000742 ± 0.000004 | 0.114091 | 0.009784 | -0.008 ± 0.003 % | 0.627 ± 0.007 % |
| 11.00 | 99.98% | 1.001871 ± 0.000254 | 0.055222 ± 0.007565 | 0.000734 ± 0.000006 | 0.309870 | 0.009875 | -0.007 ± 0.003 % | 0.627 ± 0.010 % |
| 11.25 | 99.98% | 1.001985 ± 0.000251 | 0.058592 ± 0.007481 | 0.000706 ± 0.000004 | 0.054151 | 0.009861 | -0.007 ± 0.002 % | 0.611 ± 0.006 % |
| 11.30 | 99.98% | 1.001820 ± 0.000251 | 0.053715 ± 0.007489 | 0.000697 ± 0.000004 | 0.094932 | 0.009620 | -0.007 ± 0.002 % | 0.610 ± 0.008 % |
| 11.50 | 99.98% | 1.001698 ± 0.000249 | 0.050126 ± 0.007425 | 0.000681 ± 0.000003 | 0.055867 | 0.009582 | -0.008 ± 0.002 % | 0.601 ± 0.006 % |
| 11.75 | 99.98% | 1.001800 ± 0.000248 | 0.053135 ± 0.007380 | 0.000655 ± 0.000003 | 0.061040 | 0.008505 | -0.005 ± 0.002 % | 0.592 ± 0.007 % |
| 11.80 | 99.98% | 1.001737 ± 0.000248 | 0.051273 ± 0.007370 | 0.000663 ± 0.000003 | 0.055573 | 0.009097 | -0.008 ± 0.002 % | 0.591 ± 0.006 % |
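For context on how table metrics of this kind are derived: the KLD and Δp columns compare each quantized model's per-token probability distribution against the full-precision baseline over an evaluation set. A minimal sketch of the per-token math, using a tiny made-up vocabulary; the function and variable names are my own illustration, not the evaluation tool's API:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_metrics(base_logits, quant_logits, correct_token):
    """Per-token KL divergence (baseline vs quantized) and Δp: the change
    in probability of the correct token, in percentage points."""
    p = softmax(base_logits)   # full-precision distribution
    q = softmax(quant_logits)  # quantized distribution
    kld = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    delta_p = (q[correct_token] - p[correct_token]) * 100.0
    return kld, delta_p

# Identical logits: KLD is 0 and the correct token loses no probability.
kld, dp = token_metrics([2.0, 1.0, 0.5], [2.0, 1.0, 0.5], correct_token=0)

# A slightly perturbed "quantized" distribution: KLD > 0, and the correct
# token's probability drops, so Δp is negative.
kld2, dp2 = token_metrics([2.0, 1.0, 0.5], [1.9, 1.1, 0.5], correct_token=0)
```

"Mean KLD" in the table is the average of such per-token KLD values over the evaluation tokens, and "RMS Δp" is the root-mean-square of the per-token Δp values.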
Downloads last month: 18,854
Format: GGUF
Model size: 0.8B params
Architecture: qwen35

Model tree for ENOSYS/Qwen3.5-0.8B-heretic-250-v1: quantized from coder3101/Qwen3.5-0.8B-heretic.
