Experimental global target bits‑per‑weight quantization of Octen/Octen-Embedding-8B

  • Using a non-standard (forked) llama.cpp branch for quantization.
  • Using a CLI tool to build KLD evaluation and imatrix calibration datasets for GGUF models, sourced from eaddario/imatrix-calibration.
  • Using dataset sources: text_en, text_ru.
  • Using dataset chunks: 750.
  • Small set of patches added.
  • Tensor quantization uses F16 instead of BF16 source weights, which is friendlier to NVIDIA Pascal GPUs such as the P100 (no native BF16 support).
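Since the card sweeps a global bits-per-weight budget, each BPW target maps directly to an approximate on-disk size. A minimal sketch of that arithmetic (the 8B parameter count is taken from this card; GGUF metadata and per-tensor overhead are ignored, so real files will be somewhat larger):

```python
# Rough file-size estimate for a global bits-per-weight (BPW) target.
# Assumption: 8e9 parameters (the "8B params" reported on this card);
# this ignores GGUF metadata overhead, so treat the result as a lower bound.

def estimated_size_gib(n_params: float, bpw: float) -> float:
    """Bytes = params * bpw / 8; GiB = bytes / 2**30."""
    return n_params * bpw / 8 / 2**30

for bpw in (3.5, 4.5, 8.0, 16.0):
    print(f"{bpw:5.1f} BPW -> {estimated_size_gib(8e9, bpw):6.2f} GiB")
```

Doubling the BPW target doubles the estimate, so the table's 3.50–15.00 BPW range spans roughly a 4x spread in file size.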

Many thanks to Ed Addario for his impressive work.

Quantization comparison

| BPW/TGS | PPL correlation | PPL mean | ΔPPL ratio | Mean KLD | Maximum KLD | 99.9% KLD | Mean Δp | RMS Δp |
|---|---|---|---|---|---|---|---|---|
| 3.50 | 31.88% | 163.312009 ± 2.603029 | 1316751250.467953 ± 22175721.097179 | 0.865687 ± 0.002557 | 40.508530 | 13.199468 | -0.000 ± 0.001 % | 0.240 ± 0.035 % |
| 4.00 | 27.97% | 894.496097 ± 17.658576 | 7248459977.767529 ± 148545165.662543 | 0.462401 ± 0.002284 | 56.080093 | 14.465672 | -0.001 ± 0.000 % | 0.147 ± 0.022 % |
| 4.50 | 29.87% | 536.818391 ± 10.018571 | 4346810434.048196 ± 84695221.429227 | 0.247896 ± 0.001484 | 39.661839 | 8.264471 | 0.000 ± 0.000 % | 0.130 ± 0.018 % |
| 5.00 | 30.26% | 448.634201 ± 8.243512 | 3631418870.079112 ± 69768738.026683 | 0.189924 ± 0.001342 | 36.415863 | 7.533142 | -0.000 ± 0.000 % | 0.097 ± 0.016 % |
| 5.50 | 29.94% | 475.154499 ± 8.855907 | 3846563982.921083 ± 74878878.690613 | 0.175310 ± 0.001361 | 33.947906 | 7.652236 | -0.000 ± 0.000 % | 0.132 ± 0.037 % |
| 6.00 | 30.18% | 535.916733 ± 10.050226 | 4339495766.813949 ± 85010916.130295 | 0.093320 ± 0.000936 | 31.576149 | 5.749511 | -0.000 ± 0.000 % | 0.085 ± 0.019 % |
| 6.50 | 30.33% | 513.696057 ± 9.587249 | 4159231201.265534 ± 81127272.345352 | 0.076551 ± 0.000824 | 33.049152 | 5.351891 | -0.000 ± 0.000 % | 0.049 ± 0.009 % |
| 7.00 | 30.48% | 487.499691 ± 9.042077 | 3946713977.342775 ± 76548947.044238 | 0.069732 ± 0.000789 | 31.959265 | 5.224213 | -0.000 ± 0.000 % | 0.060 ± 0.014 % |
| 7.50 | 30.45% | 485.864997 ± 9.013780 | 3933452574.640460 ± 76305069.637214 | 0.066390 ± 0.000758 | 27.289934 | 5.135751 | -0.000 ± 0.000 % | 0.049 ± 0.009 % |
| 8.00 | 30.56% | 480.323684 ± 8.884633 | 3888498839.452749 ± 75233350.710666 | 0.064214 ± 0.000758 | 22.896826 | 4.984374 | 0.000 ± 0.000 % | 0.045 ± 0.006 % |
| 8.50 | 30.59% | 468.816726 ± 8.658897 | 3795148991.398126 ± 73329353.181311 | 0.063394 ± 0.000767 | 27.460506 | 4.974599 | -0.000 ± 0.000 % | 0.039 ± 0.005 % |
| 9.00 | 30.59% | 472.128288 ± 8.725107 | 3822013941.457990 ± 73888001.523125 | 0.061325 ± 0.000754 | 26.071749 | 4.991961 | 0.000 ± 0.000 % | 0.032 ± 0.004 % |
| 9.50 | 30.57% | 477.493384 ± 8.834961 | 3865538118.737411 ± 74813478.721664 | 0.061779 ± 0.000778 | 28.092499 | 5.049781 | -0.000 ± 0.000 % | 0.038 ± 0.006 % |
| 10.00 | 30.58% | 473.251580 ± 8.749327 | 3831126611.272995 ± 74092083.168597 | 0.060787 ± 0.000750 | 27.365194 | 5.046810 | -0.000 ± 0.000 % | 0.032 ± 0.004 % |
| 10.50 | 30.58% | 473.369865 ± 8.754410 | 3832086195.704186 ± 74134265.313673 | 0.061487 ± 0.000778 | 29.115179 | 5.049273 | -0.000 ± 0.000 % | 0.031 ± 0.005 % |
| 11.00 | 30.58% | 469.947653 ± 8.686996 | 3804323606.512961 ± 73563714.202142 | 0.060947 ± 0.000761 | 26.897139 | 4.949221 | -0.000 ± 0.000 % | 0.032 ± 0.004 % |
| 11.50 | 30.59% | 469.702016 ± 8.680818 | 3802330885.149252 ± 73513517.264363 | 0.060967 ± 0.000756 | 24.905037 | 4.991287 | -0.000 ± 0.000 % | 0.042 ± 0.006 % |
| 12.00 | 30.59% | 469.007636 ± 8.666011 | 3796697743.108781 ± 73388654.821674 | 0.060841 ± 0.000757 | 29.013231 | 4.902389 | -0.000 ± 0.000 % | 0.034 ± 0.004 % |
| 12.50 | 30.60% | 468.247009 ± 8.650971 | 3790527181.157271 ± 73262486.731016 | 0.061428 ± 0.000774 | 26.518728 | 5.096298 | -0.000 ± 0.000 % | 0.039 ± 0.007 % |
| 13.00 | 30.59% | 468.485073 ± 8.656076 | 3792458472.236744 ± 73305035.806184 | 0.060770 ± 0.000756 | 27.815191 | 4.977703 | -0.000 ± 0.000 % | 0.040 ± 0.006 % |
| 13.50 | 30.60% | 468.608802 ± 8.658301 | 3793462215.247329 ± 73324801.786474 | 0.060845 ± 0.000748 | 25.343117 | 5.012136 | -0.000 ± 0.000 % | 0.034 ± 0.006 % |
| 14.00 | 30.59% | 470.353813 ± 8.694041 | 3807618563.064641 ± 73625192.396193 | 0.060969 ± 0.000763 | 27.384163 | 5.017433 | 0.000 ± 0.000 % | 0.033 ± 0.004 % |
| 14.50 | 30.59% | 469.238406 ± 8.669486 | 3798569859.379515 ± 73417558.393644 | 0.060245 ± 0.000763 | 25.959768 | 4.983978 | 0.000 ± 0.000 % | 0.030 ± 0.004 % |
| 15.00 | 30.59% | 470.262969 ± 8.688094 | 3806881593.724875 ± 73576296.537943 | 0.060078 ± 0.000773 | 29.312548 | 5.103179 | 0.000 ± 0.000 % | 0.029 ± 0.004 % |
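The KLD and Δp columns compare the quantized model's per-token output distribution against the full-precision baseline, the kind of statistics llama.cpp's `llama-perplexity --kl-divergence` run reports. A minimal sketch of what the per-token quantities measure, using made-up logits rather than real model output:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p, q):
    """KL(p || q): extra nats needed to encode tokens from p using q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for one token position: baseline vs quantized model.
p = softmax([2.0, 1.0, 0.1])   # full-precision (e.g. F16) distribution
q = softmax([1.9, 1.1, 0.1])   # quantized distribution

print(f"KLD (p vs p) = {kl_divergence(p, p):.6f}")  # identical -> 0
print(f"KLD (p vs q) = {kl_divergence(p, q):.6f}")
# Δp: change in probability assigned to the reference token (index 0 here);
# the table averages this signed difference (Mean Δp) and its RMS (RMS Δp).
print(f"Delta p      = {q[0] - p[0]:+.6f}")
```

In the table, Mean KLD near zero means the quantized model's token distributions are nearly indistinguishable from the baseline, while Maximum and 99.9% KLD expose the worst-case token positions.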
Downloads last month: 12,336

Format: GGUF · Model size: 8B params · Architecture: qwen3
Model tree for ENOSYS/Octen-Embedding-8B-750-v1-GGUF: quantized variants (6).