gghfez committed
Commit ba51a84 · verified · 1 Parent(s): a3cfc70

Add a model card

Adding a quick model card since people are downloading this.

Files changed (1):
  1. README.md +29 -1

README.md CHANGED
@@ -1,4 +1,32 @@
  ---
  base_model:
  - AesSedai/GLM-4.6-REAP-266B-A32B
- ---
  ---
  base_model:
  - AesSedai/GLM-4.6-REAP-266B-A32B
+ - zai-org/GLM-4.6
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ This is a Q4_K_M GGUF quant of [AesSedai/GLM-4.6-REAP-266B-A32B](https://huggingface.co/AesSedai/GLM-4.6-REAP-266B-A32B).
+
+ # What Is This?
+
+ [AesSedai/GLM-4.6-REAP-266B-A32B](https://huggingface.co/AesSedai/GLM-4.6-REAP-266B-A32B) was created using REAP (Router-weighted Expert Activation Pruning), a novel expert-pruning method that selectively removes redundant experts while preserving the router's independent control over the remaining experts.
+
+ See the GLM-4.5-Air version by Cerebras for more details: [cerebras/GLM-4.5-Air-REAP-82B-A12B](https://huggingface.co/cerebras/GLM-4.5-Air-REAP-82B-A12B)
+
+ The MTP tensors were *not* included in this quant (llama.cpp hasn't implemented this feature anyway).
+
+ **Imatrix**
+
+ [GLM-4.6-REAP-266B-A32B-imatrix.dat](https://huggingface.co/gghfez/GLM-4.6-REAP-266B-A32B-Q4_K/resolve/main/GLM-4.6-REAP-266B-A32B-imatrix.dat)
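If you want to re-quantize from a full-precision GGUF of this model using the published imatrix, a sketch with llama.cpp's `llama-quantize` (the input/output filenames here are illustrative, and exact tool paths vary by llama.cpp build):

```shell
# Fetch the published importance matrix.
wget https://huggingface.co/gghfez/GLM-4.6-REAP-266B-A32B-Q4_K/resolve/main/GLM-4.6-REAP-266B-A32B-imatrix.dat

# Quantize an F16 GGUF to Q4_K_M, guided by the imatrix.
./llama-quantize --imatrix GLM-4.6-REAP-266B-A32B-imatrix.dat \
    GLM-4.6-REAP-266B-A32B-F16.gguf \
    GLM-4.6-REAP-266B-A32B-Q4_K_M.gguf \
    Q4_K_M
```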
+
+ # Original Model Card for GLM-4.6-REAP
+
+ Note: currently non-functional because the `mtp.safetensors` file and its entry in `model.safetensors.index.json` are missing.
+
+ Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.
+
+ Produced with:
+ ```
+ bash experiments/pruning-cli.sh 0,1,2,3,4,5,6,7 zai-org/GLM-4.6 reap 42 0.25 theblackcat102/evol-codealpaca-v1 true true true false false
+ ```
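The router-weighted saliency idea behind REAP can be sketched in a few lines (a loose illustration, not the code in the repos above: the function names and exact scoring formula are assumptions; the 0.25 pruning ratio matches the command above):

```python
import numpy as np

def expert_saliency(gate_probs: np.ndarray, expert_norms: np.ndarray) -> np.ndarray:
    """Router-weighted activation score per expert.

    gate_probs:   (n_tokens, n_experts) router probabilities
    expert_norms: (n_tokens, n_experts) magnitude of each expert's output
    Experts the router rarely selects, or whose outputs are small,
    get low saliency and become pruning candidates.
    """
    return (gate_probs * expert_norms).mean(axis=0)

def prune_candidates(saliency: np.ndarray, prune_ratio: float = 0.25) -> np.ndarray:
    """Indices of the lowest-saliency experts to drop."""
    n_prune = int(round(len(saliency) * prune_ratio))
    return np.argsort(saliency)[:n_prune]

# Toy example: 4 experts, expert 3 is almost never routed to.
gates = np.array([[0.6, 0.3, 0.1, 0.0],
                  [0.5, 0.4, 0.1, 0.0],
                  [0.7, 0.2, 0.1, 0.0]])
norms = np.ones_like(gates)
print(prune_candidates(expert_saliency(gates, norms)))  # -> [3]
```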