gghfez committed
Commit ba51a84 · verified · 1 Parent(s): a3cfc70

Add a model card

Adding a quick model card since people are downloading this.

Files changed (1):
  1. README.md +29 -1

README.md CHANGED
@@ -1,4 +1,32 @@
  ---
  base_model:
  - AesSedai/GLM-4.6-REAP-266B-A32B
- ---
  ---
  base_model:
  - AesSedai/GLM-4.6-REAP-266B-A32B
+ - zai-org/GLM-4.6
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ This is a Q4_K_M GGUF quant of [AesSedai/GLM-4.6-REAP-266B-A32B](https://huggingface.co/AesSedai/GLM-4.6-REAP-266B-A32B).
+
+ # What Is This?
+
+ [AesSedai/GLM-4.6-REAP-266B-A32B](https://huggingface.co/AesSedai/GLM-4.6-REAP-266B-A32B) was created using REAP (Router-weighted Expert Activation Pruning), a novel expert-pruning method that selectively removes redundant experts while preserving the router's independent control over the remaining experts.
+
+ See the GLM-4.5-Air version by Cerebras for more details: [cerebras/GLM-4.5-Air-REAP-82B-A12B](https://huggingface.co/cerebras/GLM-4.5-Air-REAP-82B-A12B)
+
+ The MTP tensors were *not* included in this quant (llama.cpp hasn't implemented this feature anyway).
+
+ **Imatrix**
+
+ [GLM-4.6-REAP-266B-A32B-imatrix.dat](https://huggingface.co/gghfez/GLM-4.6-REAP-266B-A32B-Q4_K/resolve/main/GLM-4.6-REAP-266B-A32B-imatrix.dat)
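If you want to re-quantize from a full-precision GGUF of this model using the published imatrix, a sketch with llama.cpp's `llama-quantize` (the input/output filenames here are illustrative, and exact tool paths vary by llama.cpp build):

```shell
# Fetch the published importance matrix.
wget https://huggingface.co/gghfez/GLM-4.6-REAP-266B-A32B-Q4_K/resolve/main/GLM-4.6-REAP-266B-A32B-imatrix.dat

# Quantize an F16 GGUF to Q4_K_M, guided by the imatrix.
./llama-quantize --imatrix GLM-4.6-REAP-266B-A32B-imatrix.dat \
    GLM-4.6-REAP-266B-A32B-F16.gguf \
    GLM-4.6-REAP-266B-A32B-Q4_K_M.gguf \
    Q4_K_M
```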
+
+ # Original Model Card for GLM-4.6-REAP
+
+ Note: currently non-functional because the `mtp.safetensors` file and its entry in `model.safetensors.index.json` are missing.
+
+ Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.
+
+ Produced with:
+ ```
+ bash experiments/pruning-cli.sh 0,1,2,3,4,5,6,7 zai-org/GLM-4.6 reap 42 0.25 theblackcat102/evol-codealpaca-v1 true true true false false
+ ```
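The router-weighted saliency idea behind REAP can be sketched in a few lines (a loose illustration, not the code in the repos above: the function names and exact scoring formula are assumptions; the 0.25 pruning ratio matches the command above):

```python
import numpy as np

def expert_saliency(gate_probs: np.ndarray, expert_norms: np.ndarray) -> np.ndarray:
    """Router-weighted activation score per expert.

    gate_probs:   (n_tokens, n_experts) router probabilities
    expert_norms: (n_tokens, n_experts) magnitude of each expert's output
    Experts the router rarely selects, or whose outputs are small,
    get low saliency and become pruning candidates.
    """
    return (gate_probs * expert_norms).mean(axis=0)

def prune_candidates(saliency: np.ndarray, prune_ratio: float = 0.25) -> np.ndarray:
    """Indices of the lowest-saliency experts to drop."""
    n_prune = int(round(len(saliency) * prune_ratio))
    return np.argsort(saliency)[:n_prune]

# Toy example: 4 experts, expert 3 is almost never routed to.
gates = np.array([[0.6, 0.3, 0.1, 0.0],
                  [0.5, 0.4, 0.1, 0.0],
                  [0.7, 0.2, 0.1, 0.0]])
norms = np.ones_like(gates)
print(prune_candidates(expert_saliency(gates, norms)))  # -> [3]
```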