Request for GLM-4.6-REAP-218B-A32B quantization

#1
by flymyd - opened

On a machine with 4× RTX A6000 (48 GB each), a W4A16-quantized version of GLM-4.6-REAP-218B-A32B would leave more VRAM free and so allow a longer context window. Could you please publish one here for testing and research? Thank you!
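
For a rough sense of why 4-bit weights matter on this setup, here is a back-of-the-envelope sketch. It assumes the 218B in the model name is the total parameter count, counts only the weights, and ignores quantization scales, layers left unquantized, activations, and framework overhead, so the real numbers will be somewhat worse. The function names are made up for illustration.

```python
# Rough VRAM estimate: weights only, decimal GB, overheads ignored (assumption).
TOTAL_PARAMS = 218e9        # assumed from the "218B" in the model name
TOTAL_VRAM_GB = 4 * 48      # four 48 GB cards


def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB at a given bit width."""
    return num_params * bits_per_weight / 8 / 1e9


def headroom_gb(bits_per_weight: float) -> float:
    """VRAM left for KV cache and everything else after loading the weights."""
    return TOTAL_VRAM_GB - weight_footprint_gb(TOTAL_PARAMS, bits_per_weight)


for bits in (16, 8, 4):
    weights = weight_footprint_gb(TOTAL_PARAMS, bits)
    print(f"{bits:>2}-bit weights: ~{weights:.0f} GB, headroom ~{headroom_gb(bits):.0f} GB")
```

Under these assumptions only the 4-bit weights fit in 192 GB with space left over, and that leftover space is what the KV cache (and so the context length) has to live in.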
