mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-4bit_g64-HQQ
4-bit and 2-bit Mixtral models quantized with HQQ (https://github.com/mobiusml/hqq).
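As a rough illustration of what "4-bit, group size 64" quantization means, here is a minimal NumPy sketch of group-wise asymmetric 4-bit quantization. Note this is only the basic round-to-nearest scheme; HQQ itself additionally refines the scale/zero-point parameters with a half-quadratic optimization, which is omitted here. All function names below are illustrative, not part of the HQQ API.

```python
import numpy as np

def quantize_4bit_g64(w, group_size=64):
    # Split the flat weight tensor into groups of 64 values,
    # then quantize each group independently to 4 bits (16 levels).
    w = w.reshape(-1, group_size)
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / 15.0                # map group range onto levels 0..15
    scale = np.where(scale == 0, 1.0, scale)    # guard against constant groups
    q = np.clip(np.round((w - wmin) / scale), 0, 15).astype(np.uint8)
    return q, scale, wmin

def dequantize(q, scale, wmin):
    # Reconstruct approximate float weights from the 4-bit codes.
    return q.astype(np.float32) * scale + wmin

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
q, scale, zero = quantize_4bit_g64(w)
w_hat = dequantize(q, scale, zero)
print("max abs reconstruction error:", np.abs(w.reshape(-1, 64) - w_hat).max())
```

Because each group stores its own scale and zero point, the worst-case rounding error per weight is bounded by half that group's scale, which is why smaller group sizes (like g64) trade a little extra memory for better accuracy.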
Note: if you are considering the 2-bit instruct model, use this one.
Note: if you are considering the 2-bit base model, use this one.
Note: if you are considering the 2-bit base model but are GPU-poor, this is a good option: it requires 13GB of RAM, but it will be slower.