---
license: llama3.1
library_name: transformers
pipeline_tag: text-generation
tags:
- mxfp4
- vllm
- llmcompressor
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---

# Llama-3.1-8B-Instruct-MR-GPTQ-mxfp

## Model Overview

This model was obtained by quantizing the weights of [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) to the MXFP4 data type. This optimization reduces the number of bits per parameter from 16 to 4.25, cutting disk size and GPU memory requirements by approximately 73%.

## Usage

*MR-GPTQ* quantized models with [QuTLASS](https://github.com/IST-DASLab/qutlass) kernels are supported in the following integrations (illustrative usage sketches are given at the end of this card):

- `transformers`, with these features:
  - Available in `main` ([documentation](https://huggingface.co/docs/transformers/main/en/quantization/fp_quant#fp-quant)).
  - RTN on-the-fly quantization.
  - Pseudo-quantization QAT.
- `vLLM`, with these features:
  - Available in [this PR](https://github.com/vllm-project/vllm/pull/24440).
  - Compatible with real-quantization models produced by `FP-Quant` and the `transformers` integration.

## Evaluation

This model was evaluated on a subset of the OpenLLM v1 benchmarks and on Platinum Bench. Model outputs were generated with the `vLLM` engine.

*OpenLLM v1 results*

| Model | MMLU-CoT | GSM8k | Hellaswag | Winogrande | **Average** | **Recovery (%)** |
|-------|---------:|------:|----------:|-----------:|------------:|-----------------:|
| `meta-llama/Llama-3.1-8B-Instruct` | 0.7276 | 0.8506 | 0.8001 | 0.7790 | 0.7893 | – |
| `ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp` | 0.6754 | 0.7892 | 0.7737 | 0.7324 | 0.7427 | 94.09 |

*Platinum Bench results*

Below we report recoveries on individual tasks as well as the average recovery.

**Recovery by task**

| Task | Recovery (%) |
|------|-------------:|
| SingleOp | 97.94 |
| SingleQ | 95.95 |
| MultiArith | 98.22 |
| SVAMP | 95.08 |
| GSM8K | 93.69 |
| MMLU-Math | 80.54 |
| BBH-LogicalDeduction-3Obj | 89.87 |
| BBH-ObjectCounting | 82.03 |
| BBH-Navigate | 90.66 |
| TabFact | 86.92 |
| HotpotQA | 96.81 |
| SQuAD | 98.46 |
| DROP | 94.33 |
| Winograd-WSC | 89.47 |
| Average | **92.14** |
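
## Example usage (sketches)

The snippets below are illustrative sketches rather than verified instructions. They assume the quantized checkpoint is published as `ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp` (the identifier used in the evaluation table above) and that a `transformers` build with the FP-Quant integration and the QuTLASS kernels is installed.

Loading and running the checkpoint with `transformers` through the standard `AutoModelForCausalLM` API; the quantization configuration is expected to be read from the checkpoint itself:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model identifier, taken from the evaluation table in this card.
model_id = "ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # compute dtype for the non-quantized parts
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain MXFP4 quantization in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```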
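
Running the same checkpoint with `vLLM` offline inference, assuming a vLLM build that includes the MR-GPTQ/QuTLASS support referenced in the PR above:

```python
from vllm import LLM, SamplingParams

# Assumed model identifier, as above.
model_id = "ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp"

llm = LLM(model=model_id)
sampling_params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=256)

conversation = [
    {"role": "user", "content": "Give one advantage of 4-bit weight quantization."},
]
outputs = llm.chat(conversation, sampling_params)
print(outputs[0].outputs[0].text)
```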
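
The card does not state the exact evaluation harness. One common way to obtain OpenLLM-v1-style numbers with the `vLLM` engine is `lm-evaluation-harness`; the sketch below is written under that assumption, and the task variants and few-shot settings may differ from those actually used:

```python
import lm_eval

# Assumption: lm-evaluation-harness with the vLLM backend. The exact harness,
# task variants, and few-shot settings behind the numbers above are not specified.
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp,dtype=auto",
    tasks=["gsm8k", "hellaswag", "winogrande"],
    batch_size="auto",
)
print(results["results"])
```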