---
base_model:
- meta-llama/Llama-3.2-1B
base_model_relation: quantized
license: llama3.2
---
# Model Card

- Base model: `meta-llama/Llama-3.2-1B`
- Quantization method: Memory-constrained MSQ with Q-Palette
- Target bit-width: 3.25
- Backend kernel: Q-Palette kernel
- Calibration data: N/A (data-free)
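
As a rough guide to what the target bit-width means for storage, the sketch below estimates the size of the quantized weights. It assumes the 3.25 figure is an average number of bits per quantized weight and ignores any metadata (codebooks, scales) and any unquantized tensors. The function names and the example weight count are hypothetical and are not taken from the Q-Palette code or from this checkpoint.

```python
def quantized_weight_bytes(num_weights: int, bits_per_weight: float) -> float:
    """Approximate bytes needed to store num_weights at an average bit-width.

    Ignores quantizer metadata and any tensors left unquantized.
    """
    return num_weights * bits_per_weight / 8


def compression_ratio(bits_per_weight: float, baseline_bits: float = 16.0) -> float:
    """Size reduction relative to a baseline format (16-bit by default)."""
    return baseline_bits / bits_per_weight


if __name__ == "__main__":
    # Hypothetical weight count, for illustration only; substitute the
    # number of weights that are actually quantized in the model.
    n = 1_000_000_000
    size_mb = quantized_weight_bytes(n, 3.25) / 1e6
    print(f"approx. quantized size: {size_mb:.2f} MB")
    print(f"approx. ratio vs 16-bit: {compression_ratio(3.25):.2f}x")
```

The real on-disk and in-memory size depends on which layers are quantized and on the kernel's storage format, so treat the result as an estimate only.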

# How to run
- Follow the instructions in the Q-Palette repository: https://github.com/snu-mllab/Q-Palette

# References
- [Model Paper](https://arxiv.org/abs/2509.20214)