MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization
Mingkai Jia1,2, Wei Yin2*§, Xiaotao Hu1,2, Jiaxin Guo3, Xiaoyang Guo2
Qian Zhang2, Xiao-Xiao Long4, Ping Tan1
HKUST1, Horizon Robotics2, CUHK3, NJU4
* Corresponding Author, § Project Leader

News
[August 2025] Achieved SOTA on the Papers with Code leaderboards for image reconstruction on ImageNet and UHDBench.

[August 2025] Released inference code.

[August 2025] Released model zoo.

[August 2025] Released UHDBench, our proposed super-resolution image reconstruction dataset for evaluating ultra-high-definition image reconstruction.

[July 2025] Released paper.
TO DO LIST
- Training code.
- More demos.
- Models & Evaluation code.
- Huggingface models.
- Release zero-shot reconstruction benchmarks.
Model Zoo
| Model | Downsample | Groups | Codebook Size | Training Data | Link |
|---|---|---|---|---|---|
| mgvq-f8c32-g4 | 8 | 4 | 32768 | imagenet | link |
| mgvq-f8c32-g8 | 8 | 8 | 16384 | imagenet | link |
| mgvq-f16c32-g4 | 16 | 4 | 32768 | imagenet | link |
| mgvq-f16c32-g8 | 16 | 8 | 16384 | imagenet | link |
| mgvq-f16c32-g4-mix | 16 | 4 | 32768 | mix | link |
| mgvq-f32c32-g8-mix | 32 | 8 | 16384 | mix | link |
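
The model names in the table appear to encode the Downsample and Groups columns: `f` is followed by the downsample factor and `g` by the number of groups. The sketch below reads those two values from a name and computes the latent grid for a given image size. `parse_model_name` and `latent_grid` are hypothetical helpers written for illustration, not part of the MGVQ code. The `c32` part of the name is not explained in this README and is ignored, and the sketch assumes image sides that are exact multiples of the downsample factor.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TokenizerSpec:
    downsample: int  # spatial downsample factor ("f" in the model name)
    groups: int      # number of quantization groups ("g" in the model name)


def parse_model_name(name: str) -> TokenizerSpec:
    """Read the downsample factor and group count from a Model Zoo name."""
    # The "c<digits>" field is matched but not interpreted (assumption: it is
    # irrelevant to the grid size computed below).
    match = re.fullmatch(r"mgvq-f(\d+)c\d+-g(\d+)(?:-mix)?", name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return TokenizerSpec(downsample=int(match.group(1)), groups=int(match.group(2)))


def latent_grid(width: int, height: int, spec: TokenizerSpec) -> Tuple[int, int]:
    """Return the latent grid (width, height) for an evenly divisible image."""
    if width % spec.downsample or height % spec.downsample:
        raise ValueError("image sides must be multiples of the downsample factor")
    return width // spec.downsample, height // spec.downsample


spec = parse_model_name("mgvq-f16c32-g4-mix")
print(spec.downsample, spec.groups)   # 16 4
print(latent_grid(2560, 1440, spec))  # (160, 90)
print(latent_grid(256, 256, spec))    # (16, 16)
```

For example, a 16x model maps a 2560 by 1440 UHDBench image to a 160 by 90 latent grid.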
Quick Start
Installation
git clone https://github.com/MKJia/MGVQ.git
cd MGVQ
pip3 install -r requirements.txt
Download models
Download the pretrained models from our model zoo to /path/to/your/ckpt.
Data Preparation
Download our UHDBench dataset from Hugging Face to /path/to/your/dataset.
Evaluation on Reconstruction
Remember to change the ckpt and dataset_root paths, and make sure you are evaluating the intended model on the intended dataset.
cd evaluation
bash eval_recon.sh
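
A quick preflight check can catch a wrong ckpt or dataset_root before a long evaluation run starts. The following is a minimal sketch, assuming only that the checkpoint path must exist and that the dataset root is a non-empty directory. `check_eval_paths` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path
from typing import List


def check_eval_paths(ckpt: str, dataset_root: str) -> List[str]:
    """Return a list of problems found; an empty list means both paths look usable."""
    problems = []
    if not Path(ckpt).exists():
        problems.append(f"checkpoint not found: {ckpt}")
    root = Path(dataset_root)
    if not root.is_dir():
        problems.append(f"dataset root is not a directory: {dataset_root}")
    elif not any(root.iterdir()):
        problems.append(f"dataset root is empty: {dataset_root}")
    return problems


if __name__ == "__main__":
    # Replace the placeholders with the paths you set for the evaluation script.
    for issue in check_eval_paths("/path/to/your/ckpt", "/path/to/your/dataset"):
        print(issue)
```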
Generation Demo & Evaluation
You can download the pretrained GPT model for generation from Hugging Face and test it with our mgvq-f16c32-g4 tokenizer model to sample demo images. Remember to change the gpt_ckpt and vq_ckpt paths.
cd evaluation
bash demo_gen.sh
We also provide an .npz file of samples generated with sample_c2i_ddp.py on Hugging Face for evaluation.
cd evaluation
python3 evaluator.py /path/to/your/VIRTUAL_imagenet256_labeled.npz /path/to/your/GPT_XXL_300ep_topk_12.npz
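
Before running the evaluator, it can help to confirm that both .npz files contain arrays of the shapes you expect. An .npz file is a zip archive of .npy members, so the sketch below reads only each member's header and never loads the array data. `describe_npz` is a hypothetical helper, not part of this repository, and the array names inside the released files are not stated in this README, so none are assumed.

```python
import ast
import struct
import zipfile


def describe_npz(path):
    """Return {array_name: (shape, dtype_descr)} by reading .npy headers only."""
    summary = {}
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            if not member.endswith(".npy"):
                continue
            with archive.open(member) as fh:
                if fh.read(6) != b"\x93NUMPY":
                    raise ValueError(f"{member} is not a valid .npy member")
                major, _minor = fh.read(2)
                # Format 1.x stores the header length in 2 bytes, later versions in 4.
                if major == 1:
                    (header_len,) = struct.unpack("<H", fh.read(2))
                else:
                    (header_len,) = struct.unpack("<I", fh.read(4))
                # The header is a Python dict literal padded with spaces and a newline.
                header = ast.literal_eval(fh.read(header_len).decode("latin1"))
            summary[member[:-4]] = (header["shape"], header["descr"])
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python3 this_script.py reference.npz samples.npz
    for npz_path in sys.argv[1:]:
        print(npz_path, describe_npz(npz_path))
```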
Demos
- Qualitative reconstruction results with 16× downsampling on the 2560×1440 UHDBench dataset.
- Qualitative class-to-image generation on ImageNet. The classes are dog (Golden Retriever and Husky), cliff, and bald eagle.
- Reconstruction evaluation on the 256×256 ImageNet benchmark.
- Zero-shot reconstruction evaluation with a downsample ratio of 16 on 512×512 datasets.
- Zero-shot reconstruction evaluation with a downsample ratio of 16 on 2560×1440 datasets.

Citation
If the paper and code from MGVQ help your research, please cite our paper ❤️. If you find this repository useful, giving it a star ⭐️ would be a wonderful way to support our work. Thank you very much.
@article{jia2025mgvq,
title={MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization},
author={Jia, Mingkai and Yin, Wei and Hu, Xiaotao and Guo, Jiaxin and Guo, Xiaoyang and Zhang, Qian and Long, Xiao-Xiao and Tan, Ping},
journal={arXiv preprint arXiv:2507.07997},
year={2025}
}
License
This repository is released under the MIT License. For further license questions, please contact Mingkai Jia ([email protected]) and Wei Yin ([email protected]).