Boosting Generative Image Modeling via Joint Image-Feature Synthesis

ReDi learns to generate coherent image-feature pairs from pure noise, significantly enhancing both generative quality and training efficiency.

This model uses SiT as the base model. We train for 4M steps with a batch size of 256 on ImageNet 256x256.
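To put the training budget in perspective, the arithmetic below converts the stated 4M steps at batch size 256 into samples seen and approximate passes over the data. This is a rough sketch: it assumes training used the standard ImageNet-1k training split (1,281,167 images), which this card does not state explicitly, and the variable names are illustrative only.

```python
# Figures taken from the model card.
TRAIN_STEPS = 4_000_000
BATCH_SIZE = 256

# Assumption: the standard ImageNet-1k (ILSVRC-2012) training split size.
IMAGENET_TRAIN_IMAGES = 1_281_167

# Total number of training samples processed over the whole run.
samples_seen = TRAIN_STEPS * BATCH_SIZE

# Approximate number of passes over the training set.
approx_epochs = samples_seen / IMAGENET_TRAIN_IMAGES

if __name__ == "__main__":
    print(f"samples seen: {samples_seen:,}")
    print(f"approximate epochs: {approx_epochs:.1f}")
```

Under that assumption, the run corresponds to a little under 800 epochs.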

Generative performance on the ImageNet validation set.

Model	FID	SFID	IS	Prec	Rec
SiT-XL/2 w/ ReDi	1.64	4.63	289.3	0.65	0.77

Sample Usage

You can sample from our pre-trained ReDi models with sample.py.

python sample.py SDE --image-size 256 --seed 42 --ckpt /path/to/ckpt
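If you want to sample with several seeds, a small wrapper can assemble the command for each one. The sketch below is only an illustration: the helper `build_sample_command` is hypothetical and not part of the ReDi repository, and it uses only the arguments shown in the command above (`SDE`, `--image-size`, `--seed`, `--ckpt`). It prints the commands rather than running them, so nothing is assumed about the script's output.

```python
import shlex


def build_sample_command(ckpt, seed=42, image_size=256, sampler="SDE"):
    """Hypothetical helper: build the argv list for the sample.py call above."""
    return [
        "python", "sample.py", sampler,
        "--image-size", str(image_size),
        "--seed", str(seed),
        "--ckpt", str(ckpt),
    ]


if __name__ == "__main__":
    # Print one command per seed; /path/to/ckpt is the placeholder from the card.
    for seed in (0, 1, 2):
        print(shlex.join(build_sample_command("/path/to/ckpt", seed=seed)))
```

To actually launch a run, the returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains sample.py.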
