Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Paper | Project Page | Code
ReDi learns to generate coherent image-feature pairs from pure noise, significantly enhancing both generative quality and training efficiency.
Model Description
This model uses SiT as the base model. We train for 4M steps with a batch size of 256 on ImageNet 256x256.
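To put the training budget in perspective, the short calculation below estimates how many samples the model sees. It is a rough sketch under assumptions that are not stated in the card: "4M" is read as exactly 4,000,000 steps, the batch size of 256 is taken to be the global batch size, and the dataset is taken to be the ImageNet-1k training split of 1,281,167 images.

```python
# Values stated in the model card ("4M" is read as exactly 4,000,000).
steps = 4_000_000
batch_size = 256  # assumed to be the global batch size

# Assumption: ImageNet-1k training split, 1,281,167 images.
imagenet_train_images = 1_281_167

# Total samples processed and the equivalent number of dataset passes.
samples_seen = steps * batch_size
approx_epochs = samples_seen / imagenet_train_images

print(f"samples seen:   {samples_seen:,}")
print(f"approx. epochs: {approx_epochs:.0f}")
```

Under these assumptions the run processes about 1.02 billion samples, or roughly 800 passes over the training set.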
Metrics
Generative performance on the ImageNet validation set.
| Model | FID | sFID | IS | Precision | Recall |
|---|---|---|---|---|---|
| SiT-XL/2 w/ ReDi | 1.64 | 4.63 | 289.3 | 0.65 | 0.77 |
Sample Usage
You can sample from our pre-trained ReDi models with sample.py.
python sample.py SDE --image-size 256 --seed 42 --ckpt /path/to/ckpt
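If you want to sample with several seeds, one option is to generate the command line programmatically. The sketch below only reuses the arguments shown in the command above (`SDE`, `--image-size`, `--seed`, `--ckpt`); the helper `build_sample_command` is a hypothetical name introduced here and is not part of the repository. The script prints the commands rather than running them.

```python
import shlex


def build_sample_command(ckpt, seed=42, image_size=256, sampler="SDE"):
    """Return the sample.py invocation as an argument list.

    Hypothetical helper: it only assembles the flags shown in this card.
    """
    return [
        "python", "sample.py", sampler,
        "--image-size", str(image_size),
        "--seed", str(seed),
        "--ckpt", ckpt,
    ]


# Build one command per seed; /path/to/ckpt is the placeholder from the card.
commands = [build_sample_command("/path/to/ckpt", seed=s) for s in (0, 1, 42)]

for cmd in commands:
    # shlex.join renders the list as a shell-safe command string.
    print(shlex.join(cmd))
```

To actually launch a run, each list could be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming `sample.py` is located there.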