SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model

Paper: SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model | Code: https://github.com/xiechun298/SV-DRR

Abstract

X-ray imaging is a rapid and cost-effective tool for visualizing internal human anatomy. While multi-view X-ray imaging provides complementary information that enhances diagnosis, intervention, and education, acquiring images from multiple angles increases radiation exposure and complicates clinical workflows. To address these challenges, we propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images from a single view. Unlike prior methods, which are limited in angular range, resolution, and image quality, our approach leverages the Diffusion Transformer to preserve fine details and employs a weak-to-strong training strategy for stable high-resolution image generation. Experimental results demonstrate that our method generates higher-resolution outputs with improved control over viewing angles. This capability has significant implications not only for clinical applications but also for medical education and data extension, enabling the creation of diverse, high-quality datasets for training and analysis.

TL;DR

We propose a novel view-conditioned diffusion model for synthesizing multi-view X-ray images up to 1024x1024 resolution from a single view.

Visual Comparison with SOTA Methods

DRR vs. SV-DRR

The name SV-DRR, short for Single-View DRR, is inspired by Digitally Reconstructed Radiography (DRR).

Unlike DRR, which renders X-ray projections from a 3D CT volume, our method synthesizes novel views directly from a single 2D projection.
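
For readers unfamiliar with DRR, the idea of rendering a projection from a CT volume can be sketched in a few lines. The example below is a simplified illustration only, not part of SV-DRR: it assumes a parallel beam along one axis and a toy volume of attenuation coefficients, whereas practical DRR renderers cast diverging rays from a point source. The function name `parallel_beam_drr` and the toy data are hypothetical.

```python
import math


def parallel_beam_drr(volume, spacing_mm=1.0):
    """Project a volume of attenuation coefficients (1/mm), indexed
    volume[z][y][x], along the y axis.

    Returns image[z][x] holding the transmitted intensity fraction I / I0.
    This is a parallel-beam simplification chosen for illustration.
    """
    image = []
    for slab in volume:  # one z slice, rows indexed by y
        width = len(slab[0])
        row = []
        for x in range(width):
            # Line integral of attenuation along the ray through column x.
            path_integral = sum(slab[y][x] for y in range(len(slab))) * spacing_mm
            # Beer-Lambert law: I / I0 = exp(-integral of mu along the ray).
            row.append(math.exp(-path_integral))
        image.append(row)
    return image


# Toy 2 x 2 x 2 volume (hypothetical values): dense left column, empty right column.
toy_volume = [
    [[0.5, 0.0], [0.5, 0.0]],
    [[0.5, 0.0], [0.5, 0.0]],
]
drr = parallel_beam_drr(toy_volume)
```

Rays through the dense column are attenuated, while rays through empty space keep their full intensity. SV-DRR needs no such volume: it takes a single 2D projection as input.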

Usage

🚀 Quick Start

🛠️ Environment Setup

To ensure compatibility and reproducibility, follow these steps to set up the environment:

Clone the Repository:

git clone https://github.com/xiechun-tsukuba/svdrr.git
cd svdrr

Create the Conda Environment:
```
conda env create -f environment.yaml
```

⏬ Download Pretrained Models

You can download the pretrained models in either of two ways:

Option 1: Automated Download (Recommended)

python scripts/download_models.py

This will download all models into the models/ directory. Shared components will be stored in the shared/ folder, and symbolic links will be created in each model folder accordingly.

Option 2: Manual Download from Hugging Face

256 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-256
512 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-512
1024 resolution: https://huggingface.co/xiechun-tsukuba/svdrr-dit-fb-1024
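
As one possible way to script the manual download, the sketch below uses `snapshot_download` from the `huggingface_hub` package. The repository IDs come from the links above; the local folder names (`models/DiT-fb-<resolution>`) are an assumption inferred from the inference command in this README, and the helper names are hypothetical. It is also an assumption that a plain snapshot is laid out the way the inference script expects, since the automated script additionally sets up the shared/ folder and symbolic links.

```python
RESOLUTIONS = (256, 512, 1024)


def repo_id_for(resolution):
    """Return the Hugging Face repository ID listed above for a resolution."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    return f"xiechun-tsukuba/svdrr-dit-fb-{resolution}"


def local_dir_for(resolution, root="models"):
    """Assumed target folder, mirroring the models/DiT-fb-512 path used for inference."""
    return f"{root}/DiT-fb-{resolution}"


def download_checkpoint(resolution, root="models"):
    """Download one checkpoint; requires `pip install huggingface_hub`."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id_for(resolution),
        local_dir=local_dir_for(resolution, root),
    )
```

Calling `download_checkpoint(512)` would fetch the 512-resolution model into `models/DiT-fb-512`.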

🔍 Inference

Important Note: LIDC-IDRI-DRR uses a pose coordinate system that runs opposite to the intuitive one: the polar angle increases downward, and the azimuth angle increases when rotating to the left. To invert the pose coordinate system, pass the --flip_pose option.
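
To make the convention concrete, here is a minimal sketch of converting a pose between two conventions whose angles increase in opposite directions. It assumes both conventions share the same zero direction, so the conversion reduces to negating each angle. Whether `--flip_pose` does exactly this internally is an assumption; the function name `flip_pose` below is hypothetical and only echoes the option name.

```python
def flip_pose(polar_deg, azimuth_deg):
    """Convert a (polar, azimuth) pose between two conventions whose angles
    increase in opposite directions (assumed: same zero, opposite sign)."""
    return (-polar_deg, -azimuth_deg)


# A view that is 10 degrees "up" and 30 degrees "to the right" in one convention
# (hypothetical numbers) maps to the opposite signs in the other.
flipped = flip_pose(10, 30)
```

Applying the conversion twice returns the original pose, as expected of a pure sign flip.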

Single Image Inference

Default views (azimuth angles from -90° to 90° in 5° increments):

python test_svdrr_DiT.py --model_path models/DiT-fb-512 \
    --image_path demo/real_xray.jpg \
    --log_dir outputs/ \
    --image_size 512 \
    --simple_pose
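
The default view grid described above can be enumerated as follows. This is only an illustration of the stated range and step, not the script's actual code, and `default_azimuths` is a hypothetical name.

```python
def default_azimuths(start_deg=-90, stop_deg=90, step_deg=5):
    """List azimuth angles from start to stop (inclusive) in fixed steps."""
    return list(range(start_deg, stop_deg + 1, step_deg))


views = default_azimuths()
```

With these defaults the grid contains 37 azimuth angles, including the frontal view at 0°.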

Citation

If you find this work useful, please cite:

@InProceedings{XieChu_SVDRR_MICCAI2025,
        author = {Xie, Chun and Yoshii, Yuichi and Kitahara, Itaru},
        title = {{SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model}},
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15963},
        month = {September},
        pages = {572--582},
        doi = {https://doi.org/10.1007/978-3-032-04965-0_54}
}

@misc{xie2025svdrr,
        title = {SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model}, 
        author = {Chun Xie and Yuichi Yoshii and Itaru Kitahara},
        year = {2025},
        eprint = {2507.05148},
        archivePrefix = {arXiv},
        doi = {https://doi.org/10.48550/arXiv.2507.05148}, 
}
