🎨 Stable Diffusion & CatVTON Implementation

A comprehensive implementation of Stable Diffusion from scratch with CatVTON virtual try-on capabilities

Overview
Project Structure
Features
Setup & Installation
Model Downloads
CatVTON Integration
References
Author
License
Acknowledgments

Overview

This project implements Stable Diffusion from scratch in PyTorch and extends it with CatVTON, a virtual try-on method for realistic clothing try-on.

Complete Stable Diffusion pipeline (Branch: main)
CatVTON virtual try-on extension (Branch: CatVTON)
DDPM-based denoising, VAE, and custom attention
Inpainting and text-to-image capabilities

Project Structure

stable-diffusion/
├── Core Components
│   ├── attention.py          # Attention mechanisms
│   ├── clip.py               # CLIP model
│   ├── ddpm.py               # DDPM sampler
│   ├── decoder.py            # VAE decoder
│   ├── encoder.py            # VAE encoder
│   ├── diffusion.py          # Diffusion logic
│   ├── model.py              # Weight loading
│   └── pipeline.py           # Main pipeline logic
│
├── Utilities & Interface
│   ├── interface.py          # Interactive script
│   ├── model_converter.py    # Weight conversion utilities
│   └── requirements.txt      # Python dependencies
│
├── Data & Models
│   ├── vocab.json
│   ├── merges.txt
│   ├── inkpunk-diffusion-v1.ckpt
│   └── sd-v1-5-inpainting.ckpt
│
├── Sample Data
│   ├── person.jpg
│   ├── garment.jpg
│   ├── agnostic_mask.png
│   ├── dog.jpg
│   ├── image.png
│   └── zalando-hd-resized.zip
│
└── Notebooks & Docs
    ├── test.ipynb
    └── README.md

Features

Stable Diffusion Core

From-scratch implementation with modular architecture
Custom CLIP encoder integration
Latent space generation using VAE
DDPM sampling process
Self-attention mechanisms for denoising
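The DDPM bullet above rests on a fixed noise schedule. As a minimal sketch (not the code in ddpm.py), the following computes the scaled-linear schedule commonly used with Stable Diffusion v1 checkpoints and applies the closed-form forward noising step. It covers only the forward process, not the reverse sampling loop. The function names and default values (1000 steps, beta from 0.00085 to 0.012) are assumptions for illustration; the sampler in this repository may differ.

```python
import math
import random


def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Betas spaced linearly in sqrt-space, then squared (assumed defaults)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t is the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng=random):
    """Draw x_t from q(x_t | x_0): scaled signal plus scaled Gaussian noise."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


betas = scaled_linear_betas()
alpha_bars = cumulative_alphas(betas)
# Noise a toy 3-element "latent" halfway through the schedule.
noisy = add_noise([0.5, -0.25, 1.0], t=500, alpha_bars=alpha_bars)
print(noisy)
```

Because alpha_bar shrinks toward zero as t grows, later timesteps keep less of the original signal; the denoiser is trained to predict the noise that was added.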

CatVTON Capabilities

Virtual try-on using inpainting
Pose-aligned garment fitting
Segmentation-mask-based garment overlay
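The CatVTON paper conditions the inpainting model by concatenating the person and garment inputs along a spatial dimension. The toy sketch below shows that idea on nested lists standing in for image or latent grids. The helper names and the choice of the width axis are assumptions, and the CatVTON branch of this repository may arrange its inputs differently.

```python
def concat_along_width(person, garment):
    """Join two H x W grids side by side; both must have the same height."""
    if len(person) != len(garment):
        raise ValueError("person and garment must have the same height")
    return [p_row + g_row for p_row, g_row in zip(person, garment)]


def split_person_part(joined, person_width):
    """Keep only the columns that came from the person grid."""
    return [row[:person_width] for row in joined]


person = [[1, 2], [3, 4]]      # stand-in for the masked person input
garment = [[5, 6], [7, 8]]     # stand-in for the garment input
joined = concat_along_width(person, garment)
restored = split_person_part(joined, person_width=2)
print(joined, restored)
```

After denoising, only the person part of the joined grid is kept as the try-on result, which is what `split_person_part` mimics here.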

Setup & Installation

Prerequisites

Python 3.10.9
CUDA-compatible GPU
Git, plus Conda or venv

Clone Repository

git clone https://github.com/Harsh-Kesharwani/stable-diffusion.git
cd stable-diffusion
git checkout CatVTON  # for try-on features

Create Environment

conda create -n stable-diffusion python=3.10.9
conda activate stable-diffusion

Install Requirements

pip install -r requirements.txt

Test Installation

python -c "import torch; print(torch.__version__)"
python -c "import torch; print(torch.cuda.is_available())"
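The two one-liners above can be folded into a single check that also reports the Python version and does not crash when PyTorch is missing. This is a small sketch; the function name `check_environment` and the minimum version tuple are assumptions, not part of the repository.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Report Python version, PyTorch presence, and CUDA availability."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check works without PyTorch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


for name, ok in check_environment().items():
    print(f"{name}: {ok}")
```

If `cuda_available` is False on a machine with a GPU, the usual cause is a CPU-only PyTorch build or a driver mismatch.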

Model Downloads

Tokenizer Files (from SD v1.4)

vocab.json
merges.txt

Download from: CompVis/stable-diffusion-v1-4
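For orientation, the two tokenizer files can be read with the standard library alone. The sketch below assumes the usual byte-pair-encoding layout: vocab.json as a JSON object mapping token strings to ids, and merges.txt as one space-separated merge pair per line after an optional `#version` header. The loader names are hypothetical, and the repository's own tokenizer handling may differ.

```python
import json
from pathlib import Path


def load_vocab(path):
    """Read vocab.json as a dict of token string -> integer id."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def load_merges(path):
    """Read merges.txt as a list of (left, right) merge pairs, in rank order."""
    lines = Path(path).read_text(encoding="utf-8").split("\n")
    if lines and lines[0].startswith("#version"):
        lines = lines[1:]  # drop the header line only, not every '#' line
    return [tuple(line.split()) for line in lines if line.strip()]
```

Only the first line is treated as a header, since a merge pair can itself begin with the `#` character.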

Model Checkpoints

inkpunk-diffusion-v1.ckpt: Inkpunk Model
sd-v1-5-inpainting.ckpt: Inpainting Weights

Download Script

mkdir -p data
wget -O data/vocab.json "https://huggingface.co/CompVis/stable-diffusion-v1-4/resolve/main/tokenizer/vocab.json"
wget -O data/merges.txt "https://huggingface.co/CompVis/stable-diffusion-v1-4/resolve/main/tokenizer/merges.txt"
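Where wget is not available (for example on Windows), the same download can be sketched in Python. The URLs are the ones from the script above; the function names and the skip-if-present behaviour are assumptions added for illustration.

```python
import urllib.request
from pathlib import Path

BASE_URL = "https://huggingface.co/CompVis/stable-diffusion-v1-4/resolve/main/tokenizer"
TOKENIZER_FILES = ("vocab.json", "merges.txt")


def tokenizer_urls(base_url=BASE_URL, names=TOKENIZER_FILES):
    """Map each tokenizer file name to its download URL."""
    return {name: f"{base_url}/{name}" for name in names}


def download_tokenizer_files(target_dir="data", overwrite=False):
    """Fetch the tokenizer files into target_dir and return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, url in tokenizer_urls().items():
        destination = target / name
        # Unlike `wget -O`, existing files are kept unless overwrite=True.
        if overwrite or not destination.exists():
            urllib.request.urlretrieve(url, str(destination))
        saved.append(destination)
    return saved
```

Calling `download_tokenizer_files()` from the repository root would place both files in `data/`, matching the shell script.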

CatVTON Integration

The CatVTON extension performs realistic clothing try-on using Stable Diffusion inpainting.

Highlights

sd-v1-5-inpainting.ckpt for image completion
Garment alignment to human pose
Agnostic segmentation mask usage
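To illustrate how an agnostic mask limits the edit to the clothing region, here is a toy per-pixel blend on nested lists. It assumes a mask value of 1.0 marks pixels to repaint and 0.0 marks pixels to keep; that convention, the function name `composite`, and the idea of blending after generation are assumptions, not a description of this repository's pipeline.

```python
def composite(person, generated, mask):
    """Per pixel: take `generated` where mask is 1.0, keep `person` where 0.0."""
    return [
        [m * g + (1.0 - m) * p for p, g, m in zip(p_row, g_row, m_row)]
        for p_row, g_row, m_row in zip(person, generated, mask)
    ]


person = [[0.2, 0.2], [0.2, 0.2]]       # original photo (toy values)
generated = [[0.9, 0.9], [0.9, 0.9]]    # inpainting output (toy values)
mask = [[1.0, 0.0], [0.0, 1.0]]         # 1.0 marks the region to replace
result = composite(person, generated, mask)
print(result)
```

Pixels outside the mask come straight from the person image, so the face, hair, and background stay untouched.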

Run the interface:

python interface.py

References

Articles & Guides

HuggingFace Resources

Papers

Stable Diffusion: High-Resolution Image Synthesis with Latent Diffusion Models
DDPM: Denoising Diffusion Probabilistic Models
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

Author

Harsh Kesharwani

Passionate about AI, Computer Vision, and Generative Models

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

CompVis team for Stable Diffusion
HuggingFace for models and APIs
Zalando Research for the dataset
Open-source contributors and educators

⭐ Star this repo if you found it helpful!

Built with ❤️ by Harsh Kesharwani