---
license: other
base_model: stabilityai/stable-diffusion-3.5-medium
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
- redhat
- corporate-branding
- fine-tuned
library_name: diffusers
pipeline_tag: text-to-image
---

# RedHat Dog SD3 - Fine-tuned Stable Diffusion 3.5 Model

## Model Description

This is a fine-tuned version of [Stable Diffusion 3.5 Medium](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium), trained with the Dreambooth technique to generate images of a specific Red Hat branded dog character ("rhteddy").

## Model Details

- **Base Model**: stabilityai/stable-diffusion-3.5-medium
- **Fine-tuning Method**: Dreambooth
- **Training Data**: 5-10 images of the Red Hat dog character
- **Training Steps**: 800
- **Resolution**: 512x512 pixels
- **Hardware**: NVIDIA L40S GPU (48GB memory)

## Intended Use

This model is designed for:

- Generating images of the Red Hat dog character in various contexts
- Educational demonstrations of Dreambooth fine-tuning
- Corporate branding and marketing content creation
- Research into personalized diffusion models

## Example

```python
import torch
from diffusers import DiffusionPipeline

# Load the fine-tuned pipeline in bfloat16 to reduce memory use
pipeline = DiffusionPipeline.from_pretrained(
    "cfchase/redhat-dog-sd3",
    torch_dtype=torch.bfloat16,
)

# Use the GPU when available; CPU inference works but is very slow
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipeline.to(device)

# Generate an image; the prompt includes the trigger phrase "rhteddy dog"
image = pipeline("photo of a rhteddy dog in a park").images[0]
image.save("redhat_dog_park.png")
```

### Recommended Prompts

The model works best with prompts that include the trigger phrase `rhteddy dog`:

- `"photo of a rhteddy dog"`
- `"rhteddy dog sitting in an office"`
- `"rhteddy dog wearing a Red Hat"`
- `"rhteddy dog in a technology conference"`

## Training Details

### Training Configuration

- **Instance Prompt**: "photo of a rhteddy dog"
- **Class Prompt**: "a photo of dog"
- **Learning Rate**: 5e-6
- **Batch Size**: 1
- **Gradient Accumulation Steps**: 2
- **Optimizer**: 8-bit Adam
- **Scheduler**: Constant
- **Prior Preservation**: Enabled with 200 class images

### Training Environment

- **Platform**: Red Hat OpenShift AI (RHOAI)
- **Framework**: Hugging Face Diffusers
- **Acceleration**: xFormers, gradient checkpointing

## Model Architecture

This model inherits the architecture of Stable Diffusion 3.5 Medium:

- **Transformer**: SD3Transformer2DModel
- **VAE**: AutoencoderKL
- **Text Encoders**:
  - 2x CLIPTextModelWithProjection
  - 1x T5EncoderModel
- **Scheduler**: FlowMatchEulerDiscreteScheduler

## Limitations and Bias

- The model is trained specifically on Red Hat branded imagery and may not generalize well to other contexts
- The training set was small, so the model may be overfitted to it
- The model inherits any biases present in the base Stable Diffusion 3.5 model
- The model is tuned for the specific "rhteddy dog" concept and may struggle with significant variations

## Training Data

The training data consists of approximately 5-10 high-quality images of the Red Hat dog character, featuring:

- Various poses and angles
- Consistent visual style and branding
- Professional photography quality
- Clear subject focus

## Technical Specifications

- **Model Size**: ~47GB (full precision weights)
- **Inference Requirements**:
  - GPU with 8GB+ VRAM recommended
  - CUDA-compatible device
  - Python 3.8+
  - PyTorch 2.0+
  - Diffusers library

## License

This model is based on Stable Diffusion 3.5 Medium and is subject to the same licensing terms. Please refer to the [original model license](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) for details.

## Contact

For questions about this model or the training process, please refer to the [Red Hat OpenShift AI documentation](https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed) or the associated training notebooks.
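
## Training Budget Sketch

The Training Configuration values (batch size 1, gradient accumulation steps 2, 800 training steps) and the 5-10 instance images together imply how often each training image was revisited, which is relevant to the overfitting risk noted under Limitations and Bias. The following is a minimal arithmetic sketch, not part of the original training code. It assumes that "Training Steps" counts optimizer updates (as `max_train_steps` does in the Diffusers example scripts) and it counts instance images only, ignoring the prior-preservation class images. The `TrainingBudget` class and its method names are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingBudget:
    """Hypothetical helper; defaults are copied from the Training Configuration list."""

    batch_size: int = 1
    gradient_accumulation_steps: int = 2
    # Assumption: one training step equals one optimizer update.
    training_steps: int = 800

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer update.
        return self.batch_size * self.gradient_accumulation_steps

    @property
    def instance_samples_seen(self) -> int:
        # Total instance-image presentations over the whole run.
        return self.training_steps * self.effective_batch_size

    def passes_per_image(self, num_instance_images: int) -> float:
        # Average number of times each instance image is revisited.
        return self.instance_samples_seen / num_instance_images


budget = TrainingBudget()
print("effective batch size:", budget.effective_batch_size)
print("instance samples seen:", budget.instance_samples_seen)
for n in (5, 10):  # the card states 5-10 training images
    print(f"{n} images -> {budget.passes_per_image(n):.0f} passes per image")
```

Under these assumptions each image is seen roughly 160 to 320 times, which is consistent with the card's caution that a small dataset may lead to overfitting.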