MJaheen committed on
Commit
cc5958e
·
1 Parent(s): fb609fe

Add new features and fixes


- Fix some issues, add documentation

LICENSE CHANGED
@@ -1,6 +1,6 @@
1
  MIT License
2
 
3
- Copyright (c) 2025 MJaheen
4
 
5
  Permission is hereby granted, free of charge, to any person obtaining a copy
6
  of this software and associated documentation files (the "Software"), to deal
 
1
  MIT License
2
 
3
+ Copyright (c) 2025 MJaheen, [email protected]
4
 
5
  Permission is hereby granted, free of charge, to any person obtaining a copy
6
  of this software and associated documentation files (the "Software"), to deal
README.md CHANGED
@@ -9,109 +9,380 @@ app_file: src/app.py
9
  python_version: "3.11"
10
  ---
11
 
12
- # 🐸 Pepe the Frog Meme Generator
13
 
14
- AI-powered meme generator using Stable Diffusion and LoRA fine-tuning.
15
 
16
  ---
17
 
18
- ## 🎮 Try It Online
19
 
20
- 🚀 **[Open in Hugging Face Spaces](https://huggingface.co/spaces/MJaheen/Pepe-Meme-Generator)**
21
 
22
  ---
23
 
24
- ## 🌟 Features
25
 
26
- - **Multiple Model Support**: Switch between fine-tuned LoRA and base models
27
- - Pepe Fine-tuned (LoRA) - Custom trained model
28
- - Base SD 1.5 - Standard Stable Diffusion
29
- - Dreamlike Photoreal 2.0 - Photorealistic style
30
- - Openjourney v4 - Artistic Midjourney-style
31
  - **Raw Prompt Mode**: Use exact prompts without automatic enhancements
32
- - Generate **custom Pepe memes** from text prompts
33
- - Multiple **style presets** (happy, sad, smug, angry, etc.)
34
- - **Add meme text overlays** with automatic "MJ" signature
35
- - **Real-time progress tracking** for each generation step
36
- - Adjustable generation parameters (CFG, steps, seed, etc.)
37
- - Batch generation and meme gallery system
38
- - **GPU & CPU compatible** with automatic optimization
39
 
40
  ---
41
 
42
- ## 💡 Example Prompts
43
 
44
- - "pepe the frog as a wizard"
45
- - "pepe coding on a laptop"
46
- - "pepe drinking coffee"
47
- - "smug pepe wearing sunglasses"
48
 
49
- ---
50
 
51
- ## 🚀 Quick Start (GitHub)
52
 
53
  ```bash
54
- # Clone
55
  git clone https://github.com/YOUR_USERNAME/pepe-meme-generator.git
56
  cd pepe-meme-generator
57
 
58
- # Install
59
  pip install -r requirements.txt
60
 
61
- # Run
62
  streamlit run src/app.py
63
  ```
64
 
65
 
66
  ---
67
 
68
- ## 📚 Project Structure
69
 
70
  pepe-meme-generator/
71
- ├── src/
72
- │ ├── app.py # Main Streamlit app
73
- │ ├── model/
74
- │ │ ├── generator.py # Generation logic
75
- │ │ └── config.py # Model configuration
76
- │ └── utils/
77
- └── image_processor.py
78
- ├── models/ # Model weights (not committed)
79
- ├── outputs/ # Generated memes (not committed)
80
- ├── requirements.txt
81
- ├── .gitignore
82
- └── README.md
83
 
84
  ---
85
 
86
- ## 🛠️ Tech Stack
87
 
88
- Model: Stable Diffusion 1.5 + LoRA
89
 
90
- Framework: PyTorch, Diffusers
91
 
92
- UI: Streamlit
93
 
94
- Processing: PIL, OpenCV
95
 
96
  ---
97
 
98
 
99
 
100
- This project demonstrates:
101
- - Diffusion model architecture
102
- - Transfer learning with LoRA
103
- - Text-to-image synthesis
104
 
105
  ---
106
- ## 🎓 🙏 Acknowledgments
107
 
108
- - [WorldQuant](https://www.wqu.edu/ai-lab-computer-vision)
109
- - [Stable Diffusion](https://github.com/CompVis/stable-diffusion)
110
- - [LoRA](https://github.com/microsoft/LoRA)
111
- - [Diffusers](https://github.com/huggingface/diffusers)
112
- - [Streamlit](https://github.com/streamlit/streamlit)
113
 
114
 
115
  ## 📜 License
116
 
117
- MIT License see LICENSE file.
9
  python_version: "3.11"
10
  ---
11
 
12
+ <div align="center">
13
 
14
+ # 🐸 Pepe the Frog AI Meme Generator
15
+
16
+ ### Create custom Pepe memes using AI-powered Stable Diffusion with LoRA fine-tuning
17
+
18
+ [![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
19
+ [![Streamlit](https://img.shields.io/badge/Streamlit-1.28+-FF4B4B.svg)](https://streamlit.io)
20
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
21
+ [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97-Hugging%20Face-orange)](https://huggingface.co/MJaheen/Pepe_The_Frog_model_v1_lora)
22
+
23
+ [Demo](https://huggingface.co/spaces/MJaheen/Pepe-Meme-Generator) • [Documentation](./docs/) • [Training Guide](./docs/TRAINING.md) • [Report Bug](https://github.com/YOUR_USERNAME/pepe-meme-generator/issues)
24
+
25
+ </div>
26
 
27
  ---
28
 
29
+ ## 📖 Table of Contents
30
 
31
+ - [Features](#features)
32
+ - [Quick Start](#-quick-start)
33
+ - [Installation](#-installation)
34
+ - [Usage](#-usage)
35
+ - [Model Information](#-model-information)
36
+ - [Performance Optimization](#-performance-optimization)
37
+ - [Project Structure](#-project-structure)
38
+ - [Training](#-training-your-own-model)
39
+ - [Contributing](#-contributing)
40
+ - [License](#-license)
41
+ - [Acknowledgments](#-acknowledgments)
42
 
43
  ---
44
 
45
+ ## Features
46
+
47
+ ### 🎨 **Multiple AI Models**
48
+ - **Pepe Fine-tuned LoRA** - Custom trained on Pepe dataset (1600 steps)
49
+ - **Pepe + LCM (FAST)** - 8x faster generation with LCM technology
50
+ - **Tiny SD** - Lightweight model for faster CPU generation
51
+ - **Small SD** - Balanced speed and quality
52
+ - **Base SD 1.5** - Standard Stable Diffusion
53
+ - **Dreamlike Photoreal 2.0** - Photorealistic style
54
+ - **Openjourney v4** - Artistic Midjourney-inspired style
55
+
56
+ ### ⚡ **Performance Features**
57
+ - **LCM Support**: Generate images in 6 steps (~30 seconds on CPU)
58
+ - **GPU Acceleration**: Automatic CUDA detection with xformers support
59
+ - **Memory Efficient**: Attention slicing and VAE slicing enabled
60
 
61
+ ### 🎭 **Generation Features**
62
+ - **Style Presets**: Happy, sad, smug, angry, crying, and more
63
  - **Raw Prompt Mode**: Use exact prompts without automatic enhancements
64
+ - **Text Overlays**: Add meme text with Impact font
65
+ - **Batch Generation**: Create multiple variations
66
+ - **Progress Tracking**: Real-time generation progress bar
67
+ - **Seed Control**: Reproducible generations with fixed seeds
68
+ - **Gallery System**: View and manage all generated memes
69
+
70
+ ### 🎯 **User Experience**
71
+ - **Model Hot-Swapping**: Switch models without restart (see the caching sketch after this list)
72
+ - **Interactive UI**: Clean Streamlit interface
73
+ - **Example Prompts**: Built-in inspiration gallery
74
+ - **Download Support**: Save images with one click
75
+ - **Responsive Design**: Works on desktop and mobile
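+
+ Hot-swapping works because the generator is cached once per model name via Streamlit's `st.cache_resource`. A simplified sketch of the pattern used in `src/app.py` (import paths assume the repo root is on `PYTHONPATH`; the field-copying step is elided):
+
+ ```python
+ import streamlit as st
+ from src.model.config import ModelConfig
+ from src.model.generator import PepeGenerator
+
+ @st.cache_resource  # one cache entry per distinct model_name
+ def load_generator(model_name: str) -> PepeGenerator:
+     config = ModelConfig()
+     model_config = config.AVAILABLE_MODELS[model_name]
+     # ...copy model_config fields (base model, LoRA, LCM flags) onto config...
+     return PepeGenerator(config)
+ ```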
76
 
77
  ---
78
 
79
+ ## 🚀 Quick Start
80
 
81
+ ### Try Online (No Installation)
82
 
83
+ 🌐 **[Open in Hugging Face Spaces](https://huggingface.co/spaces/MJaheen/Pepe-Meme-Generator)** - Run instantly in your browser!
84
 
85
+ ### Local Installation
86
 
87
  ```bash
88
+ # 1. Clone the repository
89
  git clone https://github.com/YOUR_USERNAME/pepe-meme-generator.git
90
  cd pepe-meme-generator
91
 
92
+ # 2. Create virtual environment (recommended)
93
+ python -m venv venv
94
+ source venv/bin/activate # On Windows: venv\Scripts\activate
95
+
96
+ # 3. Install dependencies
97
  pip install -r requirements.txt
98
 
99
+ # 4. Run the app
100
  streamlit run src/app.py
101
  ```
102
 
103
+ The app will open in your browser at `http://localhost:8501`
104
+
105
+ ---
106
+
107
+ ## 📦 Installation
108
+
109
+ ### System Requirements
110
+
111
+ - **Python**: 3.10 or higher
112
+ - **RAM**: 8GB minimum, 16GB recommended
113
+ - **GPU**: Optional (NVIDIA with CUDA for faster generation)
114
+ - **Storage**: ~5GB for models and dependencies
115
+
116
+ ### Dependencies
117
+
118
+ ```bash
119
+ # Core dependencies
120
+ pip install torch torchvision # PyTorch
121
+ pip install diffusers transformers accelerate # Diffusion models
122
+ pip install streamlit # Web interface
123
+ pip install pillow numpy scipy # Image processing
124
+ pip install peft safetensors # LoRA support
125
+ ```
126
+
127
+ Or install everything at once:
128
+
129
+ ```bash
130
+ pip install -r requirements.txt
131
+ ```
132
+
133
+ ### GPU Setup (Optional but Recommended)
134
+
135
+ For NVIDIA GPUs with CUDA:
136
+
137
+ ```bash
138
+ # Install PyTorch with CUDA support
139
+ pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
140
+
141
+ # Install xformers for memory-efficient attention
142
+ pip install xformers
143
+ ```
144
+
145
+ ---
146
+
147
+ ## 🎮 Usage
148
+
149
+ ### Basic Usage
150
+
151
+ 1. **Select a Model**: Choose from the dropdown (try "Pepe + LCM (FAST)" for speed)
152
+ 2. **Enter a Prompt**: e.g., "pepe the frog as a wizard casting spells"
153
+ 3. **Adjust Settings**: Steps (6 for LCM, 25 for normal), guidance scale, etc.
154
+ 4. **Generate**: Click "Generate Meme" and wait
155
+ 5. **Download**: Save your creation!
156
+
157
+ ### Example Prompts
158
+
159
+ ```
160
+ pepe_style_frog, wizard casting magical spells, detailed
161
+ pepe_style_frog, programmer coding on laptop, cyberpunk style
162
+ pepe_style_frog, drinking coffee at sunrise, peaceful
163
+ pepe_style_frog, wearing sunglasses, smug expression
164
+ pepe_style_frog, crying with rain, emotional, dramatic lighting
165
+ ```
166
+
167
+ ### Advanced Features
168
+
169
+ #### **Using LCM for Fast Generation**
170
+ 1. Select "Pepe + LCM (FAST)" model
171
+ 2. Use 6 steps (optimal for LCM)
172
+ 3. Set guidance scale to 1.5
173
+ 4. Generate in ~30 seconds!
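+
+ For reference, a minimal diffusers sketch of the same LCM path (the app wires this up for you; the adapter ID below assumes the standard `latent-consistency/lcm-lora-sdv1-5` LCM-LoRA release):
+
+ ```python
+ from diffusers import StableDiffusionPipeline, LCMScheduler
+
+ pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
+ pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)  # swap in the LCM scheduler
+ pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")      # assumed adapter ID
+
+ image = pipe(
+     "pepe_style_frog, smug expression",
+     num_inference_steps=6,  # LCM converges in very few steps
+     guidance_scale=1.5,     # low CFG works best with LCM
+ ).images[0]
+ ```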
174
+
175
+ #### **Adding Text Overlays**
176
+ 1. Expand "Add Text" section
177
+ 2. Enter top and bottom text
178
+ 3. Text automatically styled in Impact font
179
+ 4. Signature "MJ" added to corner
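+
+ The same step outside the UI, using the project's `ImageProcessor` (a sketch; assumes the repo root is on `PYTHONPATH`, and the caption strings are just examples):
+
+ ```python
+ from PIL import Image
+ from src.utils.image_processor import ImageProcessor
+
+ img = Image.open("outputs/pepe.png")
+ meme = ImageProcessor.add_meme_text(img, top_text="ME WAITING", bottom_text="FOR THE GPU")
+ meme.save("outputs/pepe_meme.png")
+ ```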
180
+
181
+ #### **Reproducible Generations**
182
+ 1. Enable "Fixed Seed" in Advanced Settings
183
+ 2. Set a seed number (e.g., 42)
184
+ 3. Same seed + prompt = same image
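+
+ Under the hood, a fixed seed maps to a seeded `torch.Generator` (a sketch, assuming a loaded `pipe`):
+
+ ```python
+ import torch
+
+ generator = torch.Generator(device="cpu").manual_seed(42)  # same seed => same starting noise
+ image = pipe("pepe_style_frog, wizard casting spells", generator=generator).images[0]
+ ```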
185
+
186
+ ---
187
+
188
+ ## 🤖 Model Information
189
+
190
+ ### Fine-Tuned LoRA Model
191
+
192
+ **Model ID**: `MJaheen/Pepe_The_Frog_model_v1_lora`
193
+
194
+ **Training Details**:
195
+ - **Base Model**: Stable Diffusion v1.5
196
+ - **Method**: LoRA (Low-Rank Adaptation)
197
+ - **Dataset**: [iresidentevil/pepe_the_frog](https://huggingface.co/datasets/iresidentevil/pepe_the_frog)
198
+ - **Training Steps**: 2000
199
+ - **Resolution**: 512x512
200
+ - **Batch Size**: 1 (4 gradient accumulation)
201
+ - **Learning Rate**: 1e-4 (cosine schedule)
202
+ - **LoRA Rank**: 16
203
+ - **Precision**: Mixed FP16
204
+ - **Trigger Word**: `pepe_style_frog`
205
+
206
+ **Performance**:
207
+ - Quality: ⭐⭐⭐ (Good)
208
+ - Speed (CPU): ~4 minutes (25 steps)
209
+ - Speed (GPU): ~15 seconds (25 steps)
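+
+ To use the published LoRA directly with diffusers outside the app (a minimal sketch):
+
+ ```python
+ from diffusers import StableDiffusionPipeline
+
+ pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
+ pipe.load_lora_weights("MJaheen/Pepe_The_Frog_model_v1_lora")
+
+ image = pipe("pepe_style_frog, drinking coffee at sunrise").images[0]
+ ```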
210
 
211
  ---
212
 
213
+ ## 📁 Project Structure
214
 
215
+ ```
216
  pepe-meme-generator/
217
+ ├── src/ # Source code
218
+ │ ├── app.py # Main Streamlit application
219
+ │ ├── model/ # Model management
220
+ │ │ ├── __init__.py
221
+ │ │ ├── config.py # Model configurations
222
+ │ │ └── generator.py # Image generation logic
223
+ │ └── utils/ # Utility functions
224
+ │ ├── __init__.py
225
+ │ └── image_processor.py # Image processing utilities
226
+ ├── docs/ # Documentation
227
+ │ └── TRAINING.md # Model training guide
228
+ ├── models/ # Downloaded models (gitignored)
229
+ ├── outputs/ # Generated images (gitignored)
230
+ ├── scripts/ # Utility scripts
231
+ ├── tests/ # Test files
232
+ ├── diffusion_model_finetuning.ipynb # Training notebook
233
+ ├── requirements.txt # Python dependencies
234
+ ├── .gitignore # Git ignore rules
235
+ ├── .dockerignore # Docker ignore rules
236
+ ├── Dockerfile # Docker configuration
237
+ ├── LICENSE # MIT License
238
+ └── README.md # This file
239
+ ```
240
 
241
  ---
242
 
243
+ ## 🎓 Training Your Own Model
244
+
245
+ Want to fine-tune your own Pepe model or create a different character?
246
+
247
+ ### Quick Training Overview
248
+
249
+ ```bash
250
+ # 1. Prepare your dataset (images + captions)
251
+ # 2. Run the training script
252
+ accelerate launch train_text_to_image_lora.py \
253
+ --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
254
+ --train_data_dir="./your-data" \
255
+ --resolution=512 \
256
+ --train_batch_size=1 \
257
+ --gradient_accumulation_steps=4 \
258
+ --max_train_steps=2000 \
259
+ --learning_rate=1e-4 \
260
+ --lr_scheduler="cosine" \
261
+ --output_dir="./output" \
262
+ --rank=16
263
+ ```
264
+
265
+ ### Complete Training Guide
266
+
267
+ See **[docs/TRAINING.md](./docs/TRAINING.md)** for:
268
+ - Dataset preparation
269
+ - Training configuration
270
+ - Hyperparameter tuning
271
+ - Validation and testing
272
+ - Model upload to Hugging Face
273
+
274
+ Or check out the **[diffusion_model_finetuning.ipynb](./diffusion_model_finetuning.ipynb)** notebook!
275
+
276
+ ---
277
 
278
+ ## 🛠️ Technology Stack
279
 
280
+ ### Core Technologies
281
+ - **[PyTorch](https://pytorch.org/)** - Deep learning framework
282
+ - **[Diffusers](https://github.com/huggingface/diffusers)** - Diffusion models library
283
+ - **[Transformers](https://github.com/huggingface/transformers)** - NLP models
284
+ - **[PEFT](https://github.com/huggingface/peft)** - Parameter-efficient fine-tuning (LoRA)
285
+ - **[Streamlit](https://streamlit.io/)** - Web UI framework
286
 
287
+ ### AI/ML Components
288
+ - **Stable Diffusion 1.5** - Base diffusion model
289
+ - **LoRA** - Low-Rank Adaptation for efficient fine-tuning
290
+ - **LCM** - Latent Consistency Model for fast inference
291
+ - **DPM Solver** - Fast diffusion sampling
292
 
293
+ ### Image Processing
294
+ - **Pillow (PIL)** - Image manipulation
295
+ - **NumPy** - Numerical operations
296
+ - **SciPy** - Scientific computing
297
 
298
  ---
299
 
300
+ ## 🤝 Contributing
301
+
302
+ Contributions are welcome! Here's how you can help:
303
+
304
+ ### Ways to Contribute
305
+ - 🐛 Report bugs
306
+ - 💡 Suggest new features
307
+ - 📝 Improve documentation
308
+ - 🎨 Add new style presets
309
+ - ⚡ Optimize performance
310
+ - 🧪 Add tests
311
+
312
+ ### Development Setup
313
+
314
+ ```bash
315
+ # Clone and setup
316
+ git clone https://github.com/YOUR_USERNAME/pepe-meme-generator.git
317
+ cd pepe-meme-generator
318
+ python -m venv venv
319
+ source venv/bin/activate
320
+ pip install -r requirements.txt
321
 
322
+ # Make your changes
323
+ # Test locally
324
+ streamlit run src/app.py
325
 
326
+ # Submit a pull request
+ ```
327
 
328
  ---
329
 
330
+ ## 🐛 Troubleshooting
331
+
332
+ ### Common Issues
333
+
334
+ **Issue**: Out of memory error
335
+ **Solution**: Reduce resolution to 512x512, use CPU mode, or enable memory optimizations
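+
+ The memory optimizations are the standard diffusers toggles (a sketch; they trade some speed for a much lower peak memory footprint):
+
+ ```python
+ pipe.enable_attention_slicing()  # compute attention in slices
+ pipe.enable_vae_slicing()        # decode latents in slices
+ ```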
336
 
337
+ **Issue**: Slow generation on CPU
338
+ **Solution**: Use "Pepe + LCM (FAST)" model with 6 steps
339
+
340
+ **Issue**: Model not loading
341
+ **Solution**: Clear Streamlit cache with "Clear Cache & Reload" button
342
+
343
+ **Issue**: Import errors
344
+ **Solution**: Reinstall dependencies: `pip install -r requirements.txt --force-reinstall`
345
+
346
+
347
+ ---
348
 
349
  ## 📜 License
350
 
351
+ This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.
352
+
353
+ ### Model Licenses
354
+ - **Stable Diffusion 1.5**: CreativeML Open RAIL-M License
355
+ - **Pepe LoRA**: MIT License
356
+ - **Training Dataset**: Check [iresidentevil/pepe_the_frog](https://huggingface.co/datasets/iresidentevil/pepe_the_frog)
357
+
358
+ ---
359
+
360
+ ## 🙏 Acknowledgments
361
+
362
+ ### Special Thanks
363
+ - **[WorldQuant University](https://www.wqu.edu/ai-lab-computer-vision)** - AI/ML education and resources
364
+ - **[Hugging Face](https://huggingface.co/)** - Model hosting and diffusers library
365
+ - **[Stability AI](https://stability.ai/)** - Stable Diffusion model
366
+ - **[Microsoft](https://github.com/microsoft/LoRA)** - LoRA technique
367
+ - **[iresidentevil](https://huggingface.co/iresidentevil)** - Pepe dataset
368
+
369
+
370
+ ## 📞 Contact & Support
371
+
372
+ - **Issues**: [email protected]
373
+
374
+ ---
375
+
376
+ ## 🌟 Star History
377
+
378
+ If you find this project useful, please consider giving it a ⭐ star on GitHub!
379
+
380
+ ---
381
+
382
+ <div align="center">
383
+
384
+ **Made with ❤️ by MJaheen**
385
+
386
+ *Generate Pepes responsibly! 🐸*
387
+
388
+ </div>
diffusion_model_finetuning.ipynb ADDED
@@ -0,0 +1,482 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {
6
+ "id": "q4KpnNL4lY6q"
7
+ },
8
+ "source": [
9
+ "### Getting Ready"
10
+ ]
11
+ },
12
+ {
13
+ "cell_type": "code",
14
+ "source": [
15
+ "#!pip install datasets\n",
16
+ "#!pip uninstall -y diffusers\n",
17
+ "!git clone https://github.com/huggingface/diffusers.git\n",
18
+ "!pip install git+https://github.com/huggingface/diffusers.git\n",
19
+ "#!pip install --upgrade transformers accelerate safetensors torch torchvision"
20
+ ],
21
+ "metadata": {
22
+ "id": "yOvCmByVINi7",
23
+ "collapsed": true
24
+ },
25
+ "execution_count": null,
26
+ "outputs": []
27
+ },
28
+ {
29
+ "cell_type": "code",
30
+ "source": [
31
+ "from google.colab import drive\n",
32
+ "drive.mount('/content/drive')\n"
33
+ ],
34
+ "metadata": {
35
+ "id": "I4vsjgK2AbgI"
36
+ },
37
+ "execution_count": null,
38
+ "outputs": []
39
+ },
40
+ {
41
+ "cell_type": "code",
42
+ "source": [
43
+ "#Add trigger word to dataset and create the training paramters\n",
44
+ "\n",
45
+ "import os\n",
46
+ "import json\n",
47
+ "from datasets import load_dataset\n",
48
+ "from accelerate.utils import write_basic_config\n",
49
+ "from huggingface_hub import create_repo, upload_folder\n",
50
+ "\n",
51
+ "# --- 2. Configuration ---\n",
52
+ "# This is where you set all the important parameters for the training job.\n",
53
+ "\n",
54
+ "# Model and Dataset Parameters\n",
55
+ "base_model_id = \"runwayml/stable-diffusion-v1-5\"\n",
56
+ "dataset_name = \"iresidentevil/pepe_the_frog\" # The original dataset\n",
57
+ "text_column = \"prompt\"\n",
58
+ "image_column = \"image\"\n",
59
+ "trigger_word = \"pepe_style_frog\" # The trigger word we decided on\n",
60
+ "\n",
61
+ "# Training Parameters\n",
62
+ "output_dir = \"/content/drive/MyDrive/pepe-lora-sdxl-turbo_2\" # Where the trained LoRA will be saved\n",
63
+ "resolution = 512 # SDXL-Turbo works well at 512x512. Higher resolutions need more VRAM.\n",
64
+ "learning_rate = 1e-4\n",
65
+ "train_batch_size = 1 # Keep this at 1 for a small dataset to see each image.\n",
66
+ "gradient_accumulation_steps = 4\n",
67
+ "max_train_steps = 500 # A good starting point for a small dataset. Adjust as needed.\n",
68
+ "checkpointing_steps = 100 # Save a checkpoint every 100 steps.\n",
69
+ "\n",
70
+ "# LoRA Specific Parameters\n",
71
+ "lora_rank = 16 # Rank (dimension) of the LoRA. 16 is a good balance.\n",
72
+ "\n",
73
+ "# Hugging Face Hub Parameters\n",
74
+ "hf_hub_repo_id = \"your-username/pepe-lora-sdxl-turbo\" # Change to your Hub username and desired repo name\n",
75
+ "push_to_hub = True # Set to True to automatically upload your LoRA to the Hub\n",
76
+ "\n",
77
+ "\n",
78
+ "# --- 3. Prepare Dataset in \"Image Folder\" format ---\n",
79
+ "# This section now creates a local folder with images and a metadata.jsonl file,\n",
80
+ "# which is the format expected by the training script.\n",
81
+ "\n",
82
+ "print(\"Loading original dataset...\")\n",
83
+ "dataset = load_dataset(dataset_name, split=\"train\")\n",
84
+ "\n",
85
+ "\n",
86
+ "image_folder_path = \"/content/drive/MyDrive/pepe-data\"\n",
87
+ "os.makedirs(image_folder_path, exist_ok=True)\n",
88
+ "print(f\"Created directory for prepared data: {image_folder_path}\")\n",
89
+ "\n",
90
+ "metadata_file_path = os.path.join(image_folder_path, \"metadata.jsonl\")\n",
91
+ "\n",
92
+ "with open(metadata_file_path, \"w\") as f:\n",
93
+ " for i, example in enumerate(dataset):\n",
94
+ " # Get image and caption\n",
95
+ " image = example[image_column]\n",
96
+ " caption = example[text_column]\n",
97
+ "\n",
98
+ " # Add the trigger word\n",
99
+ " full_caption = f\"{trigger_word} {caption}\"\n",
100
+ "\n",
101
+ " # Save the image\n",
102
+ " image_filename = f\"image_{i}.png\"\n",
103
+ " image.save(os.path.join(image_folder_path, image_filename))\n",
104
+ "\n",
105
+ " # Write the metadata entry\n",
106
+ " metadata_entry = {\n",
107
+ " \"file_name\": image_filename,\n",
108
+ " text_column: full_caption\n",
109
+ " }\n",
110
+ " f.write(json.dumps(metadata_entry) + \"\\n\")\n",
111
+ "\n",
112
+ "print(f\"Dataset prepared and saved in 'image folder' format at: {image_folder_path}\")\n",
113
+ "\n",
114
+ "\n",
115
+ "# --- 4. Set up the Training Command ---\n",
116
+ "# This command now points to our correctly formatted image folder.\n",
117
+ "write_basic_config()\n",
118
+ "\n",
119
+ "command = [\n",
120
+ " \"accelerate\", \"launch\",\n",
121
+ " \"train_text_to_image_lora.py\",\n",
122
+ " f\"--pretrained_model_name_or_path={base_model_id}\",\n",
123
+ " f\"--train_data_dir={image_folder_path}\",\n",
124
+ " f\"--caption_column={text_column}\",\n",
125
+ " f\"--image_column={image_column}\",\n",
126
+ " f\"--dataloader_num_workers=8\",\n",
127
+ " f\"--resolution={resolution}\", \"--center_crop\", \"--random_flip\",\n",
128
+ " f\"--train_batch_size={train_batch_size}\",\n",
129
+ " f\"--gradient_accumulation_steps={gradient_accumulation_steps}\",\n",
130
+ " f\"--max_train_steps={max_train_steps}\",\n",
131
+ " f\"--learning_rate={learning_rate}\",\n",
132
+ " \"--lr_scheduler=constant\",\n",
133
+ " \"--lr_warmup_steps=0\",\n",
134
+ " f\"--output_dir={output_dir}\",\n",
135
+ " f\"--rank={lora_rank}\",\n",
136
+ " f\"--validation_prompt='{trigger_word} a sad frog in a blue hoodie, cartoon style'\",\n",
137
+ " f\"--checkpointing_steps={checkpointing_steps}\",\n",
138
+ " \"--checkpoints_total_limit=3\",\n",
139
+ "]\n",
140
+ "\n",
141
+ "if push_to_hub:\n",
142
+ " command.extend([f\"--push_to_hub\", f\"--hub_model_id={hf_hub_repo_id}\"])\n",
143
+ "\n",
144
+ "training_command_str = \" \".join(command)\n",
145
+ "\n",
146
+ "\n",
147
+ "# --- 5. Execute the Training ---\n",
148
+ "print(\"\\n\" + \"=\"*80)\n",
149
+ "print(\" TRAINING COMMAND\")\n",
150
+ "print(\"=\"*80)\n",
151
+ "print(\"The following command will be executed in your terminal:\")\n",
152
+ "print(training_command_str)\n",
153
+ "print(\"\\n\" + \"=\"*80)\n",
154
+ "print(\"To start training, copy the command above and paste it into your terminal.\")\n",
155
+ "print(\"Make sure you are in the correct environment where the diffusers examples are located.\")\n",
156
+ "print(\"You may need to clone the diffusers repo first: git clone https://github.com/huggingface/diffusers.git\")\n",
157
+ "print(\"CORRECTED PATH: Then navigate to: cd diffusers/examples/text_to_image\")\n",
158
+ "print(\"=\"*80)\n",
159
+ "\n"
160
+ ],
161
+ "metadata": {
162
+ "id": "RPv7Gv5h--SO"
163
+ },
164
+ "execution_count": null,
165
+ "outputs": []
166
+ },
167
+ {
168
+ "cell_type": "code",
169
+ "execution_count": null,
170
+ "metadata": {
171
+ "id": "yGDgzchblY6s"
172
+ },
173
+ "outputs": [],
174
+ "source": [
175
+ "import os\n",
176
+ "import sys\n",
177
+ "import datasets\n",
178
+ "import diffusers\n",
179
+ "import huggingface_hub\n",
180
+ "import requests\n",
181
+ "import torch\n",
182
+ "from dotenv import load_dotenv\n",
183
+ "from huggingface_hub import HfApi\n",
184
+ "from IPython.display import display"
185
+ ]
186
+ },
187
+ {
188
+ "cell_type": "markdown",
189
+ "metadata": {
190
+ "id": "6hoZLPDalY6t"
191
+ },
192
+ "source": [
193
+ "We'll print out version number of the critical packages, to help with future reproducibility."
194
+ ]
195
+ },
196
+ {
197
+ "cell_type": "code",
198
+ "execution_count": null,
199
+ "metadata": {
200
+ "id": "CaRvn_celY6t"
201
+ },
202
+ "outputs": [],
203
+ "source": [
204
+ "print(\"Platform:\", sys.platform)\n",
205
+ "print(\"Python version:\", sys.version)\n",
206
+ "print(\"---\")\n",
207
+ "print(\"datasets version: \", datasets.__version__)\n",
208
+ "print(\"diffusers version: \", diffusers.__version__)\n",
209
+ "print(\"huggingface_hub version: \", huggingface_hub.__version__)\n",
210
+ "print(\"torch version:\", torch.__version__)"
211
+ ]
212
+ },
213
+ {
214
+ "cell_type": "markdown",
215
+ "metadata": {
216
+ "id": "VLBQ_2A0lY6u"
217
+ },
218
+ "source": [
219
+ "Let's check if a GPU is available. If not, this notebook will take a long time to run!"
220
+ ]
221
+ },
222
+ {
223
+ "cell_type": "code",
224
+ "execution_count": null,
225
+ "metadata": {
226
+ "id": "jWTKdjUDlY6u"
227
+ },
228
+ "outputs": [],
229
+ "source": [
230
+ "if torch.cuda.is_available():\n",
231
+ " device = \"cuda\"\n",
232
+ " dtype = torch.float16\n",
233
+ "else:\n",
234
+ " device = \"cpu\"\n",
235
+ " dtype = torch.float32\n",
236
+ "\n",
237
+ "print(f\"Using {device} device with {dtype} data type.\")"
238
+ ]
239
+ },
240
+ {
241
+ "cell_type": "markdown",
242
+ "metadata": {
243
+ "id": "RCI8s5uylY6u"
244
+ },
245
+ "source": [
246
+ "### Load Stable Diffusion"
247
+ ]
248
+ },
249
+ {
250
+ "cell_type": "code",
251
+ "execution_count": null,
252
+ "metadata": {
253
+ "id": "2RU4U5mulY6w"
254
+ },
255
+ "outputs": [],
256
+ "source": [
257
+ "\n",
258
+ "MODEL_NAME = \"runwayml/stable-diffusion-v1-5\"\n",
259
+ "\n",
260
+ "pipeline = diffusers.AutoPipelineForText2Image.from_pretrained(\n",
261
+ " MODEL_NAME, torch_dtype=dtype\n",
262
+ ")\n",
263
+ "pipeline.to(device)\n",
264
+ "\n",
265
+ "print(type(pipeline))"
266
+ ]
267
+ },
268
+ {
269
+ "cell_type": "markdown",
270
+ "metadata": {
271
+ "id": "BMvqxn99lY6w"
272
+ },
273
+ "source": [
274
+ "Test base Model"
275
+ ]
276
+ },
277
+ {
278
+ "cell_type": "code",
279
+ "execution_count": null,
280
+ "metadata": {
281
+ "id": "-kBJqj9xlY6w"
282
+ },
283
+ "outputs": [],
284
+ "source": [
285
+ "images = pipeline([\"pepe the frog rolling eyes\"]*1).images\n",
286
+ "\n",
287
+ "for im in images:\n",
288
+ " display(im)"
289
+ ]
290
+ },
291
+ {
292
+ "cell_type": "code",
293
+ "execution_count": null,
294
+ "metadata": {
295
+ "id": "HqZRLoajlY6x"
296
+ },
297
+ "outputs": [],
298
+ "source": [
299
+ "#DATASET_NAME = \"worldquant-university/maya-dataset-v1\"\n",
300
+ "DATASET_NAME= \"iresidentevil/pepe_the_frog\"\n",
301
+ "data_builder = datasets.load_dataset_builder(DATASET_NAME)\n",
302
+ "\n",
303
+ "print(data_builder.dataset_name)"
304
+ ]
305
+ },
306
+ {
307
+ "cell_type": "code",
308
+ "execution_count": null,
309
+ "metadata": {
310
+ "id": "4EeHRlBmlY6x"
311
+ },
312
+ "outputs": [],
313
+ "source": [
314
+ "print(data_builder.info.features)"
315
+ ]
316
+ },
317
+ {
318
+ "cell_type": "code",
319
+ "execution_count": null,
320
+ "metadata": {
321
+ "id": "rgXvHJJVlY6y"
322
+ },
323
+ "outputs": [],
324
+ "source": [
325
+ "print(data_builder.info.splits)"
326
+ ]
327
+ },
328
+ {
329
+ "cell_type": "code",
330
+ "execution_count": null,
331
+ "metadata": {
332
+ "id": "-L2YvGMnlY6y"
333
+ },
334
+ "outputs": [],
335
+ "source": [
336
+ "data = datasets.load_dataset(DATASET_NAME)\n",
337
+ "\n",
338
+ "print(data)"
339
+ ]
340
+ },
341
+ {
342
+ "cell_type": "code",
343
+ "execution_count": null,
344
+ "metadata": {
345
+ "id": "k2iL94ILlY6z"
346
+ },
347
+ "outputs": [],
348
+ "source": [
349
+ "data[\"train\"][\"image\"]"
350
+ ]
351
+ },
352
+ {
353
+ "cell_type": "code",
354
+ "execution_count": null,
355
+ "metadata": {
356
+ "id": "6vBJgSPnlY6z"
357
+ },
358
+ "outputs": [],
359
+ "source": [
360
+ "# The values are PIL images, so they will be displayed\n",
361
+ "# automatically by Jupyter.\n",
362
+ "data[\"train\"][\"image\"][3]"
363
+ ]
364
+ },
365
+ {
366
+ "cell_type": "code",
367
+ "execution_count": null,
368
+ "metadata": {
369
+ "id": "Kbj0aOW9lY6z"
370
+ },
371
+ "outputs": [],
372
+ "source": [
373
+ "# Use dictionary indexing to look up the text values.\n",
374
+ "data[\"train\"][\"prompt\"]"
375
+ ]
376
+ },
377
+ {
378
+ "cell_type": "markdown",
379
+ "metadata": {
380
+ "id": "Q0RrkjXVlY60"
381
+ },
382
+ "source": [
383
+ "### LoRA Fine-tuning"
384
+ ]
385
+ },
386
+ {
387
+ "cell_type": "code",
388
+ "execution_count": null,
389
+ "metadata": {
390
+ "id": "36Jc_ijlwD75"
391
+ },
392
+ "outputs": [],
393
+ "source": [
394
+ "%cd diffusers/examples/text_to_image\n",
395
+ "\n",
396
+ "!accelerate launch train_text_to_image_lora.py \\\n",
397
+ " --pretrained_model_name_or_path=\"runwayml/stable-diffusion-v1-5\" \\\n",
398
+ " --train_data_dir=image_folder_path \\\n",
399
+ " --caption_column=\"prompt\" \\\n",
400
+ " --image_column=\"image\" \\\n",
401
+ " --resolution=512 --center_crop --random_flip \\\n",
402
+ " --train_batch_size=1 \\\n",
403
+ " --gradient_accumulation_steps=4 \\\n",
404
+ " --max_train_steps=2000 \\\n",
405
+ " --learning_rate=1e-4 \\\n",
406
+ " --lr_scheduler=\"cosine\" \\\n",
407
+ " --lr_warmup_steps=0 \\\n",
408
+ " --output_dir=output_dir \\\n",
409
+ " --rank=16 \\\n",
410
+ " --validation_prompt=\"pepe_style_frog, a high-quality, detailed image of pepe the frog smiling and holding a cup of coffee at sunrise\" \\\n",
411
+ " --seed=42 \\\n",
412
+ " --mixed_precision=\"fp16\" \\\n",
413
+ " --checkpointing_steps=150"
414
+ ]
415
+ },
416
+ {
417
+ "cell_type": "markdown",
418
+ "metadata": {
419
+ "id": "VKOcWmJ9lY62"
420
+ },
421
+ "source": [
422
+ "### Load LoRA Weights"
423
+ ]
424
+ },
425
+ {
426
+ "cell_type": "code",
427
+ "execution_count": null,
428
+ "metadata": {
429
+ "id": "SBGjOCmTlY63"
430
+ },
431
+ "outputs": [],
432
+ "source": [
433
+ "pipeline.load_lora_weights(\n",
434
+ " output_dir,\n",
435
+ "\n",
436
+ "\n",
437
+ " weight_name=\"pytorch_lora_weights.safetensors\",\n",
438
+ ")\n",
439
+ "pipeline.safety_checker = None"
440
+ ]
441
+ },
442
+ {
443
+ "cell_type": "code",
444
+ "execution_count": null,
445
+ "metadata": {
446
+ "id": "RYRckHGLlY63"
447
+ },
448
+ "outputs": [],
449
+ "source": [
450
+ "images = pipeline([\"pepe_style_frog making fun of rabbit that racing a tortile\"]).images\n",
451
+ "\n",
452
+ "for im in images:\n",
453
+ " display(im)"
454
+ ]
455
+ }
456
+ ],
457
+ "metadata": {
458
+ "accelerator": "GPU",
459
+ "colab": {
460
+ "gpuType": "T4",
461
+ "provenance": []
462
+ },
463
+ "kernelspec": {
464
+ "display_name": "Python 3",
465
+ "name": "python3"
466
+ },
467
+ "language_info": {
468
+ "codemirror_mode": {
469
+ "name": "ipython",
470
+ "version": 3
471
+ },
472
+ "file_extension": ".py",
473
+ "mimetype": "text/x-python",
474
+ "name": "python",
475
+ "nbconvert_exporter": "python",
476
+ "pygments_lexer": "ipython3",
477
+ "version": "3.11.0"
478
+ }
479
+ },
480
+ "nbformat": 4,
481
+ "nbformat_minor": 0
482
+ }
docs/TRAINING.md ADDED
@@ -0,0 +1,343 @@
1
+ # 🎓 Model Training Guide
2
+
3
+ This guide covers how to fine-tune your own Stable Diffusion model using LoRA (Low-Rank Adaptation) for creating custom character models like our Pepe generator.
4
+
5
+ ---
6
+
7
+ ## 📖 Table of Contents
8
+
9
+ - [Overview](#overview)
10
+ - [Prerequisites](#prerequisites)
11
+ - [Dataset Preparation](#dataset-preparation)
12
+ - [Training Configuration](#training-configuration)
13
+ - [Running the Training](#running-the-training)
14
+ - [Model Upload](#model-upload)
15
+
16
+
17
+ ---
18
+
19
+ ## 🎯 Overview
20
+
21
+ ### What is LoRA?
22
+
23
+ **LoRA (Low-Rank Adaptation)** is a parameter-efficient fine-tuning technique that:
24
+ - ✅ Trains only a small fraction of parameters (~0.5% of full model)
25
+ - ✅ Requires significantly less VRAM (~10GB vs 40GB+)
26
+ - ✅ Maintains base model quality while adding custom styles
27
+ - ✅ Produces small, portable adapter files (~100MB vs 4GB+)
28
+ - ✅ Can be combined with other LoRAs
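+
+ Concretely, LoRA freezes each pretrained weight matrix `W` and learns a low-rank update, so the adapted weight is `W' = W + (α/r)·B·A` with `B ∈ ℝ^(d×r)`, `A ∈ ℝ^(r×k)`, and rank `r ≪ min(d, k)`. Only `A` and `B` are trained, which is why the adapter file stays tiny relative to the base model.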
29
+
30
+ ### Our Training Setup
31
+
32
+ **Model**: Pepe the Frog LoRA
33
+ **Base**: Stable Diffusion v1.5
34
+ **Dataset**: [iresidentevil/pepe_the_frog](https://huggingface.co/datasets/iresidentevil/pepe_the_frog)
35
+ **Result**: [MJaheen/Pepe_The_Frog_model_v1_lora](https://huggingface.co/MJaheen/Pepe_The_Frog_model_v1_lora)
36
+ **Training Time**: ~2-3 hours on T4 GPU (Google Colab)
37
+
38
+ ---
39
+
40
+ ## 🛠️ Prerequisites
41
+
42
+ ### Hardware Requirements
43
+
44
+ **Minimum**:
45
+ - GPU: NVIDIA GPU with 10GB+ VRAM (e.g., RTX 3080, T4)
46
+ - RAM: 16GB system RAM
47
+ - Storage: 20GB free space
48
+
49
+ **Recommended**:
50
+ - GPU: NVIDIA A100, V100, or RTX 4090
51
+ - RAM: 32GB system RAM
52
+ - Storage: 50GB+ SSD
53
+
54
+ **Cloud Options**:
55
+ - Google Colab (Free T4 GPU)
56
+ - Kaggle Notebooks (Free GPU)
57
+ - Lambda Labs
58
+ - RunPod
59
+ - Vast.ai
60
+
61
+ ### Software Requirements
62
+
63
+ ```bash
64
+ # Core dependencies
65
+ pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
66
+ pip install diffusers==0.31.0
67
+ pip install transformers==4.45.1
68
+ pip install accelerate==0.34.2
69
+ pip install "peft>=0.11.0" # quoted so the shell doesn't treat >= as a redirect
70
+ pip install safetensors==0.4.4
71
+ pip install datasets
72
+ pip install bitsandbytes # For 8-bit Adam optimizer (optional)
73
+ ```
74
+
75
+ ---
76
+
77
+ ## 📂 Dataset Preparation
78
+
79
+ ### Dataset Structure
80
+
81
+ Your dataset should follow this structure:
82
+
83
+ ```
84
+ dataset/
85
+ ├── image_1.png
86
+ ├── image_2.png
87
+ ├── image_3.png
88
+ └── metadata.jsonl # or metadata.csv
89
+ ```
90
+
91
+ ### Metadata Format
92
+
93
+ **Option 1: JSONL (Recommended)**
94
+
95
+ ```jsonl
96
+ {"file_name": "image_1.png", "prompt": "pepe_style_frog, happy pepe smiling"}
97
+ {"file_name": "image_2.png", "prompt": "pepe_style_frog, sad pepe crying"}
98
+ {"file_name": "image_3.png", "prompt": "pepe_style_frog, pepe drinking coffee"}
99
+ ```
100
+
101
+ **Option 2: CSV**
102
+
103
+ ```csv
104
+ file_name,prompt
105
+ image_1.png,"pepe_style_frog, happy pepe smiling"
106
+ image_2.png,"pepe_style_frog, sad pepe crying"
107
+ image_3.png,"pepe_style_frog, pepe drinking coffee"
108
+ ```
109
+
110
+ ### Dataset Best Practices
111
+
112
+ 1. **Image Quality**
113
+ - Resolution: 512x512 or higher
114
+ - Format: PNG or JPG
115
+ - Clear, well-lit images
116
+ - Varied poses and expressions
117
+
118
+ 2. **Caption Quality**
119
+ - Include trigger word (e.g., `pepe_style_frog`)
120
+ - Describe key features and actions
121
+ - Be consistent in naming conventions
122
+ - 5-15 words per caption optimal
123
+
124
+ 3. **Dataset Size**
125
+ - Minimum: 20-50 images
126
+ - Optimal: 100-500 images
127
+ - More images = better generalization
128
+
129
+ 4. **Diversity**
130
+ - Various angles and poses
131
+ - Different expressions
132
+ - Multiple backgrounds
133
+ - Different lighting conditions
134
+
135
+ ### Our Pepe Dataset
136
+
137
+ We used **[iresidentevil/pepe_the_frog](https://huggingface.co/datasets/iresidentevil/pepe_the_frog)** which contains:
138
+ - ~200 high-quality Pepe images
139
+ - Consistent 512x512 resolution
140
+ - Varied expressions and styles
141
+ - Pre-captioned with trigger word
142
+
143
+ ---
144
+
145
+ ## ⚙️ Training Configuration
146
+
147
+ ### Training Hyperparameters
148
+
149
+ Here's the exact configuration we used for the Pepe model:
150
+
151
+ ```bash
152
+ accelerate launch train_text_to_image_lora.py \
153
+ --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
154
+ --train_data_dir="/path/to/pepe-data" \
155
+ --caption_column="prompt" \
156
+ --image_column="image" \
157
+ --resolution=512 \
158
+ --center_crop \
159
+ --random_flip \
160
+ --train_batch_size=1 \
161
+ --gradient_accumulation_steps=4 \
162
+ --max_train_steps=2000 \
163
+ --learning_rate=1e-4 \
164
+ --lr_scheduler="cosine" \
165
+ --lr_warmup_steps=0 \
166
+ --output_dir="./output" \
167
+ --rank=16 \
168
+ --validation_prompt="pepe_style_frog, a high-quality, detailed image of pepe the frog smiling and holding a cup of coffee at sunrise" \
169
+ --validation_epochs=5 \
170
+ --seed=42 \
171
+ --mixed_precision="fp16" \
172
+ --checkpointing_steps=150
173
+ ```
174
+
175
+ ### Parameter Explanation
176
+
177
+ | Parameter | Value | Description |
178
+ |-----------|-------|-------------|
179
+ | `pretrained_model_name_or_path` | `runwayml/stable-diffusion-v1-5` | Base model to fine-tune |
180
+ | `train_data_dir` | `/path/to/data` | Path to your dataset |
181
+ | `resolution` | `512` | Image resolution (512x512) |
182
+ | `train_batch_size` | `1` | Batch size per GPU |
183
+ | `gradient_accumulation_steps` | `4` | Effective batch size = 1 * 4 = 4 |
184
+ | `max_train_steps` | `2000` | Total training steps |
185
+ | `learning_rate` | `1e-4` | Initial learning rate |
186
+ | `lr_scheduler` | `cosine` | Learning rate schedule |
187
+ | `rank` | `16` | LoRA rank (higher = more parameters) |
188
+ | `mixed_precision` | `fp16` | Use 16-bit precision for speed |
189
+ | `checkpointing_steps` | `150` | Save checkpoint every N steps |
190
+
191
+ ### Hyperparameter Tuning Tips
192
+
193
+ **Learning Rate**:
194
+ - Too high: Training unstable, poor quality
195
+ - Too low: Slow convergence, underfitting
196
+ - Recommended: `1e-4` to `1e-5`
197
+
198
+ **LoRA Rank**:
199
+ - Lower (4-8): Faster training, smaller files, less expressive
200
+ - Medium (16-32): Balanced (recommended)
201
+ - Higher (64-128): More expressive, larger files, risk of overfitting
202
+
203
+ **Training Steps**:
204
+ - Small dataset (20-50 images): 500-1000 steps
205
+ - Medium dataset (50-200 images): 1000-2000 steps
206
+ - Large dataset (200+ images): 2000-5000 steps
207
+
208
+ **Batch Size**:
209
+ - Depends on VRAM availability
210
+ - Effective batch size = `batch_size × gradient_accumulation_steps`
211
+ - Recommended effective batch size: 4-8
212
+
213
+ ---
214
+
215
+ ## 🚀 Running the Training
216
+
217
+ ### Option 1: Google Colab (Recommended for Beginners)
218
+
219
+ 1. **Open the Notebook**:
220
+ - Use our provided notebook: `diffusion_model_finetuning.ipynb`
221
+ - Or create new Colab notebook
222
+
223
+ 2. **Setup GPU**:
224
+ ```
225
+ Runtime → Change runtime type → GPU (T4)
226
+ ```
227
+
228
+ 3. **Mount Google Drive** (optional):
229
+ ```python
230
+ from google.colab import drive
231
+ drive.mount('/content/drive')
232
+ ```
233
+
234
+ 4. **Install Dependencies**:
235
+ ```python
236
+ !pip install -q diffusers transformers accelerate peft
237
+ ```
238
+
239
+ 5. **Upload Dataset**:
240
+ - Upload to Google Drive
241
+ - Or download from Hugging Face
242
+
243
+ 6. **Run Training**:
244
+ ```python
245
+ !accelerate launch train_text_to_image_lora.py \
246
+ --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
247
+ --train_data_dir="/content/drive/MyDrive/pepe-data" \
248
+ --max_train_steps=2000 \
249
+ --learning_rate=1e-4 \
250
+ --output_dir="./output"
251
+ ```
252
+
253
+ 7. **Monitor Progress**:
254
+ - Watch loss decrease
255
+ - Check validation images
256
+ - Save checkpoints to Drive
257
+
258
+
259
+ ### Validate the Results
+
+ ```python
+ # Generate a test image with the trained LoRA loaded
+ # (see "Test Locally" under Model Upload for how `pipe` is set up)
+ image = pipe("pepe_style_frog, wizard casting spells").images[0]
+ image.save("validation.png")
+ ```
263
+
264
+
265
+ ## 📤 Model Upload
266
+
267
+ ### Prepare for Upload
268
+
269
+ 1. **Test Locally**:
270
+ ```python
271
+ from diffusers import StableDiffusionPipeline
272
+
273
+ pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
274
+ pipe.load_lora_weights("./output")
275
+
276
+ # Test
277
+ image = pipe("pepe_style_frog, happy pepe").images[0]
278
+ image.save("test.png")
279
+ ```
280
+
281
+ 2. **Prepare Files**:
282
+ ```
283
+ output/
284
+ ├── pytorch_lora_weights.safetensors # Main file
285
+ ├── README.md # Model card
286
+ └── sample_images/ # Example outputs
287
+ ```
288
+
289
+ ### Upload to Hugging Face
290
+
291
+ 1. **Install Hub CLI**:
292
+ ```bash
293
+ pip install huggingface_hub
294
+ huggingface-cli login
295
+ ```
296
+
297
+ 2. **Create Model Card** (`README.md`):
298
+ ````markdown
299
+ ---
300
+ license: creativeml-openrail-m
301
+ base_model: runwayml/stable-diffusion-v1-5
302
+ tags:
303
+ - stable-diffusion
304
+ - lora
305
+ - text-to-image
306
+ ---
307
+
308
+ # Pepe LoRA Model
309
+
310
+ Fine-tuned LoRA for generating Pepe the Frog images.
311
+
312
+ ## Usage
313
+ ```python
314
+ from diffusers import StableDiffusionPipeline
315
+
316
+ pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
317
+ pipe.load_lora_weights("YOUR_USERNAME/your-model-name")
318
+
319
+ image = pipe("pepe_style_frog, happy pepe").images[0]
320
+ ```
321
+ ````
322
+
323
+ 3. **Upload**:
324
+ ```python
325
+ from huggingface_hub import HfApi
326
+
327
+ api = HfApi()
328
+ api.create_repo("YOUR_USERNAME/pepe-lora", repo_type="model")
329
+ api.upload_folder(
330
+ folder_path="./output",
331
+ repo_id="YOUR_USERNAME/pepe-lora",
332
+ repo_type="model"
333
+ )
334
+ ```
335
+
336
+
337
+ ### Common Issues
338
+
339
+ **Out of Memory**:
340
+ - Reduce `train_batch_size` to 1
341
+ - Enable `--gradient_checkpointing`
342
+ - Use `--mixed_precision="fp16"`
343
+ - Reduce image resolution
src/app.py CHANGED
@@ -1,4 +1,33 @@
1
- """Pepe the Frog Meme Generator - Main Application"""
2
 
3
  import streamlit as st
4
  from PIL import Image
@@ -36,7 +65,16 @@ st.markdown("""
36
 
37
 
38
  def init_session_state():
39
- """Initialize session state"""
40
  if 'generated_images' not in st.session_state:
41
  st.session_state.generated_images = []
42
  if 'generation_count' not in st.session_state:
@@ -47,7 +85,28 @@ def init_session_state():
47
 
48
  @st.cache_resource
49
  def load_generator(model_name: str = "Pepe Fine-tuned (LoRA)"):
50
- """Load and cache the generator based on selected model"""
51
  config = ModelConfig()
52
  model_config = config.AVAILABLE_MODELS[model_name]
53
 
@@ -71,7 +130,15 @@ def load_generator(model_name: str = "Pepe Fine-tuned (LoRA)"):
71
 
72
 
73
  def get_example_prompts():
74
- """Return example prompts"""
75
  return [
76
  "pepe the frog as a wizard casting spells",
77
  "pepe the frog coding on a laptop",
@@ -82,7 +149,29 @@ def get_example_prompts():
82
 
83
 
84
  def main():
85
- """Main application"""
86
  init_session_state()
87
 
88
  # Sidebar (needs to be first to define selected_model)
@@ -199,7 +288,8 @@ def main():
199
  if st.session_state.generated_images:
200
  placeholder.image(
201
  st.session_state.generated_images[-1],
202
- width='stretch'
203
  )
204
  else:
205
  placeholder.info("Your meme will appear here...")
 
1
+ """Pepe the Frog Meme Generator - Main Streamlit Application.
2
+
3
+ This is the main entry point for the web application. It provides a user-friendly
4
+ interface for generating Pepe memes using AI-powered Stable Diffusion models.
5
+
6
+ The application features:
7
+ - Model selection (multiple LoRA variants, LCM support)
8
+ - Style presets and raw prompt mode
9
+ - Advanced generation settings (steps, guidance, seed)
10
+ - Text overlay capability for meme creation
11
+ - Gallery system for viewing generated images
12
+ - Download functionality
13
+ - Progress tracking during generation
14
+
15
+ Application Structure:
16
+ 1. Page configuration and styling
17
+ 2. Session state initialization
18
+ 3. Model loading and caching
19
+ 4. Sidebar UI (model selection, settings)
20
+ 5. Main content area (prompt input, generation)
21
+ 6. Results display and download
22
+ 7. Gallery view
23
+
24
+ Usage:
25
+ Run with: streamlit run src/app.py
26
+ Access at: http://localhost:8501
27
+
28
+ Author: MJaheen
29
+ License: MIT
30
+ """
31
 
32
  import streamlit as st
33
  from PIL import Image
 
65
 
66
 
67
  def init_session_state():
68
+ """
69
+ Initialize Streamlit session state variables.
70
+
71
+ This function sets up persistent state across app reruns:
72
+ - generated_images: List of all generated images in current session
73
+ - generation_count: Counter for tracking total generations
74
+ - current_model: Currently selected model name for cache invalidation
75
+
76
+ Session state persists across reruns but is reset when the page is refreshed.
77
+ """
78
  if 'generated_images' not in st.session_state:
79
  st.session_state.generated_images = []
80
  if 'generation_count' not in st.session_state:
 
85
 
86
  @st.cache_resource
87
  def load_generator(model_name: str = "Pepe Fine-tuned (LoRA)"):
88
+ """
89
+ Load and cache the Stable Diffusion generator.
90
+
91
+ This function loads a PepeGenerator instance configured with the selected
92
+ model. It's cached using @st.cache_resource to avoid reloading the model
93
+ on every interaction, which would be very slow.
94
+
95
+ The cache is automatically invalidated when:
96
+ - The model_name parameter changes
97
+ - The user manually clears cache
98
+
99
+ Args:
100
+ model_name: Name of the model from AVAILABLE_MODELS dict.
101
+ Examples: "Pepe Fine-tuned (LoRA)", "Pepe + LCM (FAST)"
102
+
103
+ Returns:
104
+ PepeGenerator: Configured generator instance with loaded model.
105
+
106
+ Note:
107
+ Model loading can take 30-60 seconds on first load as it downloads
108
+ weights from Hugging Face (~4GB for base model + LoRA).
109
+ """
110
  config = ModelConfig()
111
  model_config = config.AVAILABLE_MODELS[model_name]
112
 
 
130
 
131
 
132
  def get_example_prompts():
133
+ """
134
+ Return a list of example prompts for inspiration.
135
+
136
+ These prompts are designed to work well with the fine-tuned Pepe model
137
+ and demonstrate various styles, activities, and scenarios.
138
+
139
+ Returns:
140
+ list: List of example prompt strings with trigger word and descriptions.
141
+ """
142
  return [
143
  "pepe the frog as a wizard casting spells",
144
  "pepe the frog coding on a laptop",
 
149
 
150
 
151
  def main():
152
+ """
153
+ Main application function that builds and runs the Streamlit UI.
154
+
155
+ This function orchestrates the entire application flow:
156
+ 1. Initializes session state
157
+ 2. Loads configuration and sets up sidebar controls
158
+ 3. Handles model selection and switching
159
+ 4. Processes user input (prompts, settings)
160
+ 5. Generates images when requested
161
+ 6. Displays results with download options
162
+ 7. Shows gallery of previous generations
163
+
164
+ The UI is organized into:
165
+ - Sidebar: Model selection, style presets, advanced settings
166
+ - Main area: Prompt input, generation button, results
167
+ - Bottom: Gallery view (expandable)
168
+
169
+ Flow:
170
+ User selects model → Enters prompt → Adjusts settings →
171
+ Clicks generate → Shows progress → Displays result →
172
+ Offers download → Adds to gallery
173
+ """
174
+ # Initialize session state for persistent data across reruns
175
  init_session_state()
176
 
177
  # Sidebar (needs to be first to define selected_model)
 
288
  if st.session_state.generated_images:
289
  placeholder.image(
290
  st.session_state.generated_images[-1],
291
+ use_column_width=True
293
  )
294
  else:
295
  placeholder.info("Your meme will appear here...")
src/model/config.py CHANGED
@@ -1,4 +1,10 @@
1
- """Configuration management for the meme generator"""
2
 
3
  from dataclasses import dataclass
4
  from typing import Optional
@@ -6,14 +12,45 @@ from typing import Optional
6
 
7
  @dataclass
8
  class ModelConfig:
9
- """Model configuration parameters"""
10
 
11
  # Available models
12
  AVAILABLE_MODELS: dict = None
13
 
14
  def __post_init__(self):
 
15
  if self.AVAILABLE_MODELS is None:
16
  self.AVAILABLE_MODELS = {
17
  "Pepe Fine-tuned (LoRA)": {
18
  "base": "runwayml/stable-diffusion-v1-5",
19
  "lora": "MJaheen/Pepe_The_Frog_model_v1_lora",
@@ -94,7 +131,7 @@ class ModelConfig:
94
  # Performance
95
  ENABLE_ATTENTION_SLICING: bool = True
96
  ENABLE_VAE_SLICING: bool = True
97
- FORCE_CPU: bool = True # Set to True to force CPU, False to use GPU if available
98
 
99
  # Available styles
100
  AVAILABLE_STYLES: tuple = (
 
1
+ """Configuration management for the Pepe meme generator.
2
+
3
+ This module defines all configuration parameters for model selection,
4
+ generation settings, and application behavior. The ModelConfig dataclass
5
+ provides a centralized configuration system with sensible defaults.
6
+
7
+ """
8
 
9
  from dataclasses import dataclass
10
  from typing import Optional
 
12
 
13
  @dataclass
14
  class ModelConfig:
15
+ """
16
+ Central configuration for model and generation parameters.
17
+
18
+ This dataclass contains all settings for model selection, generation
19
+ parameters, and optimization flags. It supports multiple models including
20
+ fine-tuned LoRA variants and fast LCM models.
21
+
22
+ Attributes:
23
+ AVAILABLE_MODELS: Dictionary of available model configurations
24
+ SELECTED_MODEL: Currently selected model name
25
+ BASE_MODEL: HuggingFace ID of the base Stable Diffusion model
26
+ LORA_PATH: Path or HuggingFace ID of LoRA weights
27
+ USE_LORA: Whether to load and use LoRA weights
28
+ USE_LCM: Whether to use LCM (Latent Consistency Model) for fast inference
29
+ LCM_LORA_PATH: Path to LCM-LoRA weights
30
+ TRIGGER_WORD: Trigger word to activate fine-tuned style
31
+ DEFAULT_STEPS: Default number of diffusion steps
32
+ DEFAULT_GUIDANCE: Default guidance scale (CFG)
33
+ DEFAULT_WIDTH: Default output image width
34
+ DEFAULT_HEIGHT: Default output image height
35
+ DEFAULT_NEGATIVE_PROMPT: Default negative prompt for all generations
36
+ FORCE_CPU: Force CPU mode (disable GPU)
37
+ ENABLE_XFORMERS: Enable memory-efficient attention
38
+ """
39
 
40
  # Available models
41
  AVAILABLE_MODELS: dict = None
42
 
43
  def __post_init__(self):
44
+ """
45
+ Initialize AVAILABLE_MODELS dictionary if not already set.
46
+
47
+ This method is called automatically after __init__. It populates
48
+ the AVAILABLE_MODELS dictionary with all supported model configurations.
49
+ Each model can have different base models, LoRA weights, and optimization flags.
50
+ """
51
  if self.AVAILABLE_MODELS is None:
52
  self.AVAILABLE_MODELS = {
53
+ # Primary fine-tuned model - Best quality, trained on Pepe dataset
54
  "Pepe Fine-tuned (LoRA)": {
55
  "base": "runwayml/stable-diffusion-v1-5",
56
  "lora": "MJaheen/Pepe_The_Frog_model_v1_lora",
 
131
  # Performance
132
  ENABLE_ATTENTION_SLICING: bool = True
133
  ENABLE_VAE_SLICING: bool = True
134
+ FORCE_CPU: bool = False # Set to True to force CPU, False to use GPU if available
135
 
136
  # Available styles
137
  AVAILABLE_STYLES: tuple = (
src/model/generator.py CHANGED
@@ -1,4 +1,16 @@
1
- """Pepe Meme Generator - Core generation logic"""
2
 
3
  from typing import Optional, List, Callable
4
  import torch
@@ -14,10 +26,43 @@ logger = logging.getLogger(__name__)
14
 
15
 
16
  class PepeGenerator:
17
- """Main generator class for creating Pepe memes"""
18
 
19
  def __init__(self, config: Optional[ModelConfig] = None):
20
- """Initialize the generator"""
21
  self.config = config or ModelConfig()
22
  self.device = self._get_device(self.config.FORCE_CPU)
23
  self.pipe = self._load_model(
@@ -153,28 +198,38 @@ class PepeGenerator:
153
  def generate(
154
  self,
155
  prompt: str,
156
- style: str = "default",
157
  negative_prompt: Optional[str] = None,
158
- num_inference_steps: int = 50,
159
  guidance_scale: float = 7.5,
160
- seed: Optional[int] = None,
161
  width: int = 512,
162
  height: int = 512,
163
- callback: Optional[Callable[[int, int], None]] = None,
164
- raw_prompt: bool = False,
165
- ) -> Image.Image:
166
- """Generate a single Pepe meme image
167
 
168
  Args:
169
- callback: Optional callback function (current_step, total_steps)
170
- raw_prompt: If True, use prompt as-is without modifications
171
  """
172
-
173
- # Apply style preset or use raw prompt
174
- if raw_prompt:
175
- enhanced_prompt = prompt
176
- else:
177
- enhanced_prompt = self._apply_style_preset(prompt, style)
178
 
179
  # Set default negative prompt
180
  if negative_prompt is None:
@@ -189,11 +244,11 @@ class PepeGenerator:
189
  logger.debug(f"Full prompt: {enhanced_prompt}")
190
  logger.debug(f"Model config - Base: {self.config.BASE_MODEL}, LoRA: {self.config.USE_LORA}")
191
 
192
- # Create callback wrapper if provided (using new API)
193
  callback_on_step_end_fn = None
194
- if callback:
195
  def callback_on_step_end_fn(pipe, step, timestep, callback_kwargs):
196
- callback(step + 1, num_inference_steps)
197
  return callback_kwargs
198
 
199
  # Generate image (removed autocast for CPU compatibility)
 
1
+ """Pepe Meme Generator - Core generation logic.
2
+
3
+ This module contains the main PepeGenerator class which handles:
4
+ - Loading and caching Stable Diffusion models
5
+ - Managing LoRA and LCM-LoRA adapters
6
+ - Configuring schedulers and optimizations
7
+ - Generating images from text prompts
8
+ - Progress tracking during generation
9
+
10
+ The generator supports multiple models, automatic GPU/CPU detection,
11
+ memory optimizations, and both standard and fast (LCM) inference modes.
12
+
13
+ """
14
 
15
  from typing import Optional, List, Callable
16
  import torch
 
26
 
27
 
28
  class PepeGenerator:
29
+ """
30
+ Main generator class for creating Pepe meme images.
31
+
32
+ This class manages the entire image generation pipeline including:
33
+ - Model loading and caching (with Streamlit cache_resource)
34
+ - LoRA and LCM-LoRA adapter management
35
+ - Scheduler configuration (DPM Solver or LCM)
36
+ - Memory optimizations (attention slicing, VAE slicing, xformers)
37
+ - Device management (automatic CUDA/CPU detection)
38
+ - Progress tracking callbacks
39
+
40
+ The generator is designed to work efficiently on both GPU and CPU,
41
+ with automatic optimizations based on available hardware.
42
+
43
+ Attributes:
44
+ config: ModelConfig instance with generation settings
45
+ device: Torch device ('cuda' or 'cpu')
46
+ pipe: Cached StableDiffusionPipeline instance
47
+ """
48
 
49
  def __init__(self, config: Optional[ModelConfig] = None):
50
+ """
51
+ Initialize the Pepe generator with configuration.
52
+
53
+ Sets up the generator by determining the compute device (GPU/CPU),
54
+ loading the model pipeline, and caching it for reuse. The model
55
+ loading is cached using Streamlit's cache_resource decorator to avoid
56
+ reloading on every interaction.
57
+
58
+ Args:
59
+ config: ModelConfig instance. If None, uses default configuration.
60
+
61
+ Example:
62
+ >>> config = ModelConfig()
63
+ >>> config.USE_LCM = True # Enable fast generation
64
+ >>> generator = PepeGenerator(config)
65
+ """
66
  self.config = config or ModelConfig()
67
  self.device = self._get_device(self.config.FORCE_CPU)
68
  self.pipe = self._load_model(
 
198
  def generate(
199
  self,
200
  prompt: str,
 
201
  negative_prompt: Optional[str] = None,
202
+ num_inference_steps: int = 25,
203
  guidance_scale: float = 7.5,
 
204
  width: int = 512,
205
  height: int = 512,
206
+ seed: Optional[int] = None,
207
+ progress_callback: Optional[Callable[[int, int], None]] = None
208
+ ) -> Image.Image:
209
+ """
210
+ Generate a Pepe meme image from a text prompt.
211
+
212
+ This method runs the diffusion process to generate an image based on
213
+ the provided text prompt. It supports various parameters to control
214
+ the generation quality, style, and randomness.
215
 
216
  Args:
217
+ prompt: Text description of the desired image. For best results with
218
+ the fine-tuned model, include the trigger word 'pepe_style_frog'.
219
+ negative_prompt: Text describing what to avoid in the image.
220
+ If None, uses default from config.
221
+ num_inference_steps: Number of denoising steps (4-8 for LCM, 20-50 normal).
222
+ guidance_scale: CFG scale (1.0-2.0 for LCM, 5.0-15.0 normal).
223
+ width: Output image width in pixels (must be divisible by 8).
224
+ height: Output image height in pixels (must be divisible by 8).
225
+ seed: Random seed for reproducible generation.
226
+ progress_callback: Optional callback(current_step, total_steps).
227
+
228
+ Returns:
229
+ PIL Image object containing the generated image.
230
  """
231
+ # Use the prompt as-is (style handling is done in app.py before calling generate)
232
+ enhanced_prompt = prompt
233
 
234
  # Set default negative prompt
235
  if negative_prompt is None:
 
244
  logger.debug(f"Full prompt: {enhanced_prompt}")
245
  logger.debug(f"Model config - Base: {self.config.BASE_MODEL}, LoRA: {self.config.USE_LORA}")
246
 
247
+ # Create callback wrapper if provided (using new diffusers API)
248
  callback_on_step_end_fn = None
249
+ if progress_callback:
250
  def callback_on_step_end_fn(pipe, step, timestep, callback_kwargs):
251
+ progress_callback(step + 1, num_inference_steps)
252
  return callback_kwargs
253
 
254
  # Generate image (removed autocast for CPU compatibility)
src/utils/image_processor.py CHANGED
@@ -1,4 +1,16 @@
1
- """Image processing utilities"""
2
 
3
  from PIL import Image, ImageDraw, ImageFont, ImageEnhance
4
  from typing import Optional, Tuple
@@ -8,7 +20,18 @@ logger = logging.getLogger(__name__)
8
 
9
 
10
  class ImageProcessor:
11
- """Handles image post-processing"""
12
 
13
  @staticmethod
14
  def add_meme_text(
@@ -18,7 +41,27 @@ class ImageProcessor:
18
  font_size: int = 40,
19
  font_path: Optional[str] = None,
20
  ) -> Image.Image:
21
- """Add classic meme text to image"""
22
 
23
  img = image.copy()
24
  draw = ImageDraw.Draw(img)
@@ -159,7 +202,34 @@ class ImageProcessor:
159
  sharpness: float = 1.2,
160
  contrast: float = 1.1,
161
  ) -> Image.Image:
162
- """Apply enhancement filters"""
163
 
164
  # Sharpen
165
  enhancer = ImageEnhance.Sharpness(image)
 
1
+ """Image Processing Utilities for Meme Creation.
2
+
3
+ This module provides utilities for post-processing generated images:
4
+ - Adding classic meme text with outlines
5
+ - Adding signatures/watermarks
6
+ - Enhancing image quality (sharpness, contrast)
7
+
8
+ All methods are static and can be used without instantiation.
9
+ The ImageProcessor class acts as a namespace for image manipulation functions.
10
+
11
+ Author: MJaheen
12
+ License: MIT
13
+ """
14
 
15
  from PIL import Image, ImageDraw, ImageFont, ImageEnhance
16
  from typing import Optional, Tuple
 
20
 
21
 
22
  class ImageProcessor:
23
+ """
24
+ Static utility class for image post-processing operations.
25
+
26
+ This class provides methods for enhancing generated images with meme text,
27
+ signatures, and quality improvements. All methods are static and work with
28
+ PIL Image objects.
29
+
30
+ Methods:
31
+ add_meme_text: Add top/bottom text in classic meme style
32
+ add_signature: Add watermark/signature to image
33
+ enhance_image: Apply sharpness and contrast enhancements
34
+ """
35
 
36
  @staticmethod
37
  def add_meme_text(
 
41
  font_size: int = 40,
42
  font_path: Optional[str] = None,
43
  ) -> Image.Image:
44
+ """
45
+ Add classic Impact-font meme text with white text and black outline.
46
+
47
+ Creates the traditional meme format with text at the top and/or bottom
48
+ of the image. Text is automatically converted to uppercase and rendered
49
+ with a thick black outline for readability on any background.
50
+
51
+ Args:
52
+ image: Input PIL Image to add text to
53
+ top_text: Text to display at top of image (default: "")
54
+ bottom_text: Text to display at bottom of image (default: "")
55
+ font_size: Size of the font in points (default: 40)
56
+ font_path: Optional path to custom font file (default: uses Impact)
57
+
58
+ Returns:
59
+ PIL Image with meme text overlay (copy of original, not modified in-place)
60
+
61
+ Note:
62
+ Falls back to default font if Impact font is not found.
63
+ Text is centered horizontally automatically.
64
+ """
65
 
66
  img = image.copy()
67
  draw = ImageDraw.Draw(img)
 
202
  sharpness: float = 1.2,
203
  contrast: float = 1.1,
204
  ) -> Image.Image:
205
+ """
206
+ Apply sharpness and contrast enhancements to improve image quality.
207
+
208
+ This method applies PIL's ImageEnhance filters to make the image
209
+ crisper and more vibrant. Useful for post-processing AI-generated
210
+ images which can sometimes appear slightly soft.
211
+
212
+ Args:
213
+ image: Input PIL Image to enhance
214
+ sharpness: Sharpness multiplier (default: 1.2)
215
+ - 0.0: Blurred
216
+ - 1.0: Original sharpness
217
+ - 2.0: Very sharp
218
+ Recommended range: 1.0-1.5
219
+ contrast: Contrast multiplier (default: 1.1)
220
+ - 0.0: Gray
221
+ - 1.0: Original contrast
222
+ - 2.0: High contrast
223
+ Recommended range: 1.0-1.3
224
+
225
+ Returns:
226
+ Enhanced PIL Image (a new image; the input is not modified in place)
227
+
228
+ Example:
229
+ >>> image = Image.open("soft_image.png")
230
+ >>> enhanced = ImageProcessor.enhance_image(image, sharpness=1.3, contrast=1.2)
231
+ >>> enhanced.save("sharp_image.png")
232
+ """
233
 
234
  # Sharpen
235
  enhancer = ImageEnhance.Sharpness(image)