MNIST 28×28 Text→Image (FiLM + CFG)

This repository provides a minimal diffusion model for generating MNIST digits (0–9) from text prompts. It uses a tiny UNet with FiLM conditioning and Classifier-Free Guidance (CFG).
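As a rough illustration of the two ideas named above, the sketch below shows FiLM as a per-channel scale and shift driven by a condition vector, and CFG as a blend of conditional and unconditional noise predictions. It is not the code in pipeline_mnist.py: the function names `film` and `cfg_combine` and the projection matrices `w_gamma` and `w_beta` are hypothetical, and the shapes (32 channels, cond dim 64) are borrowed from the training details for flavor only.

```python
import numpy as np

def film(h, cond, w_gamma, w_beta):
    """FiLM: scale and shift each channel of h using the condition vector.

    h:       feature map, shape (batch, channels, height, width)
    cond:    condition embedding, shape (batch, cond_dim)
    w_gamma: hypothetical projection, shape (cond_dim, channels)
    w_beta:  hypothetical projection, shape (cond_dim, channels)
    """
    gamma = cond @ w_gamma  # (batch, channels) per-channel scale
    beta = cond @ w_beta    # (batch, channels) per-channel shift
    # Broadcast scale and shift over the two spatial dimensions.
    return gamma[:, :, None, None] * h + beta[:, :, None, None]

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional output and toward the conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy inputs with illustrative shapes (assumed, not read from the checkpoint).
rng = np.random.default_rng(0)
h = rng.normal(size=(2, 32, 7, 7))
cond = rng.normal(size=(2, 64))
w_gamma = rng.normal(size=(64, 32)) * 0.01
w_beta = rng.normal(size=(64, 32)) * 0.01
out = film(h, cond, w_gamma, w_beta)

eps_u = rng.normal(size=(2, 1, 28, 28))
eps_c = rng.normal(size=(2, 1, 28, 28))
guided = cfg_combine(eps_u, eps_c, guidance_scale=2.5)
```

With `guidance_scale=1.0` the blend reduces to the conditional prediction, and with `0.0` to the unconditional one; values above 1 extrapolate past the conditional prediction.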

Quickstart (dynamic import)

from huggingface_hub import snapshot_download
import importlib.util, sys, os

local_dir = snapshot_download("starkdv123/mnist-28px-text2img")
pipe_path = os.path.join(local_dir, "pipeline_mnist.py")
spec = importlib.util.spec_from_file_location("pipeline_mnist", pipe_path)
mod = importlib.util.module_from_spec(spec)
sys.modules[spec.name] = mod
spec.loader.exec_module(mod)

Pipe = getattr(mod, "MNISTTextToImagePipeline")
pipe = Pipe.from_pretrained(local_dir)
img = pipe("seven", num_inference_steps=120, guidance_scale=2.5).images[0]
img.save("seven.png")
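To sample every digit in one pass, a small helper along these lines could wrap the call shown above. It relies only on the call signature used in the Quickstart; the helper name `generate_digits` is hypothetical, and it is an assumption that the pipeline accepts each of the ten digit words as a prompt (only "seven" is shown here).

```python
# Assumed prompt vocabulary; only "seven" appears in the Quickstart.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

def generate_digits(pipe, prompts=DIGIT_WORDS,
                    num_inference_steps=120, guidance_scale=2.5):
    """Call the pipeline once per prompt and collect the first image of each."""
    images = {}
    for prompt in prompts:
        result = pipe(prompt,
                      num_inference_steps=num_inference_steps,
                      guidance_scale=guidance_scale)
        images[prompt] = result.images[0]
    return images

# Usage, with `pipe` loaded as in the Quickstart:
# for word, img in generate_digits(pipe).items():
#     img.save(f"{word}.png")
```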

Alternative (sys.path)

from huggingface_hub import snapshot_download
import sys

local_dir = snapshot_download("starkdv123/mnist-28px-text2img")
sys.path.append(local_dir)
from pipeline_mnist import MNISTTextToImagePipeline as Pipe
pipe = Pipe.from_pretrained(local_dir)

Alternative (Diffusers custom_pipeline)

from diffusers import DiffusionPipeline
pipe = DiffusionPipeline.from_pretrained("starkdv123/mnist-28px-text2img", custom_pipeline="pipeline_mnist")
# Note: requires diffusers>=0.30 and may still be sensitive to custom components.

Training details

  • Dataset: MNIST train split
  • Image size: 28×28 (grayscale)
  • Normalizer: (x - 0.5) / 0.5 → images scaled to [-1,1]; network predicts the added noise
  • Timesteps: T=500, betas 1e-05→0.02
  • UNet base channels: 32; cond dim: 64; time dim: 128
  • CFG: condition dropped with p=0.1 during training, null class id=10
  • Optimizer: AdamW, lr=0.001, batch=128, epochs=10
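The hyperparameters listed above can be made concrete with a short sketch of the noise schedule, the input normalizer, and the condition dropout used for CFG training. This is a sketch under stated assumptions rather than the repository's training code: it assumes the betas are linearly spaced between the two endpoints (the list gives only the endpoints), it uses the standard DDPM forward process, and all function names are hypothetical.

```python
import math
import random

T = 500                            # number of diffusion timesteps
BETA_START, BETA_END = 1e-5, 0.02  # schedule endpoints from the list above
NULL_CLASS_ID = 10                 # class id standing in for "no condition"
DROP_PROB = 0.1                    # probability of dropping the condition

def linear_betas(num_steps=T, start=BETA_START, end=BETA_END):
    """Linearly spaced betas (assumed shape; only the endpoints are given)."""
    step = (end - start) / (num_steps - 1)
    return [start + i * step for i in range(num_steps)]

def alpha_bars(betas):
    """Cumulative product of (1 - beta_t), as in the standard DDPM forward process."""
    out, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        out.append(running)
    return out

def normalize(x):
    """Map a pixel value in [0, 1] to [-1, 1]."""
    return (x - 0.5) / 0.5

def add_noise(x0, eps, alpha_bar_t):
    """Noised value x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    return math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps

def maybe_drop_label(label, rng, p=DROP_PROB):
    """With probability p, replace the label by the null class."""
    return NULL_CLASS_ID if rng.random() < p else label

betas = linear_betas()
abars = alpha_bars(betas)
rng = random.Random(0)
labels = [maybe_drop_label(7, rng) for _ in range(1000)]
```

Training on a mix of real labels and the null class is what lets a single network produce both the conditional and unconditional predictions that CFG combines at sampling time.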

Files included

  • model.safetensors — trained UNet weights
  • config.json — model hyperparameters
  • scheduler_config.json — noise schedule
  • pipeline_mnist.py — custom pipeline definition
  • model_index.json — metadata for Diffusers
  • samples/grid_0_9.png — final 0–9 grid
  • samples/grid_e*.png — per‑epoch grids (training progress)

Training progress (epoch grids)

[Images: sample grids for epochs 1 through 10, stored as samples/grid_e*.png]
