---
base_model:
- black-forest-labs/FLUX.1-Fill-dev
pipeline_tag: text-to-image
library_name: transformers
tags:
- art
---
# Calligrapher: Freestyle Text Image Customization
## Overview
**Calligrapher** is a novel diffusion-based framework that integrates advanced text customization with artistic typography for digital calligraphy and design applications. Our framework supports text customization under various settings, including self-reference, cross-reference, and non-text reference customization.
## Key Features
- **Freestyle Text Customization**: Generate customized text guided by diverse stylized reference images and text prompts
- **Various Reference Modes**: Support for self-reference, cross-reference, and non-text reference customization
- **High-Quality Results**: Photorealistic text image customization with consistent typography
- **Multi-Language Support**: Style-centric text customization across diverse languages (see this issue)
## Repository Contents
This Hugging Face repository contains:
- **`calligrapher.bin`**: Pre-trained Calligrapher model weights.
- **`Calligrapher_bench_testing.zip`**: Test dataset covering both self-reference and cross-reference customization scenarios, plus additional reference images for testing. A small portion of samples is omitted due to IP concerns.
## Quick Start
### Installation
We provide two ways to set up the environment (requiring Python 3.10 + PyTorch 2.5.0 + CUDA):
#### Using pip
```bash
# Clone the repository
git clone https://github.com/Calligrapher2025/Calligrapher.git
cd Calligrapher
# Install dependencies
pip install -r requirements.txt
```
#### Using Conda
```bash
# Clone the repository
git clone https://github.com/Calligrapher2025/Calligrapher.git
cd Calligrapher
# Create and activate conda environment
conda env create -f env.yml
conda activate calligrapher
```
### Download Models & Testing Data
```python
from huggingface_hub import snapshot_download
# Download Calligrapher model and test data
snapshot_download("Calligrapher2025/Calligrapher")
# Download required base models (FLUX.1-Fill-dev is gated; request access and pass your token)
snapshot_download("black-forest-labs/FLUX.1-Fill-dev", token="your_token")
snapshot_download("google/siglip-so400m-patch14-384")
```
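If you prefer the downloads to land in predictable local directories that can be pasted straight into `path_dict.json` below, `snapshot_download` also accepts a `local_dir` argument. The directory names in this sketch are only examples:
```python
from huggingface_hub import snapshot_download

# Example local target directories - adjust to your own layout.
snapshot_download("Calligrapher2025/Calligrapher", local_dir="./weights/Calligrapher")
snapshot_download("black-forest-labs/FLUX.1-Fill-dev",
                  local_dir="./weights/FLUX.1-Fill-dev", token="your_token")
snapshot_download("google/siglip-so400m-patch14-384",
                  local_dir="./weights/siglip-so400m-patch14-384")
```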
### Configuration
Before running the models, you need to configure the paths in `path_dict.json`:
```json
{
"data_dir": "path/to/Calligrapher_bench_testing",
"cli_save_dir": "path/to/cli_results",
"gradio_save_dir": "path/to/gradio_results",
"gradio_temp_dir": "path/to/gradio_tmp",
"base_model_path": "path/to/FLUX.1-Fill-dev",
"image_encoder_path": "path/to/siglip-so400m-patch14-384",
"calligrapher_path": "path/to/calligrapher.bin"
}
```
Configuration parameters:
- `data_dir`: Path to store the test dataset
- `cli_save_dir`: Path to save results from command-line interface experiments
- `gradio_save_dir`: Path to save results from Gradio interface experiments
- `gradio_temp_dir`: Path to save Gradio temporary files
- `base_model_path`: Path to the base model FLUX.1-Fill-dev
- `image_encoder_path`: Path to the SigLIP image encoder model
- `calligrapher_path`: Path to the Calligrapher model weights
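After editing the file, a quick sanity check such as the following (a minimal sketch, assuming `path_dict.json` sits in the repository root) can catch path typos before launching the demos:
```python
import json
import os

# Load the path configuration used by the demo and CLI scripts.
with open("path_dict.json", "r") as f:
    path_dict = json.load(f)

# Model weights and test data must already exist on disk.
for key in ("data_dir", "base_model_path", "image_encoder_path", "calligrapher_path"):
    if not os.path.exists(path_dict[key]):
        print(f"[WARN] {key} points to a missing path: {path_dict[key]}")

# Result and temporary directories can simply be created ahead of time.
for key in ("cli_save_dir", "gradio_save_dir", "gradio_temp_dir"):
    os.makedirs(path_dict[key], exist_ok=True)
```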
### Run Gradio Demo
```bash
# Basic Gradio demo
python gradio_demo.py
# Demo with custom mask upload (recommended for first-time users)
# This version includes pre-configured examples - please try them first to understand how to use the model
python gradio_demo_upload_mask.py
```
We also provide a Gradio demo that enables multilingual freestyle text customization (e.g., Chinese), powered by [TextFLUX](https://github.com/yyyyyxie/textflux). To use it, first download the [TextFLUX weights](https://huggingface.co/yyyyyxie/textflux-lora/blob/main/pytorch_lora_weights.safetensors) and set the `textflux_path` entry in `path_dict.json`. Then download [the font resource](https://github.com/yyyyyxie/textflux/blob/main/resource/font/Arial-Unicode-Regular.ttf) to `./resources/` and run:
```bash
python gradio_demo_multilingual.py
```
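The TextFLUX weights and font file mentioned above can also be fetched programmatically. The sketch below is only illustrative: the raw GitHub URL is inferred from the repository link, and the target paths are examples.
```python
import os
import urllib.request

from huggingface_hub import hf_hub_download

# TextFLUX LoRA weights; point "textflux_path" in path_dict.json at the returned file.
lora_path = hf_hub_download(
    repo_id="yyyyyxie/textflux-lora",
    filename="pytorch_lora_weights.safetensors",
)
print("textflux_path:", lora_path)

# Font resource expected under ./resources/.
os.makedirs("resources", exist_ok=True)
font_url = (
    "https://raw.githubusercontent.com/yyyyyxie/textflux/main/"
    "resource/font/Arial-Unicode-Regular.ttf"
)
urllib.request.urlretrieve(font_url, "resources/Arial-Unicode-Regular.ttf")
```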
**User Tips:**
1. **Speed vs. Quality Trade-off.** Use fewer steps (e.g., 10 steps, ~4 s/image on a single A6000 GPU) for faster generation, at some cost in quality.
2. **Inpaint Position Freedom.** Inpainting positions are flexible - they don't need to match the original text locations in the input image.
3. **Iterative Editing.** Drag outputs from the gallery to the Image Editing Panel (clear the Editing Panel first) for quick refinements.
4. **Mask Optimization.** Adjust the mask size/aspect ratio to match your desired content. The model tends to fill the mask and harmonizes the generation with the background in terms of color and lighting.
5. **Reference Image Tip.** White-background references improve style consistency, since the encoder also considers the background context of the reference image (see the sketch after this list).
6. **Resolution Balance.** Very high-resolution generation sometimes triggers spelling errors; 512/768 px is recommended since the model was trained at a resolution of 512.
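For tip 5, a simple way to place a reference crop on a white canvas with Pillow (a minimal sketch; file names and sizes are placeholders):
```python
from PIL import Image

# Paste a reference crop onto a white canvas to reduce background interference.
ref = Image.open("reference_crop.png").convert("RGB")
ref.thumbnail((448, 448))  # keep a white margin around the reference

canvas = Image.new("RGB", (512, 512), "white")
offset = ((canvas.width - ref.width) // 2, (canvas.height - ref.height) // 2)
canvas.paste(ref, offset)
canvas.save("reference_white_bg.png")
```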
## Command Line Usage Examples
### Self-reference Customization
```bash
python infer_calligrapher_self_custom.py
```
### Cross-reference Customization
```bash
python infer_calligrapher_cross_custom.py
```
**Note:** Result image files whose names start with `result` are the customization outputs, while files starting with `vis_result` are concatenated visualizations showing the source image, reference image, and model output together.
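The outputs are written to the `cli_save_dir` configured in `path_dict.json`; for example (the directory name below is a placeholder):
```bash
ls path/to/cli_results/result*      # customization outputs
ls path/to/cli_results/vis_result*  # source | reference | output, concatenated
```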
## Framework
Our framework integrates localized style injection and diffusion-based learning, featuring:
- **Self-distillation mechanism** for automatic typography benchmark construction.
- **Localized style injection** via a trainable style encoder.
- **In-context generation** for enhanced style alignment.
## Results Gallery