---
base_model:
- black-forest-labs/FLUX.1-Fill-dev
pipeline_tag: text-to-image
library_name: transformers
tags:
- art
---
# Calligrapher: Freestyle Text Image Customization
## Overview
**Calligrapher** is a novel diffusion-based framework that integrates advanced text customization with artistic typography for digital calligraphy and design applications. Our framework supports text customization under various settings, including self-reference, cross-reference, and non-text reference customization.
## Key Features
- **Freestyle Text Customization**: Generate customized text guided by diverse stylized reference images and text prompts
- **Various Reference Modes**: Support for self-reference, cross-reference, and non-text reference customization
- **High-Quality Results**: Photorealistic text image customization with consistent typography
- **Multi-Language Support**: Style-centric text customization across diverse languages (see this issue)
## Repository Contents
This Hugging Face repository contains:
- **`calligrapher.bin`**: Pre-trained Calligrapher model weights.
- **`Calligrapher_bench_testing.zip`**: Test dataset covering both self-reference and cross-reference customization scenarios, plus additional reference images for testing. A small portion of samples is omitted due to IP concerns.
## Quick Start
### Installation
We provide two ways to set up the environment (requiring Python 3.10 + PyTorch 2.5.0 + CUDA):
#### Using pip
```bash
# Clone the repository
git clone https://github.com/Calligrapher2025/Calligrapher.git
cd Calligrapher
# Install dependencies
pip install -r requirements.txt
```
#### Using Conda
```bash
# Clone the repository
git clone https://github.com/Calligrapher2025/Calligrapher.git
cd Calligrapher
# Create and activate conda environment
conda env create -f env.yml
conda activate calligrapher
```
### Download Models & Testing Data
```python
from huggingface_hub import snapshot_download
# Download Calligrapher model and test data
snapshot_download("Calligrapher2025/Calligrapher")
# Download required base models (FLUX.1-Fill-dev is gated; request access and pass your token)
snapshot_download("black-forest-labs/FLUX.1-Fill-dev", token="your_token")
snapshot_download("google/siglip-so400m-patch14-384")
```
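If you prefer the downloads to land in predictable local directories that can be pasted straight into `path_dict.json` below, `snapshot_download` also accepts a `local_dir` argument. The directory names in this sketch are only examples:
```python
from huggingface_hub import snapshot_download

# Example local target directories - adjust to your own layout.
snapshot_download("Calligrapher2025/Calligrapher", local_dir="./weights/Calligrapher")
snapshot_download("black-forest-labs/FLUX.1-Fill-dev",
                  local_dir="./weights/FLUX.1-Fill-dev", token="your_token")
snapshot_download("google/siglip-so400m-patch14-384",
                  local_dir="./weights/siglip-so400m-patch14-384")
```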
### Configuration
Before running the models, you need to configure the paths in `path_dict.json`:
```json
{
"data_dir": "path/to/Calligrapher_bench_testing",
"cli_save_dir": "path/to/cli_results",
"gradio_save_dir": "path/to/gradio_results",
"gradio_temp_dir": "path/to/gradio_tmp",
"base_model_path": "path/to/FLUX.1-Fill-dev",
"image_encoder_path": "path/to/siglip-so400m-patch14-384",
"calligrapher_path": "path/to/calligrapher.bin"
}
```
Configuration parameters:
- `data_dir`: Path to store the test dataset
- `cli_save_dir`: Path to save results from command-line interface experiments
- `gradio_save_dir`: Path to save results from Gradio interface experiments
- `gradio_temp_dir`: Path to save Gradio temporary files
- `base_model_path`: Path to the base model FLUX.1-Fill-dev
- `image_encoder_path`: Path to the SigLIP image encoder model
- `calligrapher_path`: Path to the Calligrapher model weights
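After editing the file, a quick sanity check such as the following (a minimal sketch, assuming `path_dict.json` sits in the repository root) can catch path typos before launching the demos:
```python
import json
import os

# Load the path configuration used by the demo and CLI scripts.
with open("path_dict.json", "r") as f:
    path_dict = json.load(f)

# Model weights and test data must already exist on disk.
for key in ("data_dir", "base_model_path", "image_encoder_path", "calligrapher_path"):
    if not os.path.exists(path_dict[key]):
        print(f"[WARN] {key} points to a missing path: {path_dict[key]}")

# Result and temporary directories can simply be created ahead of time.
for key in ("cli_save_dir", "gradio_save_dir", "gradio_temp_dir"):
    os.makedirs(path_dict[key], exist_ok=True)
```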
### Run Gradio Demo
```bash
# Basic Gradio demo
python gradio_demo.py
# Demo with custom mask upload (recommended for first-time users)
# This version includes pre-configured examples - please try them first to understand how to use the model
python gradio_demo_upload_mask.py
```
We also provide a Gradio demo that enables multilingual freestyle text customization (e.g., Chinese), powered by [TextFLUX](https://github.com/yyyyyxie/textflux). To use it, first download the [TextFLUX weights](https://huggingface.co/yyyyyxie/textflux-lora/blob/main/pytorch_lora_weights.safetensors) and set the `textflux_path` entry in `path_dict.json`. Then download [the font resource](https://github.com/yyyyyxie/textflux/blob/main/resource/font/Arial-Unicode-Regular.ttf) to `./resources/` and run:
```bash
python gradio_demo_multilingual.py
```
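The TextFLUX weights and font file mentioned above can also be fetched programmatically. The sketch below is only illustrative: the raw GitHub URL is inferred from the repository link, and the target paths are examples.
```python
import os
import urllib.request

from huggingface_hub import hf_hub_download

# TextFLUX LoRA weights; point "textflux_path" in path_dict.json at the returned file.
lora_path = hf_hub_download(
    repo_id="yyyyyxie/textflux-lora",
    filename="pytorch_lora_weights.safetensors",
)
print("textflux_path:", lora_path)

# Font resource expected under ./resources/.
os.makedirs("resources", exist_ok=True)
font_url = (
    "https://raw.githubusercontent.com/yyyyyxie/textflux/main/"
    "resource/font/Arial-Unicode-Regular.ttf"
)
urllib.request.urlretrieve(font_url, "resources/Arial-Unicode-Regular.ttf")
```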
**User Tips:**
1. **Speed vs. Quality Trade-off.** Use fewer steps (e.g., 10 steps, ~4 s/image on a single A6000 GPU) for faster generation, at some cost in quality.
2. **Inpaint Position Freedom.** Inpainting positions are flexible - they don't need to match the original text locations in the input image.
3. **Iterative Editing.** Drag outputs from the gallery to the Image Editing Panel (clear the Editing Panel first) for quick refinements.
4. **Mask Optimization.** Adjust the mask size/aspect ratio to match your desired content. The model tends to fill the mask and harmonizes the generation with the background in terms of color and lighting.
5. **Reference Image Tip.** White-background references improve style consistency, since the encoder also considers the background context of the reference image (see the sketch after this list).
6. **Resolution Balance.** Very high-resolution generation sometimes triggers spelling errors; 512/768 px is recommended since the model was trained at a resolution of 512.
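For tip 5, a simple way to place a reference crop on a white canvas with Pillow (a minimal sketch; file names and sizes are placeholders):
```python
from PIL import Image

# Paste a reference crop onto a white canvas to reduce background interference.
ref = Image.open("reference_crop.png").convert("RGB")
ref.thumbnail((448, 448))  # keep a white margin around the reference

canvas = Image.new("RGB", (512, 512), "white")
offset = ((canvas.width - ref.width) // 2, (canvas.height - ref.height) // 2)
canvas.paste(ref, offset)
canvas.save("reference_white_bg.png")
```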
## Command Line Usage Examples
### Self-reference Customization
```bash
python infer_calligrapher_self_custom.py
```
### Cross-reference Customization
```bash
python infer_calligrapher_cross_custom.py
```
**Note:** Result image files whose names start with `result` are the customization outputs, while files starting with `vis_result` are concatenated visualizations showing the source image, reference image, and model output together.
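The outputs are written to the `cli_save_dir` configured in `path_dict.json`; for example (the directory name below is a placeholder):
```bash
ls path/to/cli_results/result*      # customization outputs
ls path/to/cli_results/vis_result*  # source | reference | output, concatenated
```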
## Framework
Our framework integrates localized style injection and diffusion-based learning, featuring:
- **Self-distillation mechanism** for automatic typography benchmark construction.
- **Localized style injection** via a trainable style encoder.
- **In-context generation** for enhanced style alignment.
## Results Gallery