license: apache-2.0
language:
  - en
  - zh
base_model:
  - meituan-longcat/LongCat-Image-Edit
base_model_relation: quantized
pipeline_tag: image-text-to-image
library_name: diffusers
tags:
  - diffusion-single-file

For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11

Feel free to request compression of other models as well (for the diffusers library, ComfyUI, or anything else), although models with architectures unfamiliar to me may be more difficult.

How to Use

`diffusers`

import torch
from PIL import Image
from dfloat11 import DFloat11Model
from diffusers import LongCatImageEditPipeline, LongCatImageTransformer2DModel

# For transformers >= 5.0.0, use this import instead:
# from transformers.initialization import no_init_weights

# For earlier transformers versions:
from transformers.modeling_utils import no_init_weights

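# Build the transformer from its config without initializing weights;
# the DF11 weights are loaded into it in the next step.
with no_init_weights():
    transformer = LongCatImageTransformer2DModel.from_config(
        LongCatImageTransformer2DModel.load_config(
            "meituan-longcat/LongCat-Image-Edit", subfolder="transformer"
        ),
        torch_dtype=torch.bfloat16
    ).to(torch.bfloat16)

# Load the DF11-compressed transformer weights into the model created above.
DFloat11Model.from_pretrained(
    "mingyi456/LongCat-Image-Edit-DF11",
    device="cpu",
    bfloat16_model=transformer,
)

# Build the pipeline around the prepared transformer.
pipe = LongCatImageEditPipeline.from_pretrained(
    "meituan-longcat/LongCat-Image-Edit",
    transformer=transformer,
    torch_dtype=torch.bfloat16
)

# Apply the DF11-compressed Qwen2.5-VL-7B-Instruct weights to the pipeline's text encoder.
DFloat11Model.from_pretrained(
    "mingyi456/Qwen2.5-VL-7B-Instruct-DF11",
    device="cpu",
    bfloat16_model=pipe.text_encoder,
)

# Offload idle pipeline components to the CPU to reduce GPU memory use.
pipe.enable_model_cpu_offload()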

img = Image.open('assets/test.png').convert('RGB')
prompt = '将猫变成狗'  # "Turn the cat into a dog"
image = pipe(
    img,
    prompt,
    negative_prompt='',
    guidance_scale=4.5,
    num_inference_steps=50,
    num_images_per_prompt=1,
    generator=torch.Generator("cpu").manual_seed(43)
).images[0]

image.save('image longcat-image-edit.png')

ComfyUI

ComfyUI does not currently support this model natively. Let me know if native support lands, and I will add DF11 support for it.

Compression details

This is the `pattern_dict` used for compression:

pattern_dict = {
    r"transformer_blocks\.\d+": (
        "norm1.linear",
        "norm1_context.linear",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.to_out.0",
        "attn.add_q_proj",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.to_add_out",
        "ff.net.0.proj",
        "ff.net.2",
        "ff_context.net.0.proj",
        "ff_context.net.2",
    ),
    r"single_transformer_blocks\.\d+": (
        "norm.linear",
        "proj_mlp",
        "proj_out",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
    ),
}
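
To make the selection above concrete, here is a minimal sketch of how such a dictionary can be expanded into full layer names. It assumes that each key is a regular expression matched against a whole module name and that each tuple entry is a dotted path relative to the matched block. The helper `targeted_layers` and the sample module names are hypothetical and are not part of the DFloat11 API.

```python
import re

# Abbreviated copy of the pattern_dict above (two entries per block type).
pattern_dict = {
    r"transformer_blocks\.\d+": ("attn.to_q", "ff.net.0.proj"),
    r"single_transformer_blocks\.\d+": ("proj_mlp", "attn.to_q"),
}


def targeted_layers(module_names, pattern_dict):
    """Expand pattern_dict into full layer names (hypothetical helper).

    Assumption: a key must match the whole module name (re.fullmatch), and
    each tuple entry is a dotted path relative to the matched block.
    """
    selected = []
    for name in module_names:
        for pattern, sub_names in pattern_dict.items():
            if re.fullmatch(pattern, name):
                selected.extend(f"{name}.{sub}" for sub in sub_names)
    return selected


# Hypothetical module names, for illustration only.
names = ["transformer_blocks.0", "single_transformer_blocks.3", "x_embedder"]
layers = targeted_layers(names, pattern_dict)
print(layers)
```

Under these assumptions, modules that match neither pattern (here `x_embedder`) contribute no layers, so they would stay uncompressed.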