---
pipeline_tag: text-to-image
library_name: diffusers
license: apache-2.0
tags:
  - diffusion
  - text-to-image
  - photoroom
  - prx
  - open-source
  - image-generation
  - flow-matching
demo: https://huggingface.co/spaces/Photoroom/PRX-1024-beta-version
model_type: diffusion-transformer
inference: true
---

# PRX: Open Text-to-Image Generative Model

![PRX](https://cdn-uploads.huggingface.co/production/uploads/68d136d7307413e80188d819/ZiHS6kQv64EArhBcv7_Yk.jpeg)

**PRX (Photoroom Experimental)** is a **1.3-billion-parameter text-to-image model trained entirely from scratch** and released under an **Apache 2.0 license**.

It is part of Photoroom’s broader effort to **open-source the complete process** behind training large-scale text-to-image models, covering architecture design, optimization strategies, and post-training alignment. The goal is to make PRX both a **strong open baseline** and a **transparent research reference** for those developing or studying diffusion-transformer models.

For more information, please read our [announcement blog post](https://huggingface.co/blog/Photoroom/prx-open-source-t2i-model).

## Model description

PRX is designed to be **lightweight yet capable**, easy to fine-tune or extend, and fully open.

PRX generates high-quality images from text using a simplified MMDiT architecture where text tokens don’t update through transformer blocks. It uses flow matching with discrete scheduling for efficient sampling and Google’s T5-Gemma-2B-2B-UL2 model for multilingual text encoding. The model has around **1.3B parameters** and delivers fast inference without sacrificing quality. You can choose between **Flux VAE** for balanced quality and speed, or **DC-AE** for higher latent compression and faster processing.

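
To give a rough intuition for what "flow matching with discrete scheduling" means at sampling time, here is a toy sketch in plain Python. It is not PRX's scheduler or network; Diffusers handles both internally. It assumes the common linear-interpolation convention `x = (1 - sigma) * data + sigma * noise`, and `toy_velocity` is a hypothetical stand-in for the trained model: the exact velocity field when the "dataset" is a single point. The helper names `make_sigmas`, `toy_velocity`, and `euler_sample` are illustrative only.

```python
import random


def make_sigmas(num_steps):
    """Discrete noise levels from 1.0 (pure noise) down to 0.0 (clean sample)."""
    return [1.0 - i / num_steps for i in range(num_steps + 1)]


def toy_velocity(x, sigma, target):
    """Hypothetical stand-in for the trained network (assumption).

    With x = (1 - sigma) * target + sigma * noise, the velocity
    dx/dsigma = noise - target equals (x - target) / sigma.
    """
    return [(xi - ti) / sigma for xi, ti in zip(x, target)]


def euler_sample(target, num_steps=28, seed=0):
    """Integrate the ODE dx/dsigma = v(x, sigma) from sigma=1 to sigma=0."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from Gaussian noise
    sigmas = make_sigmas(num_steps)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = toy_velocity(x, sigma, target)
        # One Euler step between two consecutive discrete noise levels
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x


target = [0.5, -1.0, 2.0]
sample = euler_sample(target)
```

In this toy setting the sampler lands on `target` up to floating-point error, because the velocity field is exact. In a real model the network only approximates the velocity, which is why the number of steps (for example `num_inference_steps=28`) trades speed against quality.
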
This card describes `Photoroom/prx-512-t2i`, one of the PRX model variants:

- **Resolution:** 512 pixels
- **Architecture:** PRX (MMDiT-like diffusion transformer variant)
- **Latent backbone:** Flux VAE
- **Text encoder:** T5-Gemma-2B-2B-UL2
- **Training stage:** Base model
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

For other checkpoints, browse the full [PRX collection](https://huggingface.co/collections/Photoroom/prx).

## Example usage

You can use PRX directly in [Diffusers](https://huggingface.co/docs/diffusers/main/en/api/pipelines/prx):

```python
import torch
from diffusers.pipelines.prx import PRXPipeline

# Load the 512-pixel base checkpoint in bfloat16 and move it to the GPU
pipe = PRXPipeline.from_pretrained(
    "Photoroom/prx-512-t2i",
    torch_dtype=torch.bfloat16
).to("cuda")

prompt = "A front-facing portrait of a lion in the golden savanna at sunset"
image = pipe(prompt, num_inference_steps=28, guidance_scale=5.0).images[0]
image.save("lion.png")
```

## Visual examples and demo

Here are some examples from one of our best checkpoints so far ([Photoroom/prx-1024-t2i-beta](https://huggingface.co/Photoroom/prx-1024-t2i-beta)).

[PRX Demo on Hugging Face Spaces](https://huggingface.co/spaces/Photoroom/PRX-1024-beta-version): interactive text-to-image demo for `Photoroom/prx-1024-t2i-beta`.

## Training details

PRX models were trained from scratch using recent advances in diffusion and flow-matching training. We experimented with a range of modern techniques for efficiency, stability, and alignment, which we’ll cover in more detail in our upcoming series of research posts:

- [Part 0: Overview and release](https://huggingface.co/blog/Photoroom/prx-open-source-t2i-model)
- Part 1: Design experiments and architecture benchmark *(coming soon)*
- Part 2: Accelerating training *(coming soon)*
- Part 3: Post-pretraining *(coming soon)*

## Other PRX models

You can find additional checkpoints in the [PRX collection](https://huggingface.co/collections/Photoroom/prx):

- **Base:** pretrained model before alignment; best starting point for fine-tuning or research
- **SFT:** supervised fine-tuned model; produces more aesthetically pleasing, ready-to-use generations
- **Latent backbones:** Flux VAE and DC-AE
- **Distilled:** 8-step generation with LADD
- **Resolutions:** 256, 512, and 1024 pixels

## License

PRX is available under an **Apache 2.0 license**.

## Use restrictions

You must not use PRX models for:

1. any of the restricted uses set forth in the [Gemma Prohibited Use Policy](https://ai.google.dev/gemma/prohibited_use_policy); or
2. any activity that violates applicable laws or regulations.