Diffusion CoT

Team

non-profit

AI & ML interests

diffusion

Recent Activity

JackyZhuo

authored 4 papers about 11 hours ago

Lumina-OmniLV: A Unified Multimodal Framework for General Low-Level Vision

Paper • 2504.04903 • Published Apr 7

Factuality Matters: When Image Generation and Editing Meet Structured Visuals

Paper • 2510.05091 • Published Oct 6 • 18

Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

Paper • 2510.06308 • Published Oct 7 • 53

PICABench: How Far Are We from Physically Realistic Image Editing?

Paper • 2510.17681 • Published 28 days ago • 62

sayakpaul

authored a paper about 1 month ago

Factuality Matters: When Image Generation and Editing Meet Structured Visuals

Paper • 2510.05091 • Published Oct 6 • 18

jieliu

authored 3 papers about 2 months ago

Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model

Paper • 2509.04548 • Published Sep 4 • 4

RewardDance: Reward Scaling in Visual Generation

Paper • 2509.08826 • Published Sep 10 • 72

Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance

Paper • 2508.21016 • Published Aug 28

JackyZhuo

published a dataset 2 months ago

diffusion-cot/echo-4o-instruction-following

Viewer • Updated Aug 19 • 68k • 625

JackyZhuo

authored 9 papers 2 months ago

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 25

I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow

Paper • 2410.07536 • Published Oct 10, 2024 • 5

OmniCaptioner: One Captioner to Rule Them All

Paper • 2504.07089 • Published Apr 9 • 20

VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning

Paper • 2504.07960 • Published Apr 10 • 50

From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

Paper • 2504.16080 • Published Apr 22 • 15

Vision-to-Music Generation: A Survey

Paper • 2503.21254 • Published Mar 27

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Paper • 2507.17801 • Published Jul 23 • 1

Resurrect Mask AutoRegressive Modeling for Efficient and Scalable Image Generation

Paper • 2507.13032 • Published Jul 17

TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation

Paper • 2503.07050 • Published Mar 10

sayakpaul

updated a dataset 2 months ago

diffusion-cot/imgedit-simpler

Viewer • Updated Sep 9 • 724k • 718

sayakpaul

published a dataset 2 months ago

diffusion-cot/imgedit-simpler

Viewer • Updated Sep 9 • 724k • 718

Team members: 4
