Collections including paper arxiv:2507.03745

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18, 2024 • 18
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18, 2024 • 9
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18, 2024 • 10
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19, 2024 • 13

Video Generation

A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality

Paper • 2507.07202 • Published Jul 9 • 22
StreamDiT: Real-Time Streaming Text-to-Video Generation

Paper • 2507.03745 • Published Jul 4 • 31
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory

Paper • 2507.01945 • Published Jul 2 • 78
TokensGen: Harnessing Condensed Tokens for Long Video Generation

Paper • 2507.15728 • Published Jul 21 • 7

Representation & Optimization

Understanding representation sheds light on optimization

Nuclear Norm Regularization for Deep Learning

Paper • 2405.14544 • Published May 23, 2024 • 1
Token embeddings violate the manifold hypothesis

Paper • 2504.01002 • Published Apr 1 • 1
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers

Paper • 2403.10476 • Published Mar 15, 2024 • 1
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Paper • 2504.00254 • Published Mar 31 • 1

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

Paper • 2306.10012 • Published Jun 16, 2023 • 36
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Paper • 2403.05135 • Published Mar 8, 2024 • 45
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Paper • 2408.06072 • Published Aug 12, 2024 • 39
haoningwu/StoryGen

Updated Jun 26 • 4

Efficient Video Diffusion

StreamDiT: Real-Time Streaming Text-to-Video Generation

Paper • 2507.03745 • Published Jul 4 • 31

Video Generation

Seedance 1.0: Exploring the Boundaries of Video Generation Models

Paper • 2506.09113 • Published Jun 10 • 102
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion

Paper • 2506.08009 • Published Jun 9 • 29
Seeing Voices: Generating A-Roll Video from Audio with Mirage

Paper • 2506.08279 • Published Jun 9 • 27
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement

Paper • 2506.07848 • Published Jun 9 • 4

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 90
StreamDiT: Real-Time Streaming Text-to-Video Generation

Paper • 2507.03745 • Published Jul 4 • 31
