RO-HOON OH
heiscold
AI & ML interests
TTS, Audio Editing, Speech Editing
Organizations
None yet
Audio_
LLM
Music_Generation
- MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
  Paper • 2402.06178 • Published • 15
- DITTO: Diffusion Inference-Time T-Optimization for Music Generation
  Paper • 2401.12179 • Published • 21
- Fast Timing-Conditioned Latent Audio Diffusion
  Paper • 2402.04825 • Published • 8
- Brain2Music: Reconstructing Music from Human Brain Activity
  Paper • 2307.11078 • Published • 41
Diffusion_FM_...
- Multistep Consistency Models
  Paper • 2403.06807 • Published • 16
- Improving Text-to-Image Consistency via Automatic Prompt Optimization
  Paper • 2403.17804 • Published • 20
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models
  Paper • 2404.01197 • Published • 31
- Consistency Flow Matching: Defining Straight Flows with Velocity Consistency
  Paper • 2407.02398 • Published • 17
TTS, VC
- Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like
  Paper • 2402.07383 • Published • 16
- Matcha-TTS: A fast TTS architecture with conditional flow matching
  Paper • 2309.03199 • Published • 13
- Natural language guidance of high-fidelity text-to-speech with synthetic annotations
  Paper • 2402.01912 • Published • 12
- Fast Timing-Conditioned Latent Audio Diffusion
  Paper • 2402.04825 • Published • 8