Generate detailed visual insights from images and text
Generate depth maps from images
Generate text based on images and input text
Segment images into parts and generate segmentation maps