An interactive demo for the DeepSeek-OCR model
Create and enrich datasets with AI
Gemini native image for 3D co-drawing
Real-time in-browser speech recognition
Watermarking LLM-generated text with SynthID Text
Personalised Podcasts For All - Available in 13 Languages
Transcribe audio or YouTube videos into text
Transform text into engaging podcast dialogues or detailed reports
Ask questions and get detailed answers
Generate customized realistic photos from face images
In-browser speech recognition with word-level timestamps
Apply the motion of a video to a portrait
Generate captions and analyze images with various tasks