Transcribe audio or YouTube videos into text
Generate speech from text using Microsoft Edge TTS
Generate a depth map from an image