TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval Paper • 2104.08271 • Published Apr 16, 2021
Unsupervised learning from video to detect foreground objects in single images Paper • 1703.10901 • Published Mar 31, 2017
Mining for meaning: from vision to language through multiple networks consensus Paper • 1806.01954 • Published Jun 5, 2018