Frontier Foundation Models for Video Understanding
Dense Grounded Understanding of Images and Videos
The first journey begins here
VLMEvalKit Evaluation Results Collection