Yang Shi

DogNeverSleep

https://FrankYang-17.github.io/

FrankYang-17

AI & ML interests

👨🏻‍🎓PhD student at Peking University

Recent Activity

updated a dataset 8 days ago

DogNeverSleep/MME-VideoOCR-VLMEvalKit

published a dataset 8 days ago

DogNeverSleep/MME-VideoOCR-VLMEvalKit

upvoted a paper 9 days ago

IF-VidCap: Can Video Caption Models Follow Instructions?

commented a paper 17 days ago

AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration

Paper • 2510.10395 • Published 19 days ago • 28

commented 2 papers about 1 month ago

OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing

Paper • 2509.24900 • Published Sep 29 • 53

RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark

Paper • 2509.24897 • Published Sep 29 • 46

New activity in DogNeverSleep/MME-VideoOCR_Dataset 5 months ago

Add paper link, license

#2 opened 5 months ago by

commented a paper 5 months ago

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

Paper • 2505.21333 • Published May 27 • 38

commented a paper 7 months ago

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14 • 30