TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs Paper • 2509.18056 • Published Sep 22 • 27
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs Paper • 2506.21862 • Published Jun 27 • 36
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection Paper • 2308.05480 • Published Aug 10, 2023 • 2
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published Apr 10 • 50