SpatialVID: A Large-Scale Video Dataset with Spatial Annotations Paper • 2509.09676 • Published Sep 11 • 31
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning Paper • 2509.03646 • Published Sep 3 • 30
M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision Paper • 2509.01360 • Published Sep 1 • 11
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published Aug 24 • 80
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding Paper • 2507.07984 • Published Jul 10 • 42