VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images Paper • 2604.09531 • Published 4 days ago • 6
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Paper • 2604.06870 • Published 6 days ago • 35