Many-Shot In-Context Learning in Multimodal Foundation Models · 6 authors · Submitted by akhaliq
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection · 16 authors · Submitted by akhaliq
Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion · 8 authors · Submitted by akhaliq
TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction · 5 authors · Submitted by akhaliq