AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs Paper • 2508.16153 • Published Aug 22, 2025 • 154
A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models Paper • 2508.01548 • Published Aug 3, 2025 • 13
MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on Paper • 2505.21325 • Published May 27, 2025 • 4
DepthMaster: Taming Diffusion Models for Monocular Depth Estimation Paper • 2501.02576 • Published Jan 5, 2025 • 15
ROICtrl: Boosting Instance Control for Visual Generation Paper • 2411.17949 • Published Nov 27, 2024 • 87
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second Paper • 2410.02073 • Published Oct 2, 2024 • 41
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection Paper • 2410.01647 • Published Oct 2, 2024 • 31