AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning • Paper • 2510.16156 • Published Oct 17 • 1
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play • Paper • 2509.25541 • Published Sep 29 • 140
Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap • Paper • 2509.26542 • Published Sep 30 • 8
CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models • Paper • 2505.19235 • Published May 25 • 3
Angles Don't Lie: Unlocking Training-Efficient RL Through the Model's Own Signals • Paper • 2506.02281 • Published Jun 2 • 4
HippoMM: Hippocampal-inspired Multimodal Memory for Long Audiovisual Event Understanding • Paper • 2504.10739 • Published Apr 14 • 2
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm • Paper • 2409.07226 • Published Sep 11, 2024 • 1
Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey • Paper • 2407.21794 • Published Jul 31, 2024 • 7
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2 • Paper • 2401.17619 • Published Jan 31, 2024 • 1
OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection • Paper • 2306.09301 • Published Jun 15, 2023 • 1