ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 2025 • 96
Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs Paper • 2511.16664 • Published Nov 2025 • 24
V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models Paper • 2511.16668 • Published Nov 2025 • 53
Reasoning Language Model Inference Serving Unveiled: An Empirical Study Paper • 2510.18672 • Published Oct 21 • 7
DiffAdapt: Difficulty-Adaptive Reasoning for Token-Efficient LLM Inference Paper • 2510.19669 • Published Oct 22 • 1
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving Paper • 2510.11769 • Published Oct 13 • 25
Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression Paper • 2505.19433 • Published May 26 • 5
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17 • 93
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve? Paper • 2502.17535 • Published Feb 24 • 8
Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research Paper • 2502.12669 • Published Feb 18 • 2
Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing Paper • 2502.04411 • Published Feb 6 • 4
Can LLMs Maintain Fundamental Abilities under KV Cache Compression? Paper • 2502.01941 • Published Feb 4 • 15
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference Paper • 2502.00299 • Published Feb 1 • 2
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 90
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 429
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression Paper • 2412.17483 • Published Dec 23, 2024 • 34
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation Paper • 2412.13649 • Published Dec 18, 2024 • 21