Efficient Long-context Language Model Training by Core Attention Disaggregation • Paper • 2510.18121 • Published 6 days ago • 108
Small Models are Valuable Plug-ins for Large Language Models • Paper • 2305.08848 • Published May 15, 2023 • 4
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data • Paper • 2304.01196 • Published Apr 3, 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model • Paper • 2211.05100 • Published Nov 9, 2022 • 34