Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition Paper • 2309.15223 • Published Sep 26, 2023 • 22
Simple Projection Variants Improve ColBERT Performance Paper • 2510.12327 • Published 20 days ago • 5
Chart-RVR Collection Models trained using GRPO for enhanced Chart Reasoning • 3 items • Updated Aug 24 • 1
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Paper • 2402.00159 • Published Jan 31, 2024 • 65
Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment By NormalUhr • Feb 11 • 82
Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control Paper • 2504.17130 • Published Apr 23 • 1
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 246
Executable Code Actions Elicit Better LLM Agents Paper • 2402.01030 • Published Feb 1, 2024 • 172
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset Paper • 2402.09391 • Published Feb 14, 2024 • 2