When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training Paper • 2411.13476 • Published Nov 20, 2024 • 16
COMETA: A Corpus for Medical Entity Linking in the Social Media Paper • 2010.03295 • Published Oct 7, 2020 • 2
Model Merging Collection Model merging is a very popular technique in LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 247
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent Paper • 2312.10003 • Published Dec 15, 2023 • 44
Axiomatic Preference Modeling for Longform Question Answering Paper • 2312.02206 • Published Dec 2, 2023 • 10
Reward models on the hub Collection UNMAINTAINED: See RewardBench... A place to collect reward models, an often not released artifact of RLHF. • 18 items • Updated Apr 13, 2024 • 25
Improving Large Language Model Fine-tuning for Solving Math Problems Paper • 2310.10047 • Published Oct 16, 2023 • 7
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 246
Lemur: Harmonizing Natural Language and Code for Language Agents Paper • 2310.06830 • Published Oct 10, 2023 • 34
GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors Paper • 2310.08529 • Published Oct 12, 2023 • 18
CausalLM is not optimal for in-context learning Paper • 2308.06912 • Published Aug 14, 2023 • 18
RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools Paper • 2306.14447 • Published Jun 26, 2023 • 6