Collection of useful papers.
- Attention Is All You Need (Paper 1706.03762)
- LoRA: Low-Rank Adaptation of Large Language Models (Paper 2106.09685)
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (Paper 2101.03961)
- Proximal Policy Optimization Algorithms (Paper 1707.06347)