- Attention Is All You Need
  Paper • 1706.03762 • Published • 91
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 9
- Mixtral of Experts
  Paper • 2401.04088 • Published • 159
- Mistral 7B
  Paper • 2310.06825 • Published • 55
Collections including paper arxiv:2311.11045
- Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
  Paper • 2309.08958 • Published • 2
- Generative Data Augmentation using LLMs improves Distributional Robustness in Question Answering
  Paper • 2309.06358 • Published • 1
- Tuna: Instruction Tuning using Feedback from Large Language Models
  Paper • 2310.13385 • Published • 10
- Retrieval-Generation Synergy Augmented Large Language Models
  Paper • 2310.05149 • Published • 1

- DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
  Paper • 2309.03883 • Published • 35
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 52
- Agents: An Open-source Framework for Autonomous Language Agents
  Paper • 2309.07870 • Published • 42
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 51

- Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
  Paper • 2310.13961 • Published • 5
- Tuna: Instruction Tuning using Feedback from Large Language Models
  Paper • 2310.13385 • Published • 10
- Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models
  Paper • 2310.13127 • Published • 12
- From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
  Paper • 2310.00492 • Published • 2

- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 105
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 78
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 43