-
System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 44 -
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers
Paper • 2311.10642 • Published • 26 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77
Collections including paper arxiv:2311.11045
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
In-Context Former: Lightning-fast Compressing Context for Large Language Model
Paper • 2406.13618 • Published -
ViPer: Visual Personalization of Generative Models via Individual Preference Learning
Paper • 2407.17365 • Published • 13 -
KAN or MLP: A Fairer Comparison
Paper • 2407.16674 • Published • 43
-
Contrastive Chain-of-Thought Prompting
Paper • 2311.09277 • Published • 36 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
mosaicml/mpt-7b-storywriter
Text Generation • Updated • 1.05k • 839 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 33
-
GPT4All: An Ecosystem of Open Source Compressed Language Models
Paper • 2311.04931 • Published • 23 -
Can LLMs Follow Simple Rules?
Paper • 2311.04235 • Published • 14 -
Prompt Engineering a Prompt Engineer
Paper • 2311.05661 • Published • 25 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Paper • 2211.05100 • Published • 34 -
CsFEVER and CTKFacts: Acquiring Czech data for fact verification
Paper • 2201.11115 • Published -
Training language models to follow instructions with human feedback
Paper • 2203.02155 • Published • 23 -
FinGPT: Large Generative Models for a Small Language
Paper • 2311.05640 • Published • 31
-
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 119 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Paper • 2311.11077 • Published • 29 -
Make Pixels Dance: High-Dynamic Video Generation
Paper • 2311.10982 • Published • 69
-
Contrastive Chain-of-Thought Prompting
Paper • 2311.09277 • Published • 36 -
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Paper • 2201.11903 • Published • 14 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 44
-
HuggingFaceH4/zephyr-7b-alpha
Text Generation • 7B • Updated • 3k • • 1.11k -
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 119 -
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 77 -
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper • 2311.11501 • Published • 37
-
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Paper • 2310.17631 • Published • 35 -
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Paper • 2310.12823 • Published • 36 -
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
Paper • 2303.16634 • Published • 3 -
GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems
Paper • 2310.12397 • Published • 1