- deepseek-ai/DeepSeek-R1
  Text Generation • 685B • Updated • 471k • 12.8k
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 625
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 298
- open-r1/OpenR1-Math-220k
  Viewer • Updated • 450k • 7.61k • 658
Collections
Collections including paper arxiv:2402.17764
- 1.58-bit FLUX
  Paper • 2412.18653 • Published • 84
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 625
- BitNet a4.8: 4-bit Activations for 1-bit LLMs
  Paper • 2411.04965 • Published • 69
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 105
- Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
  Paper • 2211.04325 • Published • 1
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 23
- On the Opportunities and Risks of Foundation Models
  Paper • 2108.07258 • Published • 1
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
  Paper • 2204.07705 • Published • 2
- CohereLabs/c4ai-command-r-plus-08-2024
  Text Generation • 104B • Updated • 2.87k • 276
- meta-llama/Meta-Llama-3-8B
  Text Generation • 8B • Updated • 1.5M • 6.35k
- meta-llama/Meta-Llama-3-70B
  Text Generation • 71B • Updated • 13.2k • 869
- impira/layoutlm-document-qa
  Document Question Answering • 0.1B • Updated • 33.7k • 1.15k