Commit 02a80c2 · 1 Parent(s): 515e294
Update README.md
README.md CHANGED

@@ -1,5 +1,10 @@
---
license: apache-2.0
+tags:
+- Composer
+- MosaicML
+- llm-foundry
+- StreamingDatasets
---

# MPT-7B (Base)

@@ -35,17 +40,17 @@ We demonstrate generations as long as 80k tokens on a single A100-80GB GPU in ou
* [MPT-7B-Instruct](https://huggingface.co/mosaicml/mpt-7b-instruct): a model for short-form instruction following.
It is built by finetuning MPT-7B on a [dataset](https://huggingface.co/datasets/sam-mosaic/dolly_hhrlhf) we also release, derived from the [Databricks Dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and the [Anthropic Helpful and Harmless (HH-RLHF)](https://huggingface.co/datasets/Anthropic/hh-rlhf) datasets.
* License: _CC-By-SA-3.0_ (commercial use permitted)
-* [Online Demo](https://huggingface.co/spaces/mosaicml/mpt-7b-instruct)
+* [Online Demo on HuggingFace Spaces](https://huggingface.co/spaces/mosaicml/mpt-7b-instruct)

* [MPT-7B-Chat](TBD): a chatbot-like model for dialogue generation.
It is built by finetuning MPT-7B on the [ShareGPT-Vicuna](https://huggingface.co/datasets/jeffwan/sharegpt_vicuna), [HC3](https://huggingface.co/datasets/Hello-SimpleAI/HC3),
[Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca), [HH-RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf), and [Evol-Instruct](https://huggingface.co/datasets/victor123/evol_instruct_70k) datasets.
* License: _CC-By-NC-SA-4.0_ (non-commercial use only)
-* [Online Demo](https://huggingface.co/spaces/mosaicml/mpt-7b-chat)
+* [Online Demo on HuggingFace Spaces](https://huggingface.co/spaces/mosaicml/mpt-7b-chat)

## Model Date

-May
+May 5, 2023

## Model License

@@ -53,9 +58,9 @@ Apache-2.0 (commercial use permitted)

## Documentation

-* [Blog post]
+* [Blog post: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](www.mosaicml.com/blog/mpt-7b)
* [Codebase (mosaicml/llm-foundry repo)](https://github.com/mosaicml/llm-foundry/)
-* Questions: contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)
+* Questions: Feel free to contact us via the [MosaicML Community Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg)!


## How to Use

@@ -166,19 +171,20 @@ While great efforts have been taken to clean the pretraining data, it is possibl

## Acknowledgements

+We would like to thank our friends at AI2 for helping us to curate our pretraining dataset, choose a great tokenizer, and for many other helpful conversations along the way ⚔️
We gratefully acknowledge the work of the researchers who created the [LLaMA series of models](https://arxiv.org/abs/2302.13971), which was the impetus for our efforts.
-
+and also acknowledge the hard work of the [Together](https://www.together.xyz) team, which put together the RedPajama dataset.

## Citation

Please cite this model using the following format:

```
-@online{
+@online{MosaicML2023Introducing,
author = {MosaicML NLP Team},
-title = {
+title = {Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs},
year = {2023},
-url = {
+url = {www.mosaicml.com/blog/mpt-7b},
note = {Accessed: 2023-03-28}, % change this date
urldate = {2023-03-28} % change this date
}
```
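The diff references the README's "How to Use" section without showing its body, since that section is unchanged by this commit. As a minimal sketch (not part of this change), MPT-7B can typically be loaded through the Hugging Face transformers auto classes; the `trust_remote_code=True` flag, the sample prompt, and the generation settings below are illustrative assumptions, not text taken from the model card:

```python
# Minimal sketch: loading MPT-7B via Hugging Face transformers.
# Assumptions (not taken from this commit): a recent transformers release and
# torch are installed, and the model's custom MPT code is trusted via
# trust_remote_code=True.
import transformers

name = 'mosaicml/mpt-7b'

tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    trust_remote_code=True,  # MPT ships a custom model class in its repo
)

# Illustrative generation call with a hypothetical prompt.
inputs = tokenizer('MosaicML is', return_tensors='pt')
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```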