storytracer committed on
Commit f9be8ba · verified · 1 Parent(s): 3df1f60

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ pinned: false
  We are a group of researchers working together to collect and curate openly licensed and public domain data for training large language models.
  So far, we have released:
 
- - [The Common Pile v0.1](https://huggingface.co/collections/common-pile/common-pile-v01-raw-data-6826b454a5a6a445d0b51b37), an 8 TB dataset of text from over 30 diverse sources
+ - [The Common Pile v0.1](https://huggingface.co/collections/common-pile/common-pile-v01-68307d37df48e36f02717f21), an 8 TB dataset of text from over 30 diverse sources
  - Our paper: [The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text](https://huggingface.co/papers/2506.05209)
  - [Comma v0.1-1T](https://huggingface.co/common-pile/comma-v0.1-1t) and [Comma v0.1-2T](https://huggingface.co/common-pile/comma-v0.1-2t), 7B parameter LLMs trained on text from the Common Pile v0.1
  - The [training dataset](https://huggingface.co/datasets/common-pile/comma_v0.1_training_dataset) used to train the Comma v0.1 models