mosaicml/mosaic-bert-base-seqlen-1024


jacobfulano committed on Apr 28, 2023

Commit 4ce1645 · 1 Parent(s): aadf20a

Update README.md

Files changed (1)

README.md +1 -1


@@ -15,7 +15,7 @@ Hugging Face's [bert-base-uncased](https://huggingface.co/bert-base-uncased).
 __This model was trained with [ALiBi](https://arxiv.org/abs/2108.12409) on a sequence length of 1024 tokens.__
-ALiBi allows a model trained with a sequence length n to extrapolate to sequence lengths >2n. For more details, see [Train Short, Test Long: Attention with Linear
+ALiBi allows a model trained with a sequence length n to easily extrapolate to sequence lengths >2n during finetuning. For more details, see [Train Short, Test Long: Attention with Linear
 Biases Enables Input Length Extrapolation (Press et al. 2022)](https://arxiv.org/abs/2108.12409)
 It is part of the family of MosaicBERT-Base models:
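
For readers unfamiliar with the ALiBi mechanism mentioned in the changed line, here is a minimal sketch of how its attention biases can be computed. It is not taken from the MosaicBERT code: the function names `alibi_slopes` and `alibi_bias` are hypothetical, the head count is assumed to be a power of two, and the symmetric `|i - j|` distance is an assumption about how an encoder would apply the bias. The point is that the bias depends only on token distance, so a longer sequence reuses the same values and adds new ones for larger distances.

```python
def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two (the simple case in Press et al.).
    """
    start = 2 ** (-8 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Bias added to the attention scores of one head.

    Assumption: symmetric (encoder-style) distance, i.e. -slope * |i - j|.
    """
    return [[-slope * abs(i - j) for j in range(seq_len)]
            for i in range(seq_len)]


slopes = alibi_slopes(8)
short_bias = alibi_bias(4, slopes[0])  # bias for a short sequence
long_bias = alibi_bias(8, slopes[0])   # bias for a sequence twice as long
```

Because no learned position embedding is tied to a fixed maximum length, the top-left 4×4 corner of `long_bias` is identical to `short_bias`. This is the property that makes extending the sequence length at finetuning time plausible.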