Add pipeline tag: text-generation

#2
by nielsr - opened
Files changed (1)
  1. README.md +16 -2
README.md CHANGED
@@ -1,10 +1,21 @@
 ---
-library_name: transformers
 language:
 - en
+library_name: transformers
 license: cc-by-nc-4.0
+pipeline_tag: text-generation
 ---
 
+# Paper title and link
+
+The model was presented in the paper [From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models](https://huggingface.co/papers/2504.06214).
+
+# Paper abstract
+
+The abstract of the paper is the following:
+
+
+
 # Model Information
 
 We introduce **UltraLong-8B**, a series of ultra-long context language models designed to process extensive sequences of text (up to 1M, 2M, and 4M tokens) while maintaining competitive performance on standard benchmarks. Built on the Llama-3.1, UltraLong-8B leverages a systematic training recipe that combines efficient continued pretraining with instruction tuning to enhance long-context understanding and instruction-following capabilities. This approach enables our models to efficiently scale their context windows without sacrificing general performance.
@@ -82,4 +93,7 @@ Chejian Xu ([email protected]), Wei Ping ([email protected])
   journal={arXiv preprint},
   year={2025}
 }
-</pre>
+</pre>
+
+## Project Page
+https://ultralong.github.io/
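
For context on the metadata change, the added `pipeline_tag: text-generation` and `library_name: transformers` tags tell the Hub to treat this repo as a Transformers text-generation model. Below is a minimal sketch of how such a model is typically loaded with the Transformers pipeline API; the repository id and generation settings are illustrative assumptions, not taken from this PR.

```python
# Minimal sketch (not part of this PR): load the model through the
# Transformers text-generation pipeline, which is what the added
# `pipeline_tag: text-generation` metadata advertises.
import torch
from transformers import pipeline

# Assumed repository id for illustration; substitute the repo this README belongs to.
model_id = "nvidia/Llama-3.1-8B-UltraLong-1M-Instruct"

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # reduce memory for an 8B model
    device_map="auto",           # place weights on available accelerator(s)
)

messages = [{"role": "user", "content": "Give a one-sentence summary of ultra-long context training."}]
result = generator(messages, max_new_tokens=64)
print(result[0]["generated_text"])
```

The `library_name: transformers` tag likewise controls which library's code snippet the Hub displays on the model page, so the two added fields work together.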