nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Add pipeline tag: text-generation

#2

by nielsr HF Staff - opened Apr 9

base: refs/heads/main ← from: refs/pr/2

Files changed (1)

README.md +16 -2

@@ -1,10 +1,21 @@
 ---
-library_name: transformers
 language:
 - en
 license: cc-by-nc-4.0
 ---
 # Model Information
 We introduce **UltraLong-8B**, a series of ultra-long context language models designed to process extensive sequences of text (up to 1M, 2M, and 4M tokens) while maintaining competitive performance on standard benchmarks. Built on the Llama-3.1, UltraLong-8B leverages a systematic training recipe that combines efficient continued pretraining with instruction tuning to enhance long-context understanding and instruction-following capabilities. This approach enables our models to efficiently scale their context windows without sacrificing general performance.
@@ -82,4 +93,7 @@ Chejian Xu ([email protected]), Wei Ping ([email protected])
   journal={arXiv preprint},
   year={2025}
  }
-</pre>

 ---
 language:
 - en
+library_name: transformers
 license: cc-by-nc-4.0
+pipeline_tag: text-generation
 ---
+# Paper title and link
+The model was presented in the paper [From 128K to 4M: Efficient Training of Ultra-Long Context Large Language Models](https://huggingface.co/papers/2504.06214).
+# Paper abstract
+The abstract of the paper is the following:
 # Model Information
 We introduce **UltraLong-8B**, a series of ultra-long context language models designed to process extensive sequences of text (up to 1M, 2M, and 4M tokens) while maintaining competitive performance on standard benchmarks. Built on the Llama-3.1, UltraLong-8B leverages a systematic training recipe that combines efficient continued pretraining with instruction tuning to enhance long-context understanding and instruction-following capabilities. This approach enables our models to efficiently scale their context windows without sacrificing general performance.
   journal={arXiv preprint},
   year={2025}
  }
+</pre>
+## Project Page
+https://ultralong.github.io/
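
A reviewer may want to confirm that the new front matter carries the two keys this PR is about (`pipeline_tag` and `library_name`). The snippet below is a minimal sketch of such a check, not part of the PR: `parse_front_matter` and `NEW_HEADER` are hypothetical names introduced here, and the parser only handles the flat `key: value` and simple list layout seen in this diff, not full YAML.

```python
def parse_front_matter(readme_text):
    """Parse the leading '---' block of a README into a dict.

    Assumption: only flat `key: value` pairs and `- item` lists,
    as in the front matter shown in this diff. Not a YAML parser.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key that is currently collecting list items
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            return meta  # closing delimiter reached
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []
                current_key = key
    return {}  # no closing delimiter, so treat as having no front matter


# Front matter copied from the "+" side of the diff above.
NEW_HEADER = """---
language:
- en
library_name: transformers
license: cc-by-nc-4.0
pipeline_tag: text-generation
---
# Model Information
"""

meta = parse_front_matter(NEW_HEADER)
missing = [k for k in ("pipeline_tag", "library_name") if k not in meta]
print("missing keys:", missing)
```

For real model cards, a proper YAML parser would be the safer choice; the hand-rolled parser here only keeps the sketch dependency-free.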