locuslab
/

safelm-1.7b-instruct

Model card Files Files and versions

dsam99 commited on Sep 15

Commit

18eb4ad

·

verified ·

1 Parent(s): 1d84b1a

Update README.md

Files changed (1) hide show

README.md +32 -12

README.md CHANGED Viewed

@@ -1,14 +1,26 @@
 ---
 version: main
 family: smollm2-1.7b
-model_name: _base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000
 license: mit
 tags:
-  - model
-  - transformer
-  - smollm2
 ---
-# SmolLM2 _base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000 (Version: main)
 ## Model Details
 - **Architecture:** SmolLM2
@@ -31,16 +43,24 @@ train:
 ```
-## Model Loading and Revision System
-This repository hosts multiple revisions of the model.
-To load a specific revision, use the `revision` parameter. For example:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-model = AutoModelForCausalLM.from_pretrained("locuslab/_base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000", revision="final")
-tokenizer = AutoTokenizer.from_pretrained("locuslab/_base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000", revision="final")
 ```
-Replace `"final"` with the desired revision.

 ---
 version: main
 family: smollm2-1.7b
+model_name: locuslab/safelm-1.7b_instruct_rephrase_refusal_moral_ed_600B
 license: mit
 tags:
+- model
+- transformer
+- smollm2
+- safety p
+datasets:
+- locuslab/refuseweb
+- locuslab/safeweb
+- locuslab/moral_education
+- HuggingFaceTB/smollm-corpus
+base_model:
+- locuslab/safelm-1.7b_base_rephrase_refusal_moral_ed_600B
 ---
+# SafeLM-1.7B Instruct
+SafeLM is a 1.7B parameter model family that is trained via [Safety Pretraining](https://www.arxiv.org/abs/2504.16980). We train language models to be natively safe by incorporating safety
+directly into the pretraining pipeline. This is our instruction-tuned model. Our safety data curation involves scoring harmful content, rephrasing and contextualizing potentially harmful examples, and refusal training throughout pretraining.
+Please check out our [paper](https://www.arxiv.org/abs/2504.16980) and [website](https://locuslab.github.io/safety-pretraining/) for more details!
 ## Model Details
 - **Architecture:** SmolLM2
 ```
+## Quickstart
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
+model = AutoModelForCausalLM.from_pretrained("locuslab/safelm-1.7b_instruct_rephrase_refusal_moral_ed_600B")
+tokenizer = AutoTokenizer.from_pretrained("locuslab/safelm-1.7b_instruct_rephrase_refusal_moral_ed_600B")
 ```
+## Citation
+If you find our work helpful, please cite our work as:
+```
+@article{maini2025safety,
+  title={Safety pretraining: Toward the next generation of safe ai},
+  author={Maini, Pratyush and Goyal, Sachin and Sam, Dylan and Robey, Alex and Savani, Yash and Jiang, Yiding and Zou, Andy and Lipton, Zachary C and Kolter, J Zico},
+  journal={arXiv preprint arXiv:2504.16980},
+  year={2025}
+}
+```