dsam99 committed · Commit 18eb4ad · verified · 1 Parent(s): 1d84b1a

Update README.md

Files changed (1):
  1. README.md +32 -12
README.md CHANGED
@@ -1,14 +1,26 @@
  ---
  version: main
  family: smollm2-1.7b
- model_name: _base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000
  license: mit
  tags:
- - model
- - transformer
- - smollm2
  ---
- # SmolLM2 _base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000 (Version: main)

  ## Model Details
  - **Architecture:** SmolLM2
@@ -31,16 +43,24 @@ train:

  ```

- ## Model Loading and Revision System
-
- This repository hosts multiple revisions of the model.
- To load a specific revision, use the `revision` parameter. For example:

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

- model = AutoModelForCausalLM.from_pretrained("locuslab/_base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000", revision="final")
- tokenizer = AutoTokenizer.from_pretrained("locuslab/_base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000", revision="final")
  ```

- Replace `"final"` with the desired revision.
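The removed section above describes choosing among multiple hosted revisions. As a minimal sketch of how those revisions could be enumerated before picking one to pass as `revision` (assuming the `list_repo_refs` helper from `huggingface_hub`; this snippet is not part of the commit):

```python
# Sketch: list the branches/tags of the repository so a specific one can be
# passed as revision= to from_pretrained. Assumes huggingface_hub is installed.
from huggingface_hub import list_repo_refs

repo_id = "locuslab/_base-1.7b-score0_20p_123rephrase_mild_45ref_45web_ref6x-600B-step-250000"
refs = list_repo_refs(repo_id)

print("branches:", [b.name for b in refs.branches])
print("tags:", [t.name for t in refs.tags])
```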
  ---
  version: main
  family: smollm2-1.7b
+ model_name: locuslab/safelm-1.7b_instruct_rephrase_refusal_moral_ed_600B
  license: mit
  tags:
+ - model
+ - transformer
+ - smollm2
+ - safety p
+ datasets:
+ - locuslab/refuseweb
+ - locuslab/safeweb
+ - locuslab/moral_education
+ - HuggingFaceTB/smollm-corpus
+ base_model:
+ - locuslab/safelm-1.7b_base_rephrase_refusal_moral_ed_600B
  ---
+ # SafeLM-1.7B Instruct
+
+ SafeLM is a 1.7B parameter model family that is trained via [Safety Pretraining](https://www.arxiv.org/abs/2504.16980). We train language models to be natively safe by incorporating safety
+ directly into the pretraining pipeline. This is our instruction-tuned model. Our safety data curation involves scoring harmful content, rephrasing and contextualizing potentially harmful examples, and refusal training throughout pretraining.
+ Please check out our [paper](https://www.arxiv.org/abs/2504.16980) and [website](https://locuslab.github.io/safety-pretraining/) for more details!

  ## Model Details
  - **Architecture:** SmolLM2
 

  ```

+ ## Quickstart

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

+ model = AutoModelForCausalLM.from_pretrained("locuslab/safelm-1.7b_instruct_rephrase_refusal_moral_ed_600B")
+ tokenizer = AutoTokenizer.from_pretrained("locuslab/safelm-1.7b_instruct_rephrase_refusal_moral_ed_600B")
  ```
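Continuing from the loading snippet above, a short generation sketch; the prompt and decoding settings are illustrative assumptions, not recommendations from the model card:

```python
# Illustrative usage only: uses the model/tokenizer loaded in the Quickstart;
# the prompt and generation settings are assumptions.
prompt = "How should I dispose of old batteries safely?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```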

+ ## Citation
+
+ If you find our work helpful, please cite our work as:
+
+ ```
+ @article{maini2025safety,
+ title={Safety pretraining: Toward the next generation of safe ai},
+ author={Maini, Pratyush and Goyal, Sachin and Sam, Dylan and Robey, Alex and Savani, Yash and Jiang, Yiding and Zou, Andy and Lipton, Zachary C and Kolter, J Zico},
+ journal={arXiv preprint arXiv:2504.16980},
+ year={2025}
+ }
+ ```