achyutarajaram committed
Commit a2f3304 · verified · 1 Parent(s): f175beb

Update README.md

  1. README.md +49 -3
README.md CHANGED
@@ -1,3 +1,49 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+
+ ## Sparse Model from Gao et al. 2025
+
+ These are the weights for a sparse model from Gao et al. 2025, the one used for the paper's qualitative results on bracket counting and variable binding. Weights for the other models used in the paper, along with lightweight inference code, are available at https://github.com/openai/circuit_sparsity. In that repo, this model is named csp_yolo2.
+
+ This is a standalone, runnable Hugging Face implementation of one of the models.
+
+ The minimal script below loads the locally converted HF model and tokenizer and
+ runs a short generation.
+
+
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ if __name__ == "__main__":
+     PROMPT = "def square_sum(xs):\n return sum(x * x for x in xs)\n\nsquare_sum([1, 2, 3])\n"
+     tok = AutoTokenizer.from_pretrained("circuit-sparsity", trust_remote_code=True)
+     model = AutoModelForCausalLM.from_pretrained(
+         "circuit-sparsity",
+         trust_remote_code=True,  # allow the custom model code to be loaded
+         torch_dtype="auto",
+     )
+     model.to("cuda" if torch.cuda.is_available() else "cpu")  # use the GPU when one is available
+     inputs = tok(PROMPT, return_tensors="pt", add_special_tokens=False)["input_ids"].to(
+         model.device
+     )
+
+     with torch.no_grad():  # inference only, so skip gradient tracking
+         out = model.generate(
+             inputs,
+             max_new_tokens=64,
+             do_sample=True,  # sample rather than decode greedily
+             temperature=0.8,
+             top_p=0.95,
+             return_dict_in_generate=False,  # return a plain tensor of token ids
+         )
+
+     print("=== Prompt ===")
+     print(PROMPT)
+     print("\n=== Generation ===")
+     print(tok.decode(out[0], skip_special_tokens=True))
+
+
+
+ ## License
+ This project is licensed under the [Apache License 2.0](LICENSE).
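
Note on the sampling settings in the README script: it passes `temperature=0.8` and `top_p=0.95` to `model.generate`. The sketch below shows, in plain Python, what temperature scaling followed by nucleus (top-p) filtering does to a toy set of logits. It is an illustration of the general technique only, not the `transformers` implementation. The helper names `top_p_filter` and `sample` and the toy logits are hypothetical and are not part of this commit.

```python
import math
import random


def top_p_filter(logits, temperature=0.8, top_p=0.95):
    """Return {token_index: probability} after temperature scaling and top-p filtering.

    Hypothetical helper for illustration; not taken from transformers.
    """
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax over the scaled logits.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of most likely tokens whose cumulative
    # probability reaches top_p; everything after that is dropped.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


def sample(logits, rng, temperature=0.8, top_p=0.95):
    """Draw one token index from the filtered distribution."""
    dist = top_p_filter(logits, temperature, top_p)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -3.0]  # made-up values for four tokens
    print(top_p_filter(toy_logits))
    print(sample(toy_logits, random.Random(0)))
```

With these toy logits the least likely token falls outside the 0.95 nucleus, so it can never be sampled. Lowering `top_p` shrinks the candidate set further, while raising `temperature` flattens the distribution before filtering.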