RWKV
/

SmerkyG BlinkDL commited on
Commit
baf4769
·
verified ·
0 Parent(s):

Duplicate from BlinkDL/rwkv-7-pile

Browse files

Co-authored-by: BlinkDL <[email protected]>

.gitattributes ADDED
@@ -0,0 +1,35 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - EleutherAI/pile
5
+ ---
6
+
7
+ RWKV-7 trained on the Pile w/ "20b tokenizer" (332115325534 tokens)
8
+
9
+ 0.1B = L12-D768, lr 8e-4 to 3e-5 cosine decay, wd 0.1, bsz 8x30x4096
10
+
11
+ 0.4B = L24-D1024, lr 6e-4 to 2e-5 cosine decay, wd 0.1, bsz 8x30x4096
12
+
13
+ 1.5B = L24-D2048, lr 5e-4 to 1.5e-5 cosine decay, wd 0.1, bsz 8x45x4096
14
+
15
+ Check https://github.com/BlinkDL/RWKV-LM for details.
16
+
17
+ How to run it:
18
+
19
+ https://pypi.org/project/rwkv/
20
+
21
+ or
22
+
23
+ https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v7
RWKV-x070-Pile-1.47B-20241210-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2b5dc71a921e59d2eaa947c8276d4d5fac873ea12dad47f9f5a8390ff0d8e507
3
+ size 2930288193
RWKV-x070-Pile-164M-L33-D512-20241218-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9c2e586c4a09ef4c0f0ba973a84f229dd87d1eb5c27a210401731a04a021f36b
3
+ size 328874758
RWKV-x070-Pile-165M-L25-D576-20241218-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6bf07def143d3cd1a4f0bfef69159563f98f48cd9f4a15767825e4a3efad6b41
3
+ size 330455994
RWKV-x070-Pile-168M-20241120-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5db3e76067028196aa88164a2c81f4083c3f464142ee5395463402566ca29a17
3
+ size 335409434
RWKV-x070-Pile-421M-20241127-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:00bcbc502587dfc01f5f37033400c3d8a13c38da4f2c3b72ff7c8dab961a2be5
3
+ size 842414919