fla-hub/rwkv7-0.4B-world

@@ -1,5 +1,6 @@
 ---
-license: apache-2.0
 language:
 - en
 - zh
@@ -9,10 +10,10 @@ language:
 - ar
 - es
 - pt
 metrics:
 - accuracy
-base_model:
-- BlinkDL/rwkv-7-world
 pipeline_tag: text-generation
 ---
@@ -24,7 +25,6 @@ This is RWKV-7 model under flash-linear attention format.
 ## Model Details
 ### Model Description
 <!-- Provide a longer summary of what this model is. -->
@@ -100,7 +100,6 @@ This model is trained on the World v3 with a total of 3.119 trillion tokens.
 - **Training regime:** bfloat16, lr 4e-4 to 1e-5 "delayed" cosine decay, wd 0.1 (with increasing batch sizes during the middle)
 ## FAQ
 Q: safetensors metadata is none.

 ---
+base_model:
+- BlinkDL/rwkv-7-world
 language:
 - en
 - zh
 - ar
 - es
 - pt
+license: apache-2.0
+library_name: transformers
 metrics:
 - accuracy
 pipeline_tag: text-generation
 ---
 ## Model Details
 ### Model Description
 <!-- Provide a longer summary of what this model is. -->
 - **Training regime:** bfloat16, lr 4e-4 to 1e-5 "delayed" cosine decay, wd 0.1 (with increasing batch sizes during the middle)
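
The training-regime line gives only the endpoints of the learning-rate schedule (4e-4 down to 1e-5) and calls the cosine decay "delayed" without defining it. The sketch below is one possible reading, not the authors' schedule: it assumes "delayed" means the peak rate is held for an initial fraction of training before the cosine decay begins. The function name `delayed_cosine_lr` and the `delay_frac` parameter are hypothetical, and the weight decay and batch-size changes are not modeled.

```python
import math

LR_MAX = 4e-4  # peak learning rate, from the training regime line
LR_MIN = 1e-5  # final learning rate, from the training regime line


def delayed_cosine_lr(step: int, total_steps: int, delay_frac: float = 0.1) -> float:
    """Hypothetical 'delayed' cosine schedule.

    Assumption: the rate stays at LR_MAX for the first `delay_frac` of
    training, then follows a cosine curve down to LR_MIN.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0.0 <= delay_frac < 1.0:
        raise ValueError("delay_frac must be in [0, 1)")

    delay_steps = int(total_steps * delay_frac)
    if step <= delay_steps:
        return LR_MAX

    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - delay_steps)
    progress = min(1.0, (step - delay_steps) / decay_steps)

    # Standard cosine interpolation between LR_MAX and LR_MIN.
    return LR_MIN + 0.5 * (LR_MAX - LR_MIN) * (1.0 + math.cos(math.pi * progress))
```

Setting `delay_frac=0.0` reduces this to a plain cosine decay over the whole run, which makes it easy to compare the two readings.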
 ## FAQ
 Q: safetensors metadata is none.