DavidAU committed
Commit 9008d3e · verified · 1 Parent(s): 5a18a82

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -45,17 +45,17 @@ pipeline_tag: text-generation
 
  <img src="thought.jpg" style="float:right; width:300px; height:300px; padding:5px;">
 
- This as a 8X3B, Mixture of Experts model with 4/8 experts (4 Llama fine tunes) activated, all with Deepseek
+ This as a 8X3B, Mixture of Experts model with 4/8 experts (8 Llama 3.2 fine tunes) activated, all with
  Reasoning tech installed (in each one) giving you a 24B (8X3B) parameter model in only 18.4B model size.
 
- This model is a composed of EIGHT finetuned Llama 3.2 3B models for reasoning/thought.
+ This model is a composed of EIGHT finetuned Llama 3.2 3B models for reasoning/thoughts.
 
  This model can be used for creative, non-creative use cases and general usage.
 
  Three example prompts with output posted at the bottom of this page.
 
- This is a very stable model, which can operate at temps 1+ 2+ and higher and generate coherent thought(s) and exceeds the original
- many other "thinking models" in terms of performance, coherence and depth of thought.
+ This is a very stable model, which can operate at temps 1+ 2+ and higher and generate coherent thought(s) and exceeds
+ many other "thinking models" in terms of performance, coherence and depth of thought - including long train of thought reasoning.
 
  You can select/set the number of experts to use from 1 to 8.
 
@@ -75,7 +75,7 @@ See "Experts Activation" to adjust the number of experts used from 1 to 8 with t
 
  PROTOTYPE NOTES:
 
- 1. This model may go "on and on" in some cases. Set your context to at least 8k, 12 to 16 is better.
+ 1. This model may go "on and on" in some cases. Set your context to at least 8k, 12k to 16k is better as the model can easily output 12k+ in thoughts.
  2. Sometimes the model will be "all thought" and "no action" - in this case , stop the generation and tell the model to "execute the plan."
  3. Feel free to really go "wild" with temp with this model, especially creative use cases.
  4. The models selected are designed for problem solving and deep thinking/reasoning.
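
The README's claim of "a 24B (8X3B) parameter model in only 18.4B model size" can be sanity-checked with a rough parameter count. The sketch below is not taken from this commit. It assumes a Mixtral-style merge in which only the feed-forward (MLP) blocks are duplicated per expert while attention, norms and embeddings are shared, and it assumes the published Llama 3.2 3B dimensions. The function name `moe_param_count` and the `untied_lm_head` flag are hypothetical, introduced only for illustration.

```python
# Rough parameter count for a Mixtral-style MoE built from Llama 3.2 3B experts.
# ASSUMPTION: these dimensions are the published Llama 3.2 3B config values;
# none of them are stated in this commit.
HIDDEN = 3072          # model width
INTERMEDIATE = 8192    # MLP inner width
LAYERS = 28            # transformer blocks
VOCAB = 128256         # vocabulary size
KV_DIM = 1024          # 8 key/value heads x 128 head dim (grouped-query attention)


def moe_param_count(num_experts: int, untied_lm_head: bool = True) -> int:
    """Count weights when only the MLP blocks are replicated per expert."""
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM   # q, o and k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                    # gate, up, down projections
    router = HIDDEN * num_experts if num_experts > 1 else 0
    norms = 2 * HIDDEN                                 # two RMSNorm weights per block
    per_layer = attn + num_experts * mlp + router + norms
    embed = VOCAB * HIDDEN
    # ASSUMPTION: the merged model stores a separate output head.
    head = embed if untied_lm_head else 0
    return embed + LAYERS * per_layer + HIDDEN + head  # + final norm


dense = moe_param_count(1, untied_lm_head=False)
merged = moe_param_count(8)
print(f"single 3B model: {dense / 1e9:.2f}B")
print(f"8-expert merge:  {merged / 1e9:.1f}B")
```

Under these assumptions the shared attention and embedding weights are counted once, which is why the merged total lands well below 8 times the size of a single model.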