This model was converted to GGUF format from [`DavidAU/DeepSeek-MOE-4X8B-R1-Distill-Llama-3.1-Deep-Thinker-Uncensored-24B`](https://huggingface.co/DavidAU/DeepSeek-MOE-4X8B-R1-Distill-Llama-3.1-Deep-Thinker-Uncensored-24B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.

Refer to the [original model card](https://huggingface.co/DavidAU/DeepSeek-MOE-4X8B-R1-Distill-Llama-3.1-Deep-Thinker-Uncensored-24B) for more details on the model.

---
This is a 4X8B Mixture of Experts model with all 4 experts (4 Llama fine-tunes) activated, each with DeepSeek reasoning tech installed, giving you a 32B (4X8B) parameter model in only a 24.9B model size.

This is a DeepSeek model with "distilled" thinking/reasoning components fused into it.

This model can be used for creative, non-creative, and general use cases.

This is a very stable model that can operate at temps of 1+, 2+, and higher while still generating coherent thoughts, and it exceeds the original distill model (by DeepSeek) in performance, coherence, and depth of thought.

The actual "DeepSeek" thinking/reasoning tech was grafted directly into the model by DavidAU. The thinking/reasoning tech for the model at this repo comes from DeepSeek's original Llama 3.1 "Distill" model:

[https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)

This model is for all use cases, and it has a slightly more creative slant than a standard model.

This model can also be used for solving logic puzzles, riddles, and other problems using DeepSeek's enhanced "thinking" systems.

Thanks to those DeepSeek systems, this model can also solve problems, riddles, and puzzles normally beyond the abilities of a Llama 3.1 model.

This model MAY produce NSFW / uncensored content.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
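As a minimal sketch, the commands below install llama.cpp via Homebrew and run a GGUF model pulled straight from the Hugging Face Hub. The `--hf-repo` and `--hf-file` values are illustrative placeholders, not the actual names in this repo; substitute the repo name and the quantized GGUF filename you want to use.

```shell
# Install llama.cpp (provides the llama-cli binary)
brew install llama.cpp

# Run the model directly from Hugging Face.
# NOTE: the --hf-repo and --hf-file values below are placeholders;
# replace them with this repo's name and an actual GGUF file from it.
llama-cli \
  --hf-repo "your-username/DeepSeek-MOE-4X8B-R1-Distill-Llama-3.1-Deep-Thinker-Uncensored-24B-GGUF" \
  --hf-file model-q4_k_m.gguf \
  -p "Why is the sky blue?"
```

On the first run llama-cli downloads and caches the GGUF file; subsequent runs reuse the cached copy.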