ZeroXClem committed on
Commit d829f4a · verified · 1 Parent(s): aa95f61

Update README.md

Files changed (1)
  1. README.md +114 -6
README.md CHANGED
@@ -4,17 +4,42 @@ tags:
  - merge
  - mergekit
  - lazymergekit
  ---
-
  # ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix

- ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):

- ## 🧩 Configuration

- ```yaml
- # Merge configuration for ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix using MODEL STOCK

  name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
  base_model: mergekit-community/L3.1-Athena-d-8B
  dtype: bfloat16
@@ -25,4 +50,87 @@ models:
  - model: Undi95/Meta-Llama-3.1-8B-Claude
  - model: mergekit-community/good_mix_model_Stock
  tokenizer_source: mergekit-community/L3.1-Athena-d-8B
- ```
  - merge
  - mergekit
  - lazymergekit
+ - model_stock
+ - ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
+ language:
+ - en
+ base_model:
+ - Pedro13543/mega_blend_model
+ - Skywork/Skywork-o1-Open-Llama-3.1-8B
+ - Undi95/Meta-Llama-3.1-8B-Claude
+ - mergekit-community/good_mix_model_Stock
+ - mergekit-community/L3.1-Athena-d-8B
+ pipeline_tag: text-generation
+ library_name: transformers
  ---
  # ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix

+ ## Overview
+ **ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix** is a merged model built with the **model stock** merge method in **MergeKit**. It combines several models hosted on **Hugging Face** and targets a wide range of NLP tasks, including reasoning, coding, roleplay, and instruction following.

+ The merge blends a base model with several fine-tuned models, aiming for a single set of weights that retains the strengths of each contributing model.

+ ## Merge Details
+ - **Merge Method:** `model_stock`
+ - **Base Model:** [`mergekit-community/L3.1-Athena-d-8B`](https://huggingface.co/mergekit-community/L3.1-Athena-d-8B)
+ - **Dtype:** `bfloat16`
+ - **Tokenizer Source:** `mergekit-community/L3.1-Athena-d-8B`
+
+ ## Models Merged
+ The following models contributed to this fusion:
+
+ - [`Pedro13543/mega_blend_model`](https://huggingface.co/Pedro13543/mega_blend_model) - A well-balanced blend of roleplay and instruction-tuned Llama-3.1 variants.
+ - [`Skywork/Skywork-o1-Open-Llama-3.1-8B`](https://huggingface.co/Skywork/Skywork-o1-Open-Llama-3.1-8B) - Optimized for reasoning and slow-thinking capabilities.
+ - [`Undi95/Meta-Llama-3.1-8B-Claude`](https://huggingface.co/Undi95/Meta-Llama-3.1-8B-Claude) - Fine-tuned on Claude Opus/Sonnet data, improving response depth and conversational engagement.
+ - [`mergekit-community/good_mix_model_Stock`](https://huggingface.co/mergekit-community/good_mix_model_Stock) - A diverse mixture including RP-focused and knowledge-heavy datasets.

+ ## Configuration
+ ```yaml
  name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
  base_model: mergekit-community/L3.1-Athena-d-8B
  dtype: bfloat16

  - model: Undi95/Meta-Llama-3.1-8B-Claude
  - model: mergekit-community/good_mix_model_Stock
  tokenizer_source: mergekit-community/L3.1-Athena-d-8B
+ ```
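
For readers unfamiliar with the `model_stock` method named in this configuration, the following is a minimal pure-Python sketch of the interpolation idea as described in the Model Stock paper: average the fine-tuned weights, then pull that average back toward the base model by a ratio derived from the angle between the fine-tuned deltas. It is not mergekit's actual implementation. The helper names `cosine` and `model_stock_merge` are hypothetical, weights are treated as flat lists of floats, and averaging the cosine over all pairs of deltas is an assumption of this sketch.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def model_stock_merge(base, finetuned):
    """Interpolate between the base weights and the average fine-tuned weights.

    base:      flat list of floats (one tensor of the base model)
    finetuned: list of flat lists, one per fine-tuned model (same length as base)
    """
    n = len(finetuned)
    if n < 2:
        raise ValueError("this sketch needs at least two fine-tuned models")

    # Deltas of each fine-tuned model relative to the base model.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Assumption: estimate cos(theta) as the mean over all pairs of deltas.
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(a, b) for a, b in pairs) / len(pairs)

    # Interpolation ratio t = n*cos / (1 + (n - 1)*cos).
    denom = 1 + (n - 1) * cos_theta
    # Assumption: fall back to the base weights if the ratio is undefined.
    t = n * cos_theta / denom if abs(denom) > 1e-12 else 0.0

    # Merged weights: t * average(fine-tuned) + (1 - t) * base.
    average = [sum(ws) / n for ws in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(average, base)]
```

In this sketch, fine-tuned models whose deltas agree closely push the ratio toward 1 (the plain average), while deltas that point in unrelated directions push it toward 0 (the base model).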
+
+ ## Features & Improvements
+ 🔹 **Advanced Reasoning & Thoughtfulness** - The `Skywork-o1` component is included to strengthen logical thinking and problem-solving.
+
+ 🔹 **Enhanced Conversational Depth** - Including `Meta-Llama-3.1-8B-Claude` is intended to improve response structure and make dialogue more engaging.
+
+ 🔹 **Versatile Roleplay & Creativity** - Drawing on `mega_blend_model` and `good_mix_model_Stock`, the model supports immersive roleplaying and storytelling.
+
+ 🔹 **Strong Instruction Following** - The merge inherits instruction tuning from its source models, with the goal of clear, informative, and helpful responses.
+
+ ## Use Cases
+ - **Chat & Roleplay** - Supports natural, engaging, and dynamic conversational flow.
+ - **Programming & Code Generation** - Provides reliable code completions and debugging suggestions.
+ - **Creative Writing** - Generates compelling stories, character dialogues, and immersive text.
+ - **Educational Assistance** - Helps explain complex topics and answer academic questions.
+ - **Logic & Problem-Solving** - Can handle reasoning-based and structured thought processes.
+
+
+ ## 🛠 How to Use
+
+ ### 🔥 Ollama (Quick Inference)
+
+ You can run the model using **Ollama** for direct testing:
+
+ ```bash
+ ollama run hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
+ ```
+
+ ### 🤗 Hugging Face Transformers (Python)
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+ import torch
+
+ model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"
+
+ # Load tokenizer & model
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ # Initialize text generation pipeline
+ text_generator = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ # Example prompt
+ prompt = "Describe the significance of AI ethics in modern technology."
+
+ # Generate output
+ outputs = text_generator(
+     prompt,
+     max_new_tokens=200,
+     do_sample=True,
+     temperature=0.7,
+     top_k=50,
+     top_p=0.95
+ )
+
+ print(outputs[0]["generated_text"])
+ ```
+
+ ---
+
+ ## Model Alignment & Ethics
+ ⚠️ **Uncensored Use**: This model does not apply strict moderation. Users should implement appropriate **safety filters** before deployment.
+
+ ⚠️ **Responsibility Notice**: You are responsible for the outputs generated by this model. It is recommended to apply **ethical safeguards** and **content moderation** when integrating this model into applications.
+
+ 📜 **License**: Governed by the **Meta Llama 3.1 Community License Agreement**.
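
As a rough illustration of the safety-filter recommendation in this section, here is a minimal sketch of a post-generation filter that withholds output matching a blocklist. Everything in it is an assumption made for illustration: the names `BLOCKED_PATTERNS`, `FilterResult`, and `filter_output` are hypothetical, the patterns are placeholders, and a simple regex blocklist is far weaker than a dedicated moderation model or service.

```python
import re
from dataclasses import dataclass

# Hypothetical placeholder patterns; a real deployment would define its own
# policy and most likely use a trained moderation classifier instead.
BLOCKED_PATTERNS = (
    r"\bcredit card number\b",
    r"\bhome address\b",
)


@dataclass
class FilterResult:
    allowed: bool   # True if no blocked pattern matched
    text: str       # Original text, or the replacement if blocked
    matched: tuple  # Patterns that triggered the block


def filter_output(text, patterns=BLOCKED_PATTERNS,
                  replacement="[content withheld]"):
    """Return the text unchanged unless it matches a blocked pattern."""
    matched = tuple(
        p for p in patterns if re.search(p, text, flags=re.IGNORECASE)
    )
    if matched:
        return FilterResult(False, replacement, matched)
    return FilterResult(True, text, ())
```

A caller could pass the generated string (for example `outputs[0]["generated_text"]` from the Transformers snippet) through `filter_output` and only display `result.text`.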
+
+ ## Feedback & Contributions
+ We welcome feedback, bug reports, and performance evaluations! If you find improvements or wish to contribute, feel free to reach out or submit suggestions.
+
+ ---
+ **ZeroXClem Team | 2025**