---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- merge
- mergekit
- lazymergekit
- model_stock
- ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
base_model:
- Pedro13543/mega_blend_model
- Skywork/Skywork-o1-Open-Llama-3.1-8B
- Undi95/Meta-Llama-3.1-8B-Claude
- mergekit-community/good_mix_model_Stock
- mergekit-community/L3.1-Athena-d-8B
pipeline_tag: text-generation
model-index:
- name: Llama-3.1-8B-AthenaSky-MegaMix
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 63.01
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 31.39
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 27.95
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 3.69
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.9
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 27.82
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
      name: Open LLM Leaderboard
---
# ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
## Overview
**ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix** is built through **model stock merging** with **MergeKit**. It combines several strong Llama-3.1 fine-tunes from **Hugging Face**, targeting solid performance across a wide range of NLP tasks, including reasoning, coding, roleplay, and instruction following.

The merge blends high-quality foundational and fine-tuned models into a single architecture that retains the strengths of each contributing model.
## Merge Details
- **Merge Method:** `model_stock`
- **Base Model:** [`mergekit-community/L3.1-Athena-d-8B`](https://huggingface.co/mergekit-community/L3.1-Athena-d-8B)
- **Dtype:** `bfloat16`
- **Tokenizer Source:** `mergekit-community/L3.1-Athena-d-8B`
## Models Merged
The following models contributed to this fusion:
- [`Pedro13543/mega_blend_model`](https://huggingface.co/Pedro13543/mega_blend_model) - A well-balanced blend of roleplay and instruction-tuned Llama-3.1 variants.
- [`Skywork/Skywork-o1-Open-Llama-3.1-8B`](https://huggingface.co/Skywork/Skywork-o1-Open-Llama-3.1-8B) - Optimized for reasoning and slow-thinking capabilities.
- [`Undi95/Meta-Llama-3.1-8B-Claude`](https://huggingface.co/Undi95/Meta-Llama-3.1-8B-Claude) - Fine-tuned on Claude Opus/Sonnet data, improving response depth and conversational engagement.
- [`mergekit-community/good_mix_model_Stock`](https://huggingface.co/mergekit-community/good_mix_model_Stock) - A diverse mixture including RP-focused and knowledge-heavy datasets.
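Conceptually, `model_stock` averages the fine-tuned checkpoints and then interpolates back toward the base model, with an interpolation ratio derived from the angle between the models' task vectors. Below is a toy NumPy sketch of that idea, under heavy simplifications (flat weight vectors, a single mean cosine) — MergeKit's actual per-layer implementation is more involved:

```python
import numpy as np

def model_stock_merge(base, finetuned):
    """Toy sketch of model_stock: average the fine-tuned weights, then
    interpolate toward the base model by a ratio t derived from the
    angle between the task vectors (finetuned - base)."""
    k = len(finetuned)
    deltas = [w - base for w in finetuned]
    avg = base + sum(deltas) / k

    # Mean pairwise cosine similarity between task vectors.
    cosines = []
    for i in range(k):
        for j in range(i + 1, k):
            cosines.append(
                float(np.dot(deltas[i], deltas[j])
                      / (np.linalg.norm(deltas[i]) * np.linalg.norm(deltas[j])))
            )
    cos_theta = float(np.mean(cosines))

    # Closer task vectors (cos_theta -> 1) keep more of the average;
    # divergent ones pull the result back toward the base weights.
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    return t * avg + (1 - t) * base

# Demo with stand-in weight vectors (not real model weights).
base = np.zeros(4)
tuned = [np.ones(4), np.ones(4)]
merged = model_stock_merge(base, tuned)
```

When the task vectors are identical (cosine 1), `t` becomes 1 and the merge is simply their average; as they diverge, the result stays closer to the base model.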
## Configuration
```yaml
name: ZeroXClem-Llama-3.1-8B-AthenaSky-MegaMix
base_model: mergekit-community/L3.1-Athena-d-8B
dtype: bfloat16
merge_method: model_stock
models:
- model: Pedro13543/mega_blend_model
- model: Skywork/Skywork-o1-Open-Llama-3.1-8B
- model: Undi95/Meta-Llama-3.1-8B-Claude
- model: mergekit-community/good_mix_model_Stock
tokenizer_source: mergekit-community/L3.1-Athena-d-8B
```
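To reproduce the merge locally, the config above can be saved to a file and passed to MergeKit's standard `mergekit-yaml` entry point (assuming `mergekit` is installed; flags such as `--cuda` depend on your hardware):

```bash
pip install mergekit
mergekit-yaml config.yaml ./merged-model --cuda
```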
## Features & Improvements
🔹 **Advanced Reasoning & Thoughtfulness** - Thanks to the `Skywork-o1` integration, this model excels at logical thinking and problem-solving.
🔹 **Enhanced Conversational Depth** - The inclusion of `Meta-Llama-3.1-8B-Claude` adds better response structuring, making it more engaging in dialogue.
🔹 **Versatile Roleplay & Creativity** - Leveraging `mega_blend_model` and `good_mix_model_Stock`, the model supports immersive roleplaying and storytelling.
🔹 **Strong Instruction Following** - Inherits instruction-following behavior from its instruction-tuned components, yielding clear, informative, and helpful responses.
## Use Cases
- **Chat & Roleplay** - Supports natural, engaging, and dynamic conversational flow.
- **Programming & Code Generation** - Provides reliable code completions and debugging suggestions.
- **Creative Writing** - Generates compelling stories, character dialogues, and immersive text.
- **Educational Assistance** - Helps explain complex topics and answer academic questions.
- **Logic & Problem-Solving** - Can handle reasoning-based and structured thought processes.
## 🚀 How to Use
### 🔥 Ollama (Quick Inference)
You can run the model using **Ollama** for direct testing:
```bash
ollama run hf.co/ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix
```
### 🤗 Hugging Face Transformers (Python)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_name = "ZeroXClem/Llama-3.1-8B-AthenaSky-MegaMix"

# Load tokenizer & model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Initialize the text-generation pipeline (dtype and device placement
# are already set on the loaded model)
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

# Example prompt
prompt = "Describe the significance of AI ethics in modern technology."

# Generate output
outputs = text_generator(
    prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```
---
## Model Alignment & Ethics
⚠️ **Uncensored Use**: This model does not apply strict moderation. Users should implement appropriate **safety filters** before deployment.
⚠️ **Responsibility Notice**: You are responsible for the outputs generated by this model. It is recommended to apply **ethical safeguards** and **content moderation** when integrating this model into applications.
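As a minimal illustration of the kind of output filtering mentioned above — a naive keyword screen with hypothetical patterns, not a production solution (real deployments should use a dedicated moderation model or API):

```python
import re

# Hypothetical blocklist for illustration only; real moderation needs
# trained classifiers, context awareness, and human review.
BLOCKED_PATTERNS = [
    r"\bhow to build a weapon\b",
    r"\bcredit card numbers?\b",
]

def passes_filter(text: str) -> bool:
    """Return True if the generated text matches no blocked pattern."""
    return not any(re.search(p, text, re.IGNORECASE) for p in BLOCKED_PATTERNS)
```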
📜 **License**: Governed by the **Meta Llama 3.1 Community License Agreement**.
## Feedback & Contributions
We welcome feedback, bug reports, and performance evaluations! If you find improvements or wish to contribute, feel free to reach out or submit suggestions.
---
**ZeroXClem Team | 2025**
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/ZeroXClem__Llama-3.1-8B-AthenaSky-MegaMix-details)
| Metric |Value|
|-------------------|----:|
|Avg. |26.79|
|IFEval (0-Shot) |63.01|
|BBH (3-Shot) |31.39|
|MATH Lvl 5 (4-Shot)|27.95|
|GPQA (0-shot) | 3.69|
|MuSR (0-shot) | 6.90|
|MMLU-PRO (5-shot) |27.82|