Update README.md (Minor Fix)
README.md
CHANGED
@@ -540,7 +540,7 @@ base_model:
 <div class="section-content">
 <p>Creation Process: SFT</p>
 <p>SFT on approx 10 million tokens, SFW / NSFW RP, stories, creative instruct & chat data.</p>
-<p>MoE
+<p>MoE are brutal to train even with a small dataset like mine, so I took a different approach from usual. I used a very low LR in an effort to avoid having to apply DPO / KTO training afterwards.</p>
 <p>I think there's likely a better config to be found, but experimentation with the model to find it is quite draining.</p>
 <div class="dropdown-container">
 <details>