metadata
base_model:
  - Nitral-AI/Captain-Eris_Violet-0.420-Rebased
  - Nitral-AI/Captain-Eris_Violet-GRPO-Rebased
library_name: transformers
tags:
  - merge
  - finetune
  - GRPO
  - QLORA
  - SFT
license: other
language:
  - en

Update: The model image itself is now available as an importable character card for SillyTavern. This serves as an example of how to prepare your own card for use with this model.

image/png
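For anyone curious how a card can ride along inside an image, here is a minimal sketch. It assumes the common SillyTavern convention of storing the card as base64-encoded JSON in a PNG `tEXt` chunk with the keyword `chara`; that layout is an assumption and has not been checked against this model's image. The function name `read_card_json` is hypothetical.

```python
import base64
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_card_json(png_bytes, keyword=b"chara"):
    """Return the decoded card JSON from a PNG, or None if no card chunk exists.

    Assumption: the card is base64-encoded JSON in a tEXt chunk whose
    keyword is `chara`. CRCs are skipped, not verified.
    """
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")

    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(png_bytes):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", png_bytes[offset:offset + 8])
        data = png_bytes[offset + 8:offset + 8 + length]
        offset += 8 + length + 4

        if chunk_type == b"tEXt":
            # tEXt data is "<keyword>\x00<text>".
            key, _, text = data.partition(b"\x00")
            if key == keyword:
                return json.loads(base64.b64decode(text))
        if chunk_type == b"IEND":
            break
    return None
```

To inspect a card, read the image in binary mode and pass the bytes to `read_card_json`; the resulting dictionary shows which fields a card of your own would need to fill in.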

Training Notes: This model was developed using a combination of multi-stage supervised fine-tuning, pre-trained QLoRA adapters, and multi-stage RLHF optimized with GRPO. The final model was created by merging the most promising candidates identified during the process.

Quants Here: Thanks to Mradermacher <3 | Regular GGUF | Imatrix GGUF | 4bpw Exl2

SillyTavern Reasoning Block Parsing Example:

image/png

SillyTavern Mistral Formatting Example: Master Import Preset Here

image/png

Series Comparison:

image/png

The following YAML configuration was used to produce this final version of the model:

slices:
  - sources:
      - model: Nitral-AI/Captain-Eris_Violet-0.420-Rebased
        layer_range: [0, 40]
      - model: Nitral-AI/Captain-Eris_Violet-GRPO-Rebased
        layer_range: [0, 40]
merge_method: slerp
base_model: Nitral-AI/Captain-Eris_Violet-0.420-Rebased
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.420
dtype: bfloat16
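
To make the `t` settings in the config above easier to read, here is a rough NumPy sketch of spherical linear interpolation (slerp) and of how a list such as `[0, 0.5, 0.3, 0.7, 1]` could be stretched across the 40 layers. It is an illustration under stated assumptions, not mergekit's actual code: the helper names `layer_t` and `slerp` are hypothetical, and the exact mapping from layer index to gradient position is assumed to be linear. With slerp, `t = 0` returns the base model's tensor and `t = 1` returns the other model's tensor.

```python
import numpy as np


def layer_t(gradient, layer_idx, num_layers):
    """Interpolate a gradient list across layer depth (assumed linear mapping)."""
    if num_layers == 1:
        return float(gradient[0])
    pos = layer_idx / (num_layers - 1)  # 0.0 at the first layer, 1.0 at the last
    anchors = np.linspace(0.0, 1.0, len(gradient))
    return float(np.interp(pos, anchors, gradient))


def slerp(t, a, b, eps=1e-8):
    """Spherical interpolation between two weight tensors of the same shape."""
    a_flat = a.ravel().astype(np.float64)
    b_flat = b.ravel().astype(np.float64)

    # Angle between the two tensors, measured on their normalized copies.
    a_unit = a_flat / (np.linalg.norm(a_flat) + eps)
    b_unit = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = float(np.clip(np.dot(a_unit, b_unit), -1.0, 1.0))

    # Nearly parallel tensors: fall back to plain linear interpolation.
    if abs(dot) > 1.0 - 1e-6:
        return ((1.0 - t) * a_flat + t * b_flat).reshape(a.shape)

    omega = np.arccos(dot)
    sin_omega = np.sin(omega)
    out = (np.sin((1.0 - t) * omega) / sin_omega) * a_flat \
        + (np.sin(t * omega) / sin_omega) * b_flat
    return out.reshape(a.shape)


# Gradients copied from the config; 40 layers per layer_range: [0, 40].
SELF_ATTN = [0, 0.5, 0.3, 0.7, 1]
MLP = [1, 0.5, 0.7, 0.3, 0]

for idx in (0, 20, 39):
    print(idx, layer_t(SELF_ATTN, idx, 40), layer_t(MLP, idx, 40))
```

Read this way, the attention weights start at the base model (Captain-Eris_Violet-0.420-Rebased) in the first layers and end at the GRPO model in the last ones, the MLP weights run in the opposite direction, and every other tensor uses a fixed `t` of 0.420.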