About the "long ass time" versus model quality.
Hello,
In your model card you wrote that it took a "long ass time" to make this merge, but I wanted to let you know that it's a very interesting merge. Overall I think it was worth the time and effort, and I appreciate that you invested it into making this model a reality!
I think this is the very first time I've seen a model make a connection that is usually unlikely to show up but is actually inevitable as a logical consequence: the user's superpower should affect everyone around the user, meaning the char as well as the side characters it controls, even if they are not properly defined in the character description. A good AI should know about them and take them into account, and this model did, for some magical reason. This kind of little nuance is something I've never seen any model do, not even bigger ones.
Of course I wanted to explore this phenomenon using rerolls and different parameters to see how far it can be pushed, but unfortunately, during my brief testing, I have not yet managed to reproduce the same or better results, so I'll keep trying.
Still, this model is a surprising little gem, so thanks for it! ❤
Thank you very much for your feedback!
I am slowly but surely progressing towards my newer RP model release, and I produced this model primarily as a baseline to benchmark it against (for the middle layer range).
I am trying to take a more deliberate approach and run multiple evaluations in lm-eval-harness to determine whether the model is good beyond just feels.
I'm really glad that you found this one good. That sets the bar pretty high already, but the goal is to surpass it!