
I personally like the v3 more...

#2
by Laetilia - opened

In my humble opinion, while this model isn't bad, I like the v3 more.
The v3 feels more creative and active in comparison, even if more finicky.

At the recommended sampling settings, the v4 felt partially like an AI assistant.
A sort of yes-man, which has the coherence and the smarts, but is not really... interesting?

After playing with sampling settings for a while, I liked the behavior of v4 more with these...
--temp 1.2 --min-p 0.1 --top-p 0.95 --dry-multiplier 0.8
...sampling settings, which sacrificed some coherence for more creativity and independence.
Still, the model felt like an LLM, and kept asking "Why did you do X?", over and over and over again.
As well as "What happens next?", over and over and over again. Rather unpleasant.
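
For anyone unsure what the temp, min-p and top-p values above actually do to the token distribution, here is a minimal sketch in plain Python. It is an illustration, not what any backend really runs: the function names are made up for this example, the order of the steps is a simplification (real sampler chains are configurable and may apply them in a different order), and DRY is left out entirely.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature > 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def min_p_filter(probs, min_p):
    """Drop tokens whose probability is below min_p times the top token's probability."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = [0.0] * len(probs)
    cumulative = 0.0
    for i in order:
        kept[i] = probs[i]
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(kept)
    return [p / total for p in kept]

def sample_distribution(logits, temperature=1.2, min_p=0.1, top_p=0.95):
    """Hypothetical helper: apply the three settings in one fixed (assumed) order."""
    probs = softmax(logits, temperature)
    probs = top_p_filter(probs, top_p)
    probs = min_p_filter(probs, min_p)
    return probs

if __name__ == "__main__":
    # Made-up logits for five candidate tokens.
    print(sample_distribution([4.0, 3.0, 2.0, 0.0, -2.0]))
```

Roughly: a higher temp gives unlikely tokens more of a chance, while min-p and top-p cut off the tail so the extra randomness does not turn into nonsense.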

I tried v4 at Q5_K_M, which I think is a good quant precision.
It is plausible that I'd like another quant more (Q5 is what I liked the most for v3).
Or that if I played with sampling even more (even higher temp? XTC? adjusting Top K?), I'd like v4 more.
But, at least for now, I'm tired of poking at v4 when the v3, which I know how to use nowadays, feels much better.

Don't get me wrong, the v4 is a very decent, or perhaps even a good fine-tune.
It can be quite fun to roleplay with!
However, in my humble opinion and experience, the v3 is excellent (better).

And, of course, settings and experiences of other people can differ from mine.

Thanks for the feedback!

I've heard similar things from other people using the model: it maybe feels a bit overcooked. I was testing a ton of wildly different settings and data when I trained it, and I think I know why it feels like that. This model was a merge of two models: a definitely overcooked, rock-solid, stable but boring model, and a model that wrote amazingly but had brain farts. I think that rock-solid model is coming through a bit too much.

Before I make another attempt, though, I'm doing a big manual slop rewrite / cleanup, as I've let that get away from me with all the new datasets.
