Excellent work

#5
by user59365 - opened

Excellent work on this, easily the best 24B model I've tried, and I've tried many. This one has that "large model" feel of paid services, meaning coherent and well-versed. It even fits snugly into a 16GB card with the right quant level and back-end adjustments.

Keep it up Drummer, looking forward to any new models and updates.

Agreed, the prose is fantastic and solid.

The only issue I've run into is severe degradation after around 10k context: coherence remains, but all that well-versed quality goes out the window, which is a shame, as that is way too little for any long-form chatting.

What context window was this trained on? I'm asking to rule out user error; alternatively, how could it be extended to hold up for at least 20k of context?

Do you know if that's the case with Cydonia as well?

Haven't tested Cydonia, but you can assume it's the case, as both models share the same metadata (same base, same parameter count, etc.).
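For anyone wanting to check the configured context length themselves: it's usually recorded as `max_position_embeddings` in the model repo's `config.json`. A minimal sketch below, with illustrative values only (not the actual numbers for this model):

```python
import json

# Hypothetical excerpt of a model's config.json; the values here are
# illustrative placeholders, not the real settings for this model.
config_text = '{"max_position_embeddings": 32768, "rope_theta": 1000000.0}'
config = json.loads(config_text)

# max_position_embeddings is the context length the base model was
# configured for; fine-tunes can still degrade well before this limit.
print(config["max_position_embeddings"])
```

Note that this is the architectural limit, not a quality guarantee: degradation around 10k can happen even when the config advertises far more.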