Tested the quant locally on the MMLU-Pro "computer science" benchmark section

#7
by CoruNethron - opened

Hello. Thank you for this release.

I compared it against Qwen3 25B A3B REAP by Cerebras, running the MMLU-Pro "computer science" section.

| Model | Quant | Experts/token | Context limit | Correct answers |
|-------|-------|---------------|---------------|-----------------|
| aquif | Q6_K | 8 | 4000 | 82.2% |
| aquif | Q6_K | 16 | 8192 | 77.6% |
| REAP | iQ5K_M | 16 | 8192 | 69.0% |
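
In case someone wants to reproduce this: here's roughly how the expert count can be overridden at load time. A minimal sketch, assuming llama-cpp-python; the model path is hypothetical, and the `qwen3moe` GGUF key name is an assumption that may differ for this release.

```python
# Minimal sketch, assuming llama-cpp-python. The model path is hypothetical,
# and "qwen3moe.expert_used_count" assumes the GGUF uses the qwen3moe
# architecture key -- check the model's metadata for the exact name.
from llama_cpp import Llama

llm = Llama(
    model_path="aquif-Q6_K.gguf",  # hypothetical local path
    n_ctx=8192,                    # context limit used in the 16-expert run
    kv_overrides={
        # Number of experts activated per token (default comes from the GGUF).
        "qwen3moe.expert_used_count": 16,
    },
)

out = llm.create_completion("Question: ...", max_tokens=256)
print(out["choices"][0]["text"])
```

With plain llama.cpp, `--override-kv qwen3moe.expert_used_count=int:16` should do the same thing.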

Admittedly, the comparison isn't very representative either way, both because cerebras/REAP is pruned and because it ran at fewer bits per parameter.

I wonder why increasing the number of activated experts actually dropped quality.

Also, I noticed a tendency toward repetitive outputs, where the model begins to loop over a few paragraphs or even a few tokens over and over. This happened with both models; I didn't set any repetition penalty parameters.
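
For reference, this is where a repetition penalty would go; a minimal sketch, again assuming llama-cpp-python, with an illustrative penalty value and a hypothetical model path.

```python
# Minimal sketch, assuming llama-cpp-python; the penalty value is illustrative.
from llama_cpp import Llama

llm = Llama(model_path="aquif-Q6_K.gguf", n_ctx=8192)  # hypothetical path

out = llm.create_completion(
    "Question: ...",
    max_tokens=256,
    repeat_penalty=1.1,  # values > 1.0 down-weight recently generated tokens
)
print(out["choices"][0]["text"])
```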

aquif AI org

> I wonder why increasing the number of activated experts actually dropped quality.

That's actually common in MoE models.

When you activate more than the default number of experts per token, you may be routing the input to non-specialized experts that dilute the quality of the output.
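
Here's a toy sketch of what we mean (made-up numbers, and plain top-k softmax gating as an assumption about the routing scheme): once k exceeds the set of experts the router actually prefers, the extra experts carry low weight but still mix their output into the result.

```python
# Toy illustration of top-k MoE gating (made-up numbers, not this model's
# actual router): raising k pulls in experts the router gave near-zero
# preference, so they dilute the specialized experts' contribution.
import numpy as np

# 8 experts the router strongly prefers (logit 3.0) and 56 it doesn't (logit 0).
logits = np.concatenate([np.full(8, 3.0), np.zeros(56)])

def routed_weights(logits, k):
    """Softmax over the top-k router logits; all other experts get weight 0."""
    top = np.argsort(logits)[-k:]
    w = np.exp(logits[top] - logits[top].max())
    return top, w / w.sum()

for k in (8, 16):
    top, w = routed_weights(logits, k)
    preferred = w[np.isin(top, np.arange(8))].sum()
    print(f"k={k}: weight mass on the preferred experts = {preferred:.2f}")
```

At k=8 all the gating weight sits on the preferred experts; at k=16 part of it shifts to experts the router barely wanted, which is the dilution effect.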

> Also, I noticed a tendency toward repetitive outputs, where the model begins to loop over a few paragraphs or even a few tokens over and over. This happened with both models.

Yeah, we didn't improve that much over the base model there. We plan on reducing repetition in future model releases. Stay tuned!

aquiffoo changed discussion status to closed
