Calibration Data
Hi,
I just had a first conversation with the model in German, and there was at least one pretty obvious grammatical mistake in its response. I will test some more, but could it be that you only use English-language text for the calibration data? I noticed something similar with the QAT Gemma models by Google and am wondering if that might be the issue.
Calibration data shouldn’t be the cause here. (German is included.)
During quantization, the model inevitably loses part of its learned representation.
As bits are discarded, frequent linguistic patterns tend to be retained, while memorization of specific literature or less-represented languages may deteriorate more noticeably.
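To make that concrete, here's a toy sketch in plain PyTorch (not any specific quantization library) of round-to-nearest 4-bit quantization: every weight gets snapped to one of 16 levels, so some of the information the weights encoded is simply gone, and calibration-based methods only get to choose where that error lands.

```python
import torch

def quantize_rtn(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Symmetric per-tensor round-to-nearest quantization (toy example)."""
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit signed
    scale = w.abs().max() / qmax          # map the largest |weight| onto qmax
    q = torch.clamp(torch.round(w / scale), min=-qmax - 1, max=qmax)
    return q * scale                      # dequantized approximation of w

w = torch.randn(1024, 1024)
w_hat = quantize_rtn(w)
print(f"mean absolute reconstruction error: {(w - w_hat).abs().mean().item():.4f}")
```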
So if you’re seeing grammatical issues in German after quantization, it’s most likely due to its smaller share in the model’s original training data rather than the calibration data itself.
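If you want to rule out the calibration data yourself, one way is to re-quantize with a calibration set you assemble. A rough sketch, assuming the Hugging Face `datasets` library and the `wikimedia/wikipedia` dumps (the config names and the downstream quantization tool are my assumptions, not what was actually used here):

```python
from datasets import load_dataset

def build_calibration_texts(langs=("en", "de"), n_per_lang=128, max_chars=2000):
    """Sample a small, mixed-language calibration corpus."""
    texts = []
    for lang in langs:
        # Config name is an assumption; swap in whatever multilingual corpus you prefer.
        ds = load_dataset("wikimedia/wikipedia", f"20231101.{lang}",
                          split="train", streaming=True)
        for i, row in enumerate(ds):
            if i >= n_per_lang:
                break
            texts.append(row["text"][:max_chars])
    return texts

calibration_texts = build_calibration_texts()
# Feed `calibration_texts` to your quantization tool of choice (GPTQ, AWQ, imatrix, ...).
```

If adding more German to the calibration set doesn't change the output quality, that points back at the base model's training mix, which is the point above.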
Just my take.