When quantizing gemma2 in W8A8 format, the input is not positive-definite and gemma2-27B cannot be quantized. #1152
Comments
Hi @HelloCard! This issue is due to inherent numerical instability in the GPTQ algorithm. Below are a few courses of action that can help randomize the data and avoid the instability. Updating to the latest release / building from source may also fix the issue, since there are very slight differences in the algorithm implementation (see llm-compressor/src/llmcompressor/modifiers/quantization/gptq/utils/gptq_wrapper.py, line 180 at 606aab2).
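For context, a common first mitigation is to raise the Hessian dampening used by GPTQ. The sketch below is illustrative only, not from the thread; it assumes the GPTQModifier recipe interface and its dampening_frac parameter as shown in the library's published examples, and names may differ slightly between versions.

```python
# Illustrative sketch: raising the Hessian dampening fraction when GPTQ
# reports "input is not positive-definite". Import paths and parameter
# names follow the llm-compressor examples and may vary between releases.
from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = GPTQModifier(
    targets="Linear",
    scheme="W8A8",
    ignore=["lm_head"],
    dampening_frac=0.1,  # larger than the default; trades a little accuracy for stability
)

oneshot(
    model="byroneverson/gemma-2-27b-it-abliterated",
    dataset="open_platypus",  # any sufficiently varied calibration set
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)
```

A larger dampening fraction adds a bigger multiple of the mean Hessian diagonal before the Cholesky factorization, which is what keeps the matrix positive-definite at the cost of slightly higher quantization error.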
Please let me know if none of these solutions work for you.
@kylesayrs
@HelloCard Hm, I'll dig a little deeper into this. Hessian instability can sometimes be a symptom of incorrect data preprocessing or modeling. Please make sure that your dataset has non-identical samples and that your model's weights are loading correctly.
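Not part of the original exchange, but a quick way to check both of those conditions before re-running quantization might look like the following pure-PyTorch sketch (function name and structure are hypothetical):

```python
from collections import Counter
import torch

def calibration_sanity_check(model, calibration_texts):
    """Flag duplicated calibration samples and non-finite weights,
    both of which can make the GPTQ Hessian ill-conditioned."""
    # Identical samples add no new information and can leave the
    # Hessian rank-deficient, so the Cholesky factorization fails.
    counts = Counter(calibration_texts)
    duplicates = [text for text, n in counts.items() if n > 1]
    print(f"{len(duplicates)} duplicated samples out of {len(calibration_texts)}")

    # NaN/Inf weights (e.g. from a bad checkpoint load) also break the factorization.
    bad_params = [
        name for name, param in model.named_parameters()
        if not torch.isfinite(param).all()
    ]
    print(f"{len(bad_params)} parameters containing NaN/Inf: {bad_params[:5]}")
    return not duplicates and not bad_params
```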
@kylesayrs Any further suggestions?
Describe the bug
I have tried versions v0.1.0 through v0.4.0 and the behavior is the same: a very large error value and the message "input is not positive-definite".
The model is byroneverson/gemma-2-27b-it-abliterated, and it should be fairly easy to reproduce the problem.
Below are the error message, the running environment, and the script I used.
Environment
To Reproduce
Errors