Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Re-enable Sparse Compression for 2of4 Examples #1153

Merged
merged 4 commits into from
Feb 14, 2025
Merged

Conversation

rahul-tuli
Copy link
Collaborator

@rahul-tuli rahul-tuli commented Feb 14, 2025

This PR restores sparse compression for our 2of4 examples, which was previously disabled due to a bug in the vLLM Cutlass integration.

Background

A bug in the Cutlass integration caused certain sparse-only compressed models to produce gibberish results. To mitigate this issue, we temporarily turned off sparse compression for our 2of4 examples.

The bug has since been fixed by @tlrmchlsmth in vllm-project/vllm#13198. With this fix in place, we can safely re-enable sparse compression for these examples.

Changes

  • Re-enable sparse compression for 2of4 examples.

Testing

  • Verified that sparse-only compressed models now produce expected outputs.

@rahul-tuli rahul-tuli self-assigned this Feb 14, 2025
Copy link

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@rahul-tuli rahul-tuli marked this pull request as ready for review February 14, 2025 15:43
kylesayrs
kylesayrs previously approved these changes Feb 14, 2025
Copy link
Collaborator

@kylesayrs kylesayrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a comment about prev vllm versions

Copy link

@tlrmchlsmth tlrmchlsmth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥳

Copy link
Collaborator

@brian-dellabetta brian-dellabetta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

horheynm
horheynm previously approved these changes Feb 14, 2025
Copy link
Collaborator

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update the E2E test case as well?

Copy link
Collaborator

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider adding a comment to the example about disable_sparse_compression (just that it exists/can be used)?

@rahul-tuli
Copy link
Collaborator Author

It is there in the readme as a separate section, if anyone wants to try

Signed-off-by: Rahul Tuli <[email protected]>
Signed-off-by: Rahul Tuli <[email protected]>
@rahul-tuli
Copy link
Collaborator Author

Can we update the E2E test case as well?

Done and tested!

@dsikka dsikka added the ready When a PR is ready for review label Feb 14, 2025
@dsikka dsikka enabled auto-merge (squash) February 14, 2025 21:48
@dsikka dsikka merged commit cdf686f into main Feb 14, 2025
7 checks passed
@dsikka dsikka deleted the update-2of4-example branch February 14, 2025 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready When a PR is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants