[e2e] Update vllm tests with additional datasets #1131
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Force-pushed from a1abe8a to 35f1d50.
If you could determine and note how much extra time these new additions add to the e2e test runs, that would be super helpful! Ideally we can preempt any required timeout changes to the nightly runs. We have a custom test-run workflow which you can easily configure to run that specific set of tests against this PR's branch (with a different timeout) to do the measurements. I'm happy to help you use that workflow; just ping me on Slack.
@dbarbuzzi sure thing! Worth discussing in standup Monday to see how to balance coverage and time to complete. Right now it's 6 new configs on top of 22 pre-existing ones; not sure if we want/need that many. Probably 15-20 more minutes total as-is.
@dsikka responded that an additional 15-20 minutes of nightly run time will not be a concern, so moving this to ready. Just FYI, this requires downloading the following datasets/models. Is there a way to run the nightly tests to confirm, or should we just merge this in today and see how the nightly tests go tonight?
Force-pushed from 8e7484f to 7b39550.
tests/llmcompressor/transformers/sparsification/test_compress_tensor_utils.py
great job!
It'd be nice to move some of these preprocessing and tokenization function definitions to TextGenerationDataset in the future so that others can use them, but no rush there.
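The reviewer's suggestion above can be sketched as follows. This is a hedged illustration only: the class, method, and template names here are hypothetical stand-ins, not the actual llm-compressor `TextGenerationDataset` API.

```python
# Illustrative sketch of hoisting per-test preprocessing helpers onto a
# shared dataset base class so individual e2e configs can reuse them.
# All names below are hypothetical, not the real llm-compressor API.
class TextGenerationDataset:
    """Base class holding reusable dataset preprocessing helpers."""

    # Assumed prompt template; real tests may format rows differently
    PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"

    @classmethod
    def preprocess(cls, example: dict) -> dict:
        # Map a raw dataset row to a single "text" field used for calibration
        return {"text": cls.PROMPT_TEMPLATE.format(instruction=example["instruction"])}


class OpenPlatypusDataset(TextGenerationDataset):
    # Subclasses override only what differs (e.g. column names or templates)
    pass


row = {"instruction": "Summarize the release notes."}
print(OpenPlatypusDataset.preprocess(row)["text"])
```

With helpers on the base class, each new dataset config only declares what differs rather than redefining its own preprocessing function.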
LGTM!
All tests are passing except this nightly config:

cadence: "nightly"
test_type: "regression"
model: Qwen/Qwen2.5-0.5B
recipe: tests/e2e/vLLM/recipes/Sparse_2of4/recipe_sparse_2of4_fp8_dynamic.yaml
scheme: sparse2of4_fp8_dynamic
dataset_id: garage-bAInd/Open-Platypus
dataset_split: train
Force-pushed from ff2cb87 to adf9210.
LGTM!
tests/llmcompressor/transformers/sparsification/test_compress_tensor_utils.py
0f90d94
…bration, open-elm and qwen Signed-off-by: Brian Dellabetta <[email protected]>
Force-pushed from 0f90d94 to 0166ffc.
SUMMARY:
An e2e test was removed from #1131 because it was failing in vllm for a reason that has since been resolved by vllm-project/vllm#13198. This re-adds the test shown [here](#1131 (comment)). I confirmed it runs with [the nightly vllm wheel built by the testing CI/CD](https://github.com/neuralmagic/llm-compressor-testing/actions/runs/13360960551) and adds <2 minutes to the nightly test time.

TEST PLAN:
No new src code to test.

Signed-off-by: Brian Dellabetta <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
SUMMARY:
Adding a handful more e2e tests with 3 more datasets
and a new SLM:
I also included an env var flag to skip uploads to HF, defaulting to the original behavior. I found this useful for tests.
This adds 15-20 minutes of extra testing (and 1.2GB of HF assets to download) to the nightly runs, which the team has said is fine.
To run (you'll have to update your path & device_id):
TEST PLAN:
Additional config files for a broader range of datasets and an additional model.
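The env var toggle mentioned in the summary can be sketched as below. This is a minimal illustration, not the PR's actual implementation: the variable name `HF_UPLOAD_SKIP` is an assumed placeholder, and the real flag in the test code may be named and parsed differently.

```python
import os

# Hypothetical sketch of a "skip HF uploads" env var gate for tests.
# HF_UPLOAD_SKIP is an assumed name, not necessarily the flag this PR adds.
def should_upload_to_hf() -> bool:
    # Default preserves the original behavior: upload unless explicitly disabled
    return os.environ.get("HF_UPLOAD_SKIP", "0").lower() not in ("1", "true", "yes")


if should_upload_to_hf():
    print("uploading compressed model artifacts to the HF Hub")
else:
    print("skipping HF upload (HF_UPLOAD_SKIP is set)")
```

Defaulting the gate to "upload" keeps existing nightly behavior unchanged, while local runs can export the variable to avoid pushing 1.2GB of test artifacts.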