Release v0.4.1 · vllm-project/llm-compressor

What's Changed

Remove version by @dsikka in #1077
Require 'ready' label for transformers tests by @dbarbuzzi in #1079
GPTQModifier Nits and Code Clarity by @kylesayrs in #1068
Also run on pushes to main by @dbarbuzzi in #1083
VLM: Phi3 Vision Example by @kylesayrs in #1032
VLM: Qwen2_VL Example by @kylesayrs in #1027
Composability with sparse and quantization compressors by @rahul-tuli in #948
Remove TraceableMistralForCausalLM by @kylesayrs in #1052
[Fix Test Failure]: Propagate name change to test by @rahul-tuli in #1088
[Audio] Support Audio Datasets by @kylesayrs in #1085
[Test Fix] Add Quantization then finetune tests by @horheynm in #964
[Smoothquant] Phi3 Vision Mappings by @kylesayrs in #1089
[VLM] Multimodal Data Collator by @kylesayrs in #1087
VLM: Model Tracing Guide by @kylesayrs in #1030
Turn off 2:4 sparse compression until supported in vllm by @rahul-tuli in #1092
[Test Fix] Fix Consecutive oneshot by @horheynm in #971
[Bug Fix] Fix test that requires GPU by @horheynm in #1096
Add Idefics3/SmolVLM quant support via traceable class by @leon-seidel in #1095
Traceability Guide: Clarity and typo by @kylesayrs in #1099
[VLM] Examples README by @kylesayrs in #1057
Raise warning for 2:4 compressed sparse-only models by @rahul-tuli in #1107
Remove log_model_load by @kylesayrs in #1016
Return empty sparsity config if targets and ignores are empty by @rahul-tuli in #1115
Remove uses of get_observer by @kylesayrs in #939
FSDP utils cleanup by @kylesayrs in #854
Update maintainers, add notice by @kylesayrs in #1091
Replace readme paths with urls by @kylesayrs in #1097
GPTQ add arXiv link, move file location by @kylesayrs in #1100
Extend remove_hooks to remove subsets by @kylesayrs in #1021
[Audio] Whisper Example and Readme by @kylesayrs in #1106
[Audio] Add whisper fp8 dynamic example by @kylesayrs in #1111
[VLM] Update pixtral data collator to reflect latest transformers changes by @kylesayrs in #1116
Use unique test names in TestvLLM by @dbarbuzzi in #1124
Remove smoothquant from examples by @kylesayrs in #1121
Extend disable_hooks to keep subsets by @kylesayrs in #1023
Unpin pynvml to fix e2e test failures with vLLM by @dsikka in #1125
Replace LayerCompressor with HooksMixin by @kylesayrs in #1038
[Oneshot Refactor] Rename get_shared_processor_src to get_processor_name_from_model by @horheynm in #1108
Allow Shortcutting Min-max Observer by @kylesayrs in #887
[Polish] Remove unused code by @horheynm in #1128
Properly restore training mode with eval_context by @kylesayrs in #1126
SQ and QM: Remove torch.cuda.empty_cache, use calibration_forward_context by @kylesayrs in #1114
[Oneshot Refactor] dataclass Arguments by @horheynm in #1103
[Bugfix] SparseGPT, Pipelines by @kylesayrs in #1130
[Oneshot refactor] Refactor initialize_model_from_path by @horheynm in #1109
[e2e] Update vllm tests with additional datasets by @brian-dellabetta in #1131
Update: SparseGPT recipes by @rahul-tuli in #1142
Add timer support for testing by @dsikka in #1137
[Audio] Support Whisper V3 by @kylesayrs in #1147
Fix: Re-enable Sparse Compression for 2of4 Examples by @rahul-tuli in #1153
[VLM] Add caption to flickr dataset by @kylesayrs in #1138
[VLM] Update mllama traceable definition by @kylesayrs in #1140
Fix CPU Offloading by @dsikka in #1159
[TRL_SFT_Trainer] Fix and Update Examples code by @horheynm in #1161
[TRL_SFT_Trainer] Fix TRL-SFT Distillation Training by @horheynm in #1163
Bump version for patch release by @dsikka in #1166
Update DeepSeek Examples by @dsikka in #1175
Update gemma2 examples with a note about sample generation by @dsikka in #1176

New Contributors

@leon-seidel made their first contribution in #1095

Full Changelog: 0.4.0...0.4.1
