Extend disable_hooks to keep subsets #1023
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.
The new changes LGTM! Great work @kylesayrs
## Purpose ##
* Allow subsets of hooks to be removed
* Not strictly needed, but helps promote code clarity in the case of wanda, which adds and removes subsets of hooks at different times

## Postrequisites ##
* #1023
* Layer compressor deprecation

## Changes ##
* Change the datatype of `_hooks` from `List` to `Set`
* Add `handles` argument to `HooksMixin.remove_hooks` (see the sketch below)

## Testing ##
* Added `test_remove_hooks_parameterized` test

---------

Signed-off-by: Kyle Sayers <[email protected]>
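For context, here is a minimal sketch of the `handles` argument described above. It is illustrative only and not the actual llm-compressor `HooksMixin`; the default of `None` meaning "remove all hooks" follows the review comment below.

```python
from typing import Callable, Optional, Set

import torch
from torch.utils.hooks import RemovableHandle


class HooksMixin:
    """Illustrative stand-in for a hooks mixin; not the real implementation."""

    def __init__(self):
        # Stored as a Set rather than a List so subsets can be removed cheaply
        self._hooks: Set[RemovableHandle] = set()

    def register_hook(self, module: torch.nn.Module, fn: Callable) -> RemovableHandle:
        handle = module.register_forward_hook(fn)
        self._hooks.add(handle)
        return handle

    def remove_hooks(self, handles: Optional[Set[RemovableHandle]] = None):
        """Remove the given handles, or every registered hook when handles is None."""
        targets = set(self._hooks) if handles is None else self._hooks & handles
        for handle in targets:
            handle.remove()
        self._hooks -= targets


# Wanda-style usage: register two hooks, later remove only the first
mixin = HooksMixin()
layer = torch.nn.Linear(8, 8)
h1 = mixin.register_hook(layer, lambda mod, inp, out: None)
h2 = mixin.register_hook(layer, lambda mod, inp, out: None)
mixin.remove_hooks(handles={h1})  # h2 stays registered
```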
LGTM pending @brian-dellabetta 's comment on defaulting to None
🚀
## Purpose ##
* Remove layer compressor to decouple modifiers from data pipelines
* Reduce abstractions
* Support VLMs with SparseGPT and Wanda

## Prerequisites ##
* #1021
* #1023
* #1068
* #1030

## Changes ##
### Interface/Features ###
* SparseGPT and Wanda now both support VLM architectures
* Added `sequential_targets` to match GPTQ and made `targets` an alias (see the configuration sketch below)
* Support hessian offloading for `SparseGPT`
* Add customized `_LinAlgError` for `SparseGPT`

### Implementations ###
* Changed implementation styles of `SparseGPTModifier` and `WandaPruningModifier` to match `GPTQModifier`
* Removed `LayerCompressor`, `ModuleCompressionWrapper`, `SparseGptWrapper`, and `WandaWrapper`
* Shared implementations between SparseGPT and Wanda are implemented by the `SparsityModifierMixin`
* Removed lines blocking `allow_tf32`
  * Maybe @rahul-tuli knows why this was originally implemented, potentially to avoid hardware issues?
  * This change was only present for Wanda. Given that no other modifier does this, I see no reason why it should stay
* Updated sparsegpt tests to reflect the new implementation

### Tests ###
* Updated obcq tests to reflect the new implementations
* Removed `test_sgpt_defaults.py`, since this test doesn't cover anything new or novel about this modifier

## Testing ##
* `grep -r "LayerCompressor\|ModuleCompressionWrapper\|SparseGptWrapper\|WandaWrapper" src/ examples/ tests/`
* Modified `test_invalid_layerwise_recipes_raise_exceptions` and `test_successful_layerwise_recipe` pass
* `llama3_8b_2of4.py` passes and was evaluated with both SparseGPT and Wanda

## Potential Follow-ups ##
* Add module `targets` and `ignore` to SparseGPT and Wanda

## Regression Testing ##
The hessian, row scalar, and compressed weight values were confirmed to be unchanged in the case of one calibration sample. The final evaluations differ, likely due to numerical imprecision (dividing by int vs torch.int) and different pipelines (different subgraph partitions => different imprecision from cpu offloading, potentially different module arguments).

### Evaluation ###
Models were compressed using `examples/sparse_2of4_quantization_fp8/llama3_8b_2of4.py`.

<details><summary>sparsegpt</summary>

Main
```
hf (pretrained=/home/ksayers/llm-compressor/old_Llama-3.2-1B-Instruct2of4-sparse,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.5391|±  | 0.014|
```

Branch
```
hf (pretrained=/home/ksayers/llm-compressor/new_Llama-3.2-1B-Instruct2of4-sparse,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value|   |Stderr|
|----------|------:|------|-----:|------|---|----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.547|±  | 0.014|
```
</details>

To test Wanda, the `SparseGPTModifier` was replaced with the `WandaPruningModifier`.

<details><summary>wanda</summary>

Main
```
hf (pretrained=/home/kyle/old_llm-compressor/Llama-3.2-1B-Instruct2of4-sparse,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value|   |Stderr|
|----------|------:|------|-----:|------|---|----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.532|±  | 0.014|
```

Branch
```
hf (pretrained=/home/kyle/llm-compressor/Llama-3.2-1B-Instruct2of4-sparse,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.5414|±  | 0.014|
```
</details>

---------

Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
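As a point of reference, here is a hedged sketch of how the reworked modifier might be configured after this change. The import path and argument names (`sparsity`, `mask_structure`, `sequential_targets`) are taken from the PR description and typical llm-compressor usage, not copied from the merged example script.

```python
# Assumed import path and arguments; verify against the merged llama3_8b_2of4.py
from llmcompressor.modifiers.obcq import SparseGPTModifier

recipe = SparseGPTModifier(
    sparsity=0.5,                              # 2:4 implies 50% sparsity
    mask_structure="2:4",                      # semi-structured pruning mask
    sequential_targets=["LlamaDecoderLayer"],  # new name; `targets` remains an alias
)
```

Per the description, swapping `SparseGPTModifier` for `WandaPruningModifier` is how the wanda numbers above were produced.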
## Purpose ##
* Allow subsets of hooks to be kept enabled while hooks are disabled

## Prerequisites ##
* Extend `remove_hooks` to remove subsets #1021

## Postrequisites ##
* Layer compressor deprecation

## Changes ##
* Add `_HOOKS_KEEP_ENABLED` class variable (see the sketch below)

## Tests ##
* Added `test_disable_hooks_keep` in tests
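A minimal sketch of the keep-subset behavior, assuming a class-level `_HOOKS_KEEP_ENABLED` set that wrapped hooks consult before firing. Names mirror the summary above; the actual llm-compressor implementation may differ.

```python
import contextlib
from typing import Callable, Collection, Set

import torch
from torch.utils.hooks import RemovableHandle


class HooksMixin:
    # Class-level state shared by all modifiers using the mixin
    _HOOKS_ENABLED: bool = True
    _HOOKS_KEEP_ENABLED: Set[RemovableHandle] = set()

    @classmethod
    @contextlib.contextmanager
    def disable_hooks(cls, keep: Collection[RemovableHandle] = frozenset()):
        """Temporarily disable all hooks except those passed via `keep`."""
        try:
            cls._HOOKS_ENABLED = False
            cls._HOOKS_KEEP_ENABLED.update(keep)
            yield
        finally:
            cls._HOOKS_ENABLED = True
            cls._HOOKS_KEEP_ENABLED.clear()

    def register_hook(self, module: torch.nn.Module, fn: Callable) -> RemovableHandle:
        handle: RemovableHandle

        def wrapped(*args, **kwargs):
            # Fire the hook only if hooks are globally enabled or this one is kept
            if HooksMixin._HOOKS_ENABLED or handle in HooksMixin._HOOKS_KEEP_ENABLED:
                return fn(*args, **kwargs)

        handle = module.register_forward_hook(wrapped)
        return handle
```

A test like `test_disable_hooks_keep` would presumably register two hooks, enter `disable_hooks(keep={handle})`, run a forward pass, and assert that only the kept hook fires.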