
[Training] Unifying Preprocess + Postprocessing logic for Train/Oneshot #1212

Open · wants to merge 12 commits into base: main
Conversation

@horheynm (Collaborator) commented Feb 28, 2025

Order of reviews:
#1206
#1207
#1209
#1212 <-- Here
#1214

SUMMARY:

  • Move the preprocessing and postprocessing logic out of src/llmcompressor/transformers/finetune/text_generation.py and into src/llmcompressor/entrypoints/utils.py (see the sketch after the test plan)

TEST PLAN:
Pass tests
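
A minimal sketch of the intended shape of the shared helpers, assuming hypothetical `pre_process`/`post_process` names in `src/llmcompressor/entrypoints/utils.py`; the function names and signatures below are illustrative, not the PR's exact API:

```python
# src/llmcompressor/entrypoints/utils.py (illustrative sketch)
# Shared setup/teardown used by both the train and oneshot entrypoints,
# so neither has to reach into transformers/finetune/text_generation.py.


def pre_process(model_args):
    """Run before either entrypoint: e.g. resolve the model and
    tokenizer/processor and apply any required patches (hypothetical)."""
    ...


def post_process(model_args, output_dir=None):
    """Run after either entrypoint: e.g. save the compressed model,
    tokenizer, and recipe to output_dir (hypothetical)."""
    ...
```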


👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite; please add the label only once the PR is code complete and local testing has been performed.

dsikka pushed a commit that referenced this pull request Mar 3, 2025
Order of reviews:
#1206
#1207 <-- Here
#1209 
#1212
#1214 

SUMMARY:
* Decouple the arg parser so it can be used by both oneshot and train

TEST PLAN:
* Pass tests
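
A rough sketch of what a parser shared by `oneshot` and `train` could look like, assuming the argument dataclasses named in this thread are importable from `llmcompressor.args` and using `HfArgumentParser`; the helper name and signature are assumptions:

```python
# Illustrative only: one parse_args helper that both entrypoints can call.
from transformers import HfArgumentParser

from llmcompressor.args import (
    DatasetArguments,
    ModelArguments,
    RecipeArguments,
    TrainingArguments,
)


def parse_args(include_training_args: bool = False, **kwargs):
    """Parse keyword arguments into the shared dataclasses; training
    arguments are only needed by the train entrypoint (hypothetical)."""
    classes = [ModelArguments, DatasetArguments, RecipeArguments]
    if include_training_args:
        classes.append(TrainingArguments)
    parser = HfArgumentParser(classes)
    return parser.parse_dict(kwargs, allow_extra_keys=True)
```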
dsikka added a commit that referenced this pull request Mar 5, 2025
Order of reviews:
#1206  <-- Here
#1207
#1209 
#1212
#1214 

SUMMARY:
Rename `data_args` to `dataset_args`

TEST PLAN:
Pass tests
Find remaining `data_args` references using `grep`

---------

Signed-off-by: George Ohashi <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
dsikka pushed a commit that referenced this pull request Mar 5, 2025
Order of reviews:
#1206
#1207
#1209 <-- Here
#1212
#1214 

SUMMARY:
* Move dataset logic out of the transformers module
(`src/llmcompressor/transformers/finetune/data/data_helpers.py`) and into
`src/llmcompressor/datasets/utils.py`


TEST PLAN:
Pass tests
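
One way such a move can stay backwards compatible is a re-export shim in the old module; whether this PR keeps one is an assumption, and the sketch below is illustrative:

```python
# src/llmcompressor/transformers/finetune/data/data_helpers.py (illustrative)
# Hypothetical compatibility shim after the helpers move to
# src/llmcompressor/datasets/utils.py, so existing imports keep working.
from llmcompressor.datasets.utils import *  # noqa: F401,F403
```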
Comment on lines 306 to 140

-    model_args: ModelArguments,
-    dataset_args: DatasetArguments,
-    recipe_args: RecipeArguments,
-    training_args: TrainingArguments,
+    model_args,
+    dataset_args,
+    recipe_args,
+    training_args,
Collaborator
why remove the type hints here?

Collaborator Author
circular import

Collaborator
With what?

Collaborator Author
`parse_args` from the `__init__` in `llmcompressor.args`.

Also, this main function will be removed when the stage runner is removed, so type annotations here are a lower priority than restructuring the modules to move logic outside of the /transformers module.

Collaborator

Circular import but we're still importing them at the top?

Collaborator Author
Oh nice, must have been a different PR. Added back!!
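
A standard way to keep the annotations without executing the circular import (a general Python pattern, not necessarily what this PR ended up doing) is to gate the imports behind `typing.TYPE_CHECKING` and defer annotation evaluation:

```python
# Illustrative: the llmcompressor.args imports run only under static type
# checking, never at runtime, so the assumed cycle
# (llmcompressor.args -> this module -> llmcompressor.args) is never hit.
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from llmcompressor.args import (
        DatasetArguments,
        ModelArguments,
        RecipeArguments,
        TrainingArguments,
    )


def main(
    model_args: ModelArguments,
    dataset_args: DatasetArguments,
    recipe_args: RecipeArguments,
    training_args: TrainingArguments,
):
    ...
```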

Labels: ready (When a PR is ready for review)

3 participants