[Trainer] fix save_model #9286
Thanks for your contribution!
if isinstance(self.model, LoRAModel) and (self.model.quantized or self.args.pipeline_parallel_degree > 1):
    self.save_model(output_dir, False, signal_dir)
elif isinstance(self.model, LoRAModel) or isinstance(self.model, PrefixModelForCausalLM):
    self.save_model(output_dir, True, signal_dir)
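The branch order in the hunk above matters: the first condition catches quantized or pipeline-parallel LoRA models before the more general `elif` can. A minimal sketch of just that decision follows, using hypothetical stand-in classes (`FakeLoRAModel`, `FakePrefixModel`) and a hypothetical helper `save_flag` rather than the real PaddleNLP types. It returns the boolean passed as the second positional argument to `save_model`; the meaning of that flag is not stated in this thread, and `None` marks model types the excerpt does not cover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeLoRAModel:
    """Hypothetical stand-in for LoRAModel; only the `quantized` attribute is modeled."""
    quantized: bool = False


@dataclass
class FakePrefixModel:
    """Hypothetical stand-in for PrefixModelForCausalLM."""


def save_flag(model: object, pipeline_parallel_degree: int) -> Optional[bool]:
    """Return the second positional argument the hunk would pass to save_model.

    None means the excerpt shows no branch for this model type.
    """
    # Quantized or pipeline-parallel LoRA models take the first branch.
    if isinstance(model, FakeLoRAModel) and (model.quantized or pipeline_parallel_degree > 1):
        return False
    # Every other LoRA model, and any prefix-tuning model, takes the second.
    elif isinstance(model, (FakeLoRAModel, FakePrefixModel)):
        return True
    return None
```

Under these assumptions, a prefix-tuning model always yields `True`, even when `pipeline_parallel_degree > 1`, because the first condition only applies to LoRA models.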
signal_dir = os.path.join(signal_dir, os.path.split(output_dir)[-1])
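To make the reviewer's suggestion concrete, here is a small sketch of what that expression computes: it appends the last path component of `output_dir` to `signal_dir`. The helper name `checkpoint_signal_dir` and the example paths are made up for illustration and do not come from the PR.

```python
import os


def checkpoint_signal_dir(signal_dir: str, output_dir: str) -> str:
    """Append the last path component of output_dir to signal_dir."""
    # os.path.split returns (head, tail); [-1] selects the tail,
    # e.g. "checkpoint-500" for "output/checkpoint-500".
    return os.path.join(signal_dir, os.path.split(output_dir)[-1])


# Illustrative paths only.
example = checkpoint_signal_dir("signals", os.path.join("output", "checkpoint-500"))
```

If `output_dir` ends with a path separator, `os.path.split` returns an empty tail, so the result is just `signal_dir` with a trailing separator. Callers may want to normalize the path first.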
5b20bd3 to 6ebe5b6
LGTM
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9286      +/-   ##
===========================================
- Coverage    53.27%   53.09%   -0.19%
===========================================
  Files          657      657
  Lines       107194   106533     -661
===========================================
- Hits         57104    56559     -545
+ Misses       50090    49974     -116
693d6fb to 2eafad3
LGTM
* bug fix * bug fix
* [Unified Checkpoint] Support expert parallel (#9055)
* update code
* [Unified Checkpoint] Fix generation config save (#9223)
* [Unified Checkpoint] update async_save_info in develop (#9173)
* [Unified Checkpoint] update async save logic (#9274)
* update async save signal
* fix async save hang
* bug fix
* bug fix
* [Trainer] fix save_model (#9286)
* bug fix
* bug fix
---------
Co-authored-by: Weiguo Zhu <[email protected]>
PR types
Others
PR changes
Others
Description
Modify the save_model call to enhance compatibility.