[NVIDIA] Add stage2 NCCL kernel overlap #7092
Thanks for your contribution!
Adding @jeng1220 for visibility.
Codecov Report
@@            Coverage Diff             @@
##           develop    #7092     +/-   ##
===========================================
- Coverage    59.75%   59.75%   -0.01%
===========================================
  Files          559      559
  Lines        82347    82359      +12
===========================================
+ Hits         49210    49213       +3
- Misses       33137    33146       +9
Force-pushed from 3bed524 to f470b71.
@@ -1576,6 +1585,9 @@ def get_expected_keys(inputs, keys):
        offload=cpu_offload,
        **extra_kwargs,
    )
    if level == "os_g":
        model._set_reduce_overlap(True)
Could you add a flag to control whether overlap is used? Overlap comes with some constraints: for broadcast overlap, logging_step should be greater than 1, and no other sync may be called during training.
Done
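The flag requested in this thread could be wired in roughly as follows. This is a minimal sketch, not the code that landed in the PR: `DummyStage2Model` is a stand-in for the real sharded model wrapper, and `enable_overlap`, `configure_overlap`, and `broadcast_overlap_allowed` are hypothetical names. Only `_set_reduce_overlap` and the `"os_g"` level come from the diff above.

```python
class DummyStage2Model:
    """Stand-in for the stage 2 sharded model wrapper (hypothetical, not a Paddle class)."""

    def __init__(self):
        self.reduce_overlap = False

    def _set_reduce_overlap(self, enabled):
        # Method name taken from the diff above; this body is only a stub.
        self.reduce_overlap = enabled


def configure_overlap(model, level, enable_overlap):
    """Enable reduce overlap only for stage 2 ("os_g") and only when the flag is set.

    `enable_overlap` is a hypothetical user-facing flag.
    """
    if level != "os_g" or not enable_overlap:
        return False
    model._set_reduce_overlap(True)
    return True


def broadcast_overlap_allowed(logging_steps):
    """Mirror the reviewer's constraint: broadcast overlap needs logging_step > 1."""
    return logging_steps > 1


model = DummyStage2Model()
if configure_overlap(model, level="os_g", enable_overlap=True):
    print("reduce overlap enabled:", model.reduce_overlap)
print("broadcast overlap allowed with logging_steps=1:", broadcast_overlap_allowed(1))
```

Keeping the default off would let users who log every step, or who call other syncs during training, avoid the constraints mentioned above.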
Force-pushed from f470b71 to d4397a6.
@FeixLiu Would you please take another look?
LGTM
PR types
Performance optimization
PR changes
Others
Description
This PR adds an NCCL kernel overlap feature to stage 2 FSDP training. It brings a 1.5% end-to-end (E2E) speedup in GPT training.
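To illustrate why overlapping communication with computation shortens a step, here is a toy timeline model. It is a sketch under stated assumptions, not a description of the PR's implementation: the per-layer times are invented, compute and communication are assumed to run on two independent streams, and the function names are hypothetical. The output is not a measurement and is unrelated to the 1.5% figure above.

```python
def serial_time(compute, comm):
    """Total time when each layer's gradient reduce waits for all work before it."""
    return sum(compute) + sum(comm)


def overlapped_time(compute, comm):
    """Total time when a layer's reduce runs while later layers are still computing.

    Assumes one compute stream and one communication stream. A reduce can start
    once its layer's compute has finished and the previous reduce has finished.
    """
    compute_done = 0.0
    comm_done = 0.0
    for c, m in zip(compute, comm):
        compute_done += c
        comm_done = max(comm_done, compute_done) + m
    return comm_done


# Made-up per-layer times in arbitrary units.
compute = [2.0, 2.0, 2.0, 2.0]
comm = [1.0, 1.0, 1.0, 1.0]

print("serial:", serial_time(compute, comm))          # 12.0
print("overlapped:", overlapped_time(compute, comm))  # 9.0
```

In this model only the last layer's reduce is exposed when communication is cheaper than compute, so the achievable gain depends on how much communication time was exposed to begin with.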