Optim fused linear grad add #55927
Conversation
Your PR has been submitted successfully. Thank you for your contribution to this open-source project!
Force-pushed from 1fd0700 to 17b0fc2
Force-pushed from 13bf600 to 39fddc6
LGTM
Almost LGTM
@@ -65,7 +66,7 @@ void FusedLinearParamGradAddImpl(const Context &ctx,
                                  use_addto);
   }

-  if (dbias_out == nullptr) return;
+  if (!has_bias) return;
So in practice, dbias_out is never actually nullptr?
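A minimal sketch (hypothetical names, not Paddle's actual kernel) of the situation this review thread is about: if the caller passes a pre-allocated dbias_out even for a bias-free layer, the old nullptr check never fires, so the early return has to branch on the has_bias flag instead.

```cpp
#include <cstdio>

struct Tensor {};  // stand-in for a real tensor type

void BiasGradImplSketch(bool has_bias, Tensor* dbias_out) {
  if (!has_bias) return;  // new guard: branch on layer semantics, not allocation
  if (dbias_out == nullptr) return;  // old guard, kept here only as a safety net
  // ... reduce dout over the batch dimension into dbias_out (elided) ...
  std::puts("computed dbias");
}

int main() {
  Tensor dbias;  // allocated unconditionally in this scenario
  BiasGradImplSketch(/*has_bias=*/false, &dbias);  // skips the bias gradient
  BiasGradImplSketch(/*has_bias=*/true, &dbias);   // prints "computed dbias"
  return 0;
}
```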
@@ -159,7 +161,7 @@ void FusedLinearParamGradAdd(const Context &ctx,
     multi_precision = false;
   }

-  if (dbias_out) {
+  if (has_bias && dbias_out) {
Should the handling of dbias_out here stay consistent with how dweight_out is handled at L136 above?
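A hypothetical sketch (names and signatures assumed, not Paddle's real API) of the guard symmetry the reviewer is asking about: each output is produced only when its output pointer is set, and dbias additionally requires the has_bias flag.

```cpp
#include <cstdio>

struct Tensor {};  // stand-in for a real tensor type

void GradAddGuards(bool has_bias, Tensor* dweight_out, Tensor* dbias_out) {
  if (dweight_out) {
    // ... accumulate x^T * dout into dweight_out (elided) ...
    std::puts("accumulated dweight");
  }
  if (has_bias && dbias_out) {  // patched guard: semantic flag AND pointer
    // ... accumulate the batch-summed dout into dbias_out (elided) ...
    std::puts("accumulated dbias");
  }
}

int main() {
  Tensor dweight, dbias;
  GradAddGuards(/*has_bias=*/false, &dweight, &dbias);  // only dweight runs
  GradAddGuards(/*has_bias=*/true, &dweight, &dbias);   // both run
  return 0;
}
```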
… optim (PaddlePaddle#56094) * skip CopyOrAdd when tmp grad is None (PaddlePaddle#55679) * Optim fused linear grad add (PaddlePaddle#55927)
PR types
Others
PR changes
Others
Description
Optim fused linear grad add
PCard-70444
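For context, a rough CPU reference (illustrative only; parameter names and layouts are assumptions, the real op is a fused GPU kernel) of what a fused linear param grad add computes: one pass that accumulates both the weight gradient (x^T * dout) and the bias gradient (dout summed over the batch dimension) into existing buffers.

```cpp
#include <cstddef>
#include <vector>

// x:       [m, k] row-major input activations
// dout:    [m, n] row-major upstream gradient
// dweight: [k, n] accumulated in place (caller pre-sizes to k * n)
// dbias:   [n]    accumulated in place (caller pre-sizes to n)
void FusedLinearParamGradAddRef(const std::vector<float>& x,
                                const std::vector<float>& dout,
                                std::size_t m, std::size_t k, std::size_t n,
                                bool has_bias,
                                std::vector<float>& dweight,
                                std::vector<float>& dbias) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      const float g = dout[i * n + j];
      for (std::size_t p = 0; p < k; ++p) {
        dweight[p * n + j] += x[i * k + p] * g;  // dweight += x^T * dout
      }
      if (has_bias) dbias[j] += g;  // dbias += dout summed over the batch
    }
  }
}
```

Because the buffers are accumulated with += rather than overwritten, gradients left over from a previous micro-batch are added into in the same pass, which is what the use_addto flag in the diff above controls.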