Improve numerical stability of LayerNorm #59987
CI failures summary and remediations, as of commit c635ad4 (more details on the Dr. CI page and at hud.pytorch.org/pr/59987):
1 new failure recognized by patterns. The following CI failure does not appear to be due to upstream breakages: pytorch_linux_xenial_py3_clang5_asan_test2 (1/1), step: "Run tests".
This pull request was exported from Phabricator. Differential Revision: D29115235
LGTM, thanks!
Summary: Pull Request resolved: pytorch#59987
Similar to GroupNorm, improve the numerical stability of LayerNorm by using the Welford algorithm and pairwise sum.
Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "LayerNorm"
Reviewed By: ngimel
Differential Revision: D29115235
fbshipit-source-id: 376dac89a4e14bd340aaaf169fef8d0d4ca4a1c4
This pull request has been merged in 963c983.
Summary: Pull Request resolved: pytorch#59987
Similar to GroupNorm, improve the numerical stability of LayerNorm by using the Welford algorithm and pairwise sum.
Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "LayerNorm"
Reviewed By: ngimel
Differential Revision: D29115235
fbshipit-source-id: 5183346c3c535f809ec7d98b8bdf6d8914bfe790
Summary: Similar to GroupNorm, improve the numerical stability of LayerNorm by using the Welford algorithm and pairwise sum.
Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "LayerNorm"
Differential Revision: D29115235
Similar to #54921.
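For readers unfamiliar with the technique, here is a minimal pure-Python sketch of Welford-style moments combined chunk by chunk. It is not the kernel code from this PR: the names `Moments`, `chunked_moments`, and `layer_norm_row`, the chunk size, and the `eps` default are all assumptions made for illustration. The variance is the biased (population) one, which is what LayerNorm normalizes with.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    """Running count, mean, and sum of squared deviations (M2)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def push(self, x):
        # Welford's single-pass update: no large sum of squares is ever formed.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        # Combine two partial results (Chan et al. parallel formula).
        if other.count == 0:
            return Moments(self.count, self.mean, self.m2)
        if self.count == 0:
            return Moments(other.count, other.mean, other.m2)
        n = self.count + other.count
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / n
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / n
        return Moments(n, mean, m2)

    @property
    def variance(self):
        # Biased (population) variance.
        return self.m2 / self.count


def chunked_moments(xs, chunk=64):
    """Run Welford on each chunk, then merge the partial results."""
    total = Moments()
    for start in range(0, len(xs), chunk):
        part = Moments()
        for x in xs[start:start + chunk]:
            part.push(x)
        total = total.merge(part)
    return total


def layer_norm_row(xs, eps=1e-5):
    """Normalize one row without affine weight or bias (illustrative only)."""
    m = chunked_moments(xs)
    inv_std = (m.variance + eps) ** -0.5
    return [(x - m.mean) * inv_std for x in xs]
```

Calling `layer_norm_row([1.0, 2.0, 3.0, 4.0])` returns values with mean close to 0 and variance close to 1. The merge step is what lets partial results from separate threads or vector lanes be combined without going back to raw sums.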
For input = torch.rand(4, 1024, 1024, 72, dtype=torch.float32) and normalized_shape = [1024, 1024, 72]:
Previously the absolute error on CPU was over 0.03; after this PR it is less than 1e-6.
The single-thread forward time on CPU changes from 1209.96 ms to 1274.34 ms.
The forward time on CUDA changes from 184.61 ms to 208.27 ms.
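To give a feel for why the summation order matters in float32, the sketch below compares left-to-right accumulation with pairwise summation. It only illustrates the general effect and does not reproduce the numbers above. Assumptions: float32 rounding is simulated with the standard `struct` module, the input is 65536 copies of 0.1, and the helper names `f32`, `naive_sum_f32`, `pairwise_sum_f32`, and `summation_errors` are hypothetical.

```python
import math
import struct


def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum_f32(xs):
    """Left-to-right accumulation, rounding to float32 after every add."""
    total = 0.0
    for x in xs:
        total = f32(total + x)
    return total


def pairwise_sum_f32(xs, lo=0, hi=None):
    """Recursive pairwise summation, rounding to float32 after every add."""
    if hi is None:
        hi = len(xs)
    if hi - lo == 0:
        return 0.0
    if hi - lo == 1:
        return xs[lo]
    mid = (lo + hi) // 2
    return f32(pairwise_sum_f32(xs, lo, mid) + pairwise_sum_f32(xs, mid, hi))


def summation_errors(n=1 << 16, value=0.1):
    """Return (naive_error, pairwise_error) against a correctly rounded sum."""
    xs = [f32(value)] * n
    exact = math.fsum(xs)  # correctly rounded float64 reference
    return abs(naive_sum_f32(xs) - exact), abs(pairwise_sum_f32(xs) - exact)


if __name__ == "__main__":
    naive_err, pairwise_err = summation_errors()
    print(f"naive error:    {naive_err:.6g}")
    print(f"pairwise error: {pairwise_err:.6g}")
```

In the naive loop the running total soon dwarfs each addend, so every addition drops low-order bits and the loss accumulates. Pairwise summation keeps the operands of each addition at similar magnitude, which is why its error grows far more slowly with the number of elements.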