
Do not modify saved variables in-place for spectral norm during power iteration #62293

Closed
wants to merge 4 commits

Conversation

soulitzer
Contributor

Interestingly enough, the original code did have a mechanism that aims to prevent this very issue, but it performs the clone AFTER modifying u and v in-place.
That doesn't work, because we can later use the cloned u and v in operations that save them for backward, and the next time we run forward we modify those same cloned u and v in-place again.
So if the idea is that we want to avoid modifying saved variables in-place, we should clone BEFORE the in-place operation.
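As a toy illustration of the autograd rule in play here (this is not the spectral norm code itself): once an op has saved a tensor for backward, that tensor must not be modified in-place, so any clone has to happen before the mutation, not after.

```python
import torch

x = torch.randn(3, requires_grad=True)
u = torch.randn(3)

# The multiplication saves `u` for backward (it is needed to compute dL/dx).
y = (x * u).sum()

# Cloning only AFTER mutating `u` would be too late: the in-place update
# would already have touched the tensor autograd saved, and backward would
# raise "one of the variables needed for gradient computation has been
# modified by an inplace operation".

# Cloning BEFORE the in-place update keeps the saved tensor intact:
u = u.clone()
u.add_(1.0)       # mutate the clone; the tensor saved for backward is untouched

y.backward()      # succeeds and uses the original, unmodified u
print(x.grad)
```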

@facebook-github-bot
Contributor
facebook-github-bot commented Jul 27, 2021

💊 CI failures summary and remediations

As of commit bb92cc1 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

Job | Step | Action
GitHub Actions linux-xenial-py3.6-gcc5.4 / calculate-docker-image | Checkout PyTorch | 🔁 rerun


@soulitzer soulitzer force-pushed the spectral-norm-remove-out branch from 3d63e9f to c0c61ad on July 27, 2021 21:26
@soulitzer soulitzer force-pushed the spectral-norm-remove-out branch from c0c61ad to 772c8b0 on July 28, 2021 02:26
@soulitzer soulitzer requested a review from albanD on July 28, 2021 14:58
@soulitzer soulitzer force-pushed the spectral-norm-remove-out branch from 29836b1 to a3eec40 on July 28, 2021 20:14
@soulitzer soulitzer force-pushed the spectral-norm-remove-out branch from ca15208 to c2cb316 on July 30, 2021 19:40
@soulitzer soulitzer requested review from albanD and lezcano on August 9, 2021 16:08
@soulitzer
Contributor Author
soulitzer commented Aug 19, 2021

As discussed with @albanD offline, since we cannot run gradcheck on the parametrized spectral norm module as-is, we try gradcheck again with the no_grad removed. See #63593. The CI should reveal whether it works with DP or not.

This was more complicated than initially thought, though, because removing the no_grad has many side effects. It's very possible we end up testing something completely different.

Anyway, the changes in the linked PR are:

  • (repeat changes from this PR)
  • remove the no_grad decorator
  • no longer modify u and v in-place via the out arg (because we are in grad mode now); instead we compute them out of place and copy_ the result back afterwards so that DP still works (though I can't test that locally). We need an extra clone before F.normalize(...) because it now saves u and v for backward. The same is done for weight_mat (see the sketch after this list).
  • change the clone from memory_format=torch.contiguous_format to torch.preserve_format (TODO: why does this break gradcheck??)
  • in test_nn, F.normalize does not support undefined gradients, so we need check_undefined_grad=False now
  • add retain_graph to the .backward and .grad calls, because u and v begin to require grad after the first forward and connect to the graph from the previous iteration
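A minimal sketch of the grad-mode update described in the second bullet (hypothetical function name and simplified structure; the real change is in the linked PR, and whether the write-back runs under no_grad there is an assumption here):

```python
import torch
import torch.nn.functional as F

def power_iteration_grad_mode(weight_mat, u, v):
    # Extra clone: the matrix-vector products and F.normalize save their
    # inputs for backward, so they must not be handed the buffer that the
    # copy_ calls below will overwrite.
    u_in = u.clone()

    # Compute the update out of place so autograd can track it.
    v_new = F.normalize(torch.mv(weight_mat.t(), u_in), dim=0)
    u_new = F.normalize(torch.mv(weight_mat, v_new), dim=0)

    # Write the result back into the persistent buffers with copy_, so that
    # DataParallel replicas (which receive broadcast versions of these
    # buffers) still observe the update.
    with torch.no_grad():
        u.copy_(u_new)
        v.copy_(v_new)
    return u_new, v_new
```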

@albanD
Collaborator
albanD commented Aug 19, 2021

Add retain_graph to the .backward and .grad calls, because u and v begin to require grad after the first forward and connect to the graph from the previous iteration

I think a wrapper function should be used to make sure we don't change the "original" u and v in any way.

In test_nn, F.normalize does not support undefined gradients, so we need check_undefined_grad=False now

That is surprising... especially ops with a single output should work out of the box.

@soulitzer
Contributor Author

I think a wrapper function should be used to make sure we don't change the "original" u and v in any way.

This fixed everything. The check_undefined_grad issue went away too with the new wrapper. So yeah, it's not F.normalize; I should've double-checked the first time.
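For reference, a rough sketch of the kind of wrapper being discussed (a hypothetical test helper; it assumes the parametrization-based spectral_norm, where the parametrization module holds _u/_v buffers). It restores cloned copies of _u and _v before every call, so gradcheck's repeated evaluations never see state mutated by a previous call:

```python
import torch

def make_spectral_norm_fn(module, weight_name="weight"):
    # Grab the spectral-norm parametrization and snapshot its buffers once.
    parametrization = module.parametrizations[weight_name][0]
    u0 = parametrization._u.clone()
    v0 = parametrization._v.clone()

    def fn(inp):
        # Reset the buffers so the "original" u and v are never changed
        # across the many evaluations gradcheck performs.
        with torch.no_grad():
            parametrization._u.copy_(u0)
            parametrization._v.copy_(v0)
        return module(inp)

    return fn

# Example usage (double precision, as gradcheck expects):
# m = torch.nn.utils.parametrizations.spectral_norm(torch.nn.Linear(3, 4)).double()
# x = torch.randn(2, 3, dtype=torch.double, requires_grad=True)
# torch.autograd.gradcheck(make_spectral_norm_fn(m), (x,))
```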

@soulitzer
Contributor Author
soulitzer commented Aug 19, 2021

From the linked PR, the test is failing for DP. I'm not sure DP is possible when power iteration is done in grad mode, because self._u needs to be modified in-place, but self._u can apparently be a view. Its grad_fn is BroadcastBackward (which I could not grep), which seems to return multiple views as well.

https://app.circleci.com/pipelines/github/pytorch/pytorch/366855/workflows/b0ec5ba5-b366-43d0-8846-5cf6d675e456/jobs/15480216
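A toy analogue of that restriction, using chunk, which (like BroadcastBackward) produces multiple views of its input. This is not the DP code path itself, just an illustration of why an in-place write into self._u can be rejected when it is such a view:

```python
import torch

base = 2 * torch.randn(4, requires_grad=True)   # non-leaf tensor that requires grad
a, b = base.chunk(2)                            # chunk returns multiple views of `base`

try:
    a.copy_(torch.ones(2))                      # in-place write into one of those views
except RuntimeError as e:
    # Autograd refuses: views produced by functions that return multiple
    # views cannot be modified in-place.
    print(e)
```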

Collaborator
@albanD albanD left a comment


ok!

@facebook-github-bot
Contributor

@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@soulitzer soulitzer force-pushed the spectral-norm-remove-out branch from c2cb316 to bb92cc1 on August 24, 2021 16:09
@facebook-github-bot
Contributor

@soulitzer has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@codecov
codecov bot commented Aug 24, 2021

Codecov Report

Merging #62293 (bb92cc1) into master (94d6215) will increase coverage by 0.08%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #62293      +/-   ##
==========================================
+ Coverage   67.05%   67.13%   +0.08%     
==========================================
  Files         691      691              
  Lines       90496    90496              
==========================================
+ Hits        60679    60752      +73     
+ Misses      29817    29744      -73     

@facebook-github-bot
Contributor

@soulitzer merged this pull request in 5be17ec.
