Port log_softmax to structured kernel #57374
Conversation
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 79444182a5b85cf603960c64b33d1ba6eb9c8e7f Pull Request resolved: #57374
💊 CI failures summary and remediations. As of commit e1165ba (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patterns. The following CI failures do not appear to be due to upstream breakages: pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (1/1), Step: "Build"
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
[ghstack-poisoned]
aten/src/ATen/native/cuda/SoftMax.cu (Outdated)
auto res = host_softmax<LogSoftMaxForwardEpilogue,true>(input, dim, half_to_float);
output.copy_(res);
So ideally I would refactor `host_softmax` so that it takes an `out` parameter, but that function is also used by `softmax_cuda`, which is not structured. It can be, and the structure (no pun intended) of that kernel is very similar to that of `log_softmax`, so I think the port would be similar. However, it is not in any of the lists mentioned in #55070.

There are two options:

- Port `_softmax` to structured so that this `host_softmax` function can be refactored to require an `out` parameter.
- Do not port `_softmax` to structured. Add an optional `out` parameter to `host_softmax` and use that as the output `Tensor` if one is provided (sketched below).

I prefer number 1. Let me know your thoughts.
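For reference, a minimal sketch of what option 2 could look like. The function name, signature, and fallback allocation below are simplified assumptions for illustration, not the actual `host_softmax` template in SoftMax.cu:

```cpp
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

// Hypothetical stand-in for host_softmax with an optional `out` parameter:
// structured callers pass their preallocated output, while the unstructured
// softmax_cuda path keeps allocating as it does today.
at::Tensor host_softmax_sketch(
    const at::Tensor& input,
    int64_t dim,
    bool half_to_float,
    const c10::optional<at::Tensor>& out = c10::nullopt) {
  at::Tensor output = out.has_value() ? *out : at::empty_like(input);
  // ... launch the (log_)softmax kernel along `dim`, writing into `output` ...
  return output;
}
```

Option 1 avoids the optional parameter entirely by making both callers structured, which is why it leads to the cleaner end state.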
If not in a rush, it's better to do the cluster of ports necessary to put us in the better end state.
I agree, but why is `_softmax` not on any of the lists in the first place?
It's possible the end users are only reporting the top-level functions, and not how they are implemented internally.
@@ -925,6 +930,5 @@ Tensor softmax_backward_cuda(const Tensor &grad, const Tensor &output, int64_t d
Tensor tmp = grad * output;
return host_softmax_backward<SoftMaxBackwardEpilogue,false>(tmp, output, dim, half_to_float);
}
Oops, I should delete this.
[ghstack-poisoned]
ghstack-source-id: e7b6bde4096855cae3f30f07c964ffa4130962a8 Pull Request resolved: #57374
[ghstack-poisoned]
aten/src/ATen/native/SoftMax.cpp (Outdated)
!half_to_float,
"softmax with half to float conversion is not supported on CPU");

auto input_ = input.contiguous();
improper use of contiguous in meta function
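To make the point concrete, here is a rough sketch of the meta/impl split this comment is asking for. The macro and method names follow the structured kernels RFC and are assumptions here; namespaces, includes, dtype handling for half_to_float, and registration boilerplate are omitted:

```cpp
// Meta function: validate arguments and declare output metadata only.
// Nothing here may materialize or copy tensor data, so no input.contiguous().
TORCH_META_FUNC(_log_softmax)
(const Tensor& input, int64_t dim, bool half_to_float) {
  set_output(input.sizes(), input.options());
}

// Impl function: receives the already-allocated `output` from the meta step;
// device-specific checks and data-dependent work such as contiguous() live here.
TORCH_IMPL_FUNC(log_softmax_cpu_out)
(const Tensor& input, int64_t dim, bool half_to_float, const Tensor& output) {
  TORCH_CHECK(
      !half_to_float,
      "softmax with half to float conversion is not supported on CPU");
  auto input_ = input.contiguous();
  // ... compute log_softmax of input_ along dim into `output` ...
}
```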
[ghstack-poisoned]
if (input.dim() == 0)
  input = input.view(1);

auto input_ = input.contiguous();
nb: if we ever wanna penny pinch refcounts here, use MaybeOwned, e.g., as in https://github.com/pytorch/pytorch/pull/56115/files. This does not block merging this PR.
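A sketch of that suggestion, assuming `Tensor::expect_contiguous()` returning `c10::MaybeOwned<Tensor>` as the mechanism; the wrapper function here is made up for illustration and is not part of this PR:

```cpp
#include <ATen/ATen.h>

// Illustrative only: expect_contiguous() borrows the input when it is already
// contiguous and only owns a fresh Tensor when a contiguous copy was needed,
// so the common case skips a refcount bump.
void use_contiguous_input(const at::Tensor& input) {
  c10::MaybeOwned<at::Tensor> input_ = input.expect_contiguous();
  TORCH_INTERNAL_ASSERT(input_->is_contiguous());
  // ... read through *input_ exactly as through a normal Tensor ...
}
```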
A minor style point; personally, I like underscoring the input parameter and then having it un-underscored in the function body; this is mostly based on which variable I'll be referring to a lot and giving it the "better" name. We're very uneven on this everywhere. (Also does not block merging the PR.)
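A tiny, hypothetical example of the naming convention being described (nothing here is from the PR; only the names matter):

```cpp
#include <ATen/ATen.h>

// The incoming parameter carries the underscore; the local that the body
// refers to most gets the plain, "better" name.
at::Tensor scale_sketch(const at::Tensor& input_) {
  auto input = input_.contiguous();
  return input * 2;  // ... the real computation would use `input` throughout ...
}
```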
[ghstack-poisoned]
@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@SplitInfinity merged this pull request in ba60359.
Summary: Pull Request resolved: #57374
Test Plan: Imported from OSS
Reviewed By: saketh-are
Differential Revision: D30240243
Pulled By: SplitInfinity
fbshipit-source-id: de6617c75d16e26d607a884c25b8752b7b561737
Stack from ghstack:
- `native_batch_norm_backward` to structured #62513
- `native_batch_norm` to structured #62452
- `log_softmax_backward_data` to structured kernel #62372
- `log_softmax` to structured kernel #57374 (this PR)

Differential Revision: D30240243