
Port log_softmax to structured kernel #57374

Closed

Conversation

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
@SplitInfinity requested a review from ezyang as a code owner April 30, 2021 18:27
SplitInfinity pushed a commit that referenced this pull request Apr 30, 2021
ghstack-source-id: 79444182a5b85cf603960c64b33d1ba6eb9c8e7f
Pull Request resolved: #57374
@facebook-github-bot (Contributor) commented Apr 30, 2021:

💊 CI failures summary and remediations

As of commit e1165ba (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

```
Aug 11 02:13:16 *** WARNING: renaming "_hashlib...open shared object file: No such file or directory
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut8-p4-perm-x72.c.o
Aug 11 02:13:16 building '_hashlib' extension
Aug 11 02:13:16 gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -fPIC -fPIC -I/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/include -I./Include -I/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/include -I. -I/usr/include/x86_64-linux-gnu -I/usr/local/include -I/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Include -I/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython -c /var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.c -o build/temp.linux-x86_64-3.8/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut8-p4-perm-x80.c.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x8.c.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x16.c.o
Aug 11 02:13:16 gcc -pthread -shared build/temp.linux-x86_64-3.8/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.o -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/usr/lib/x86_64-linux-gnu -L/usr/local/lib -lssl -lcrypto -o build/lib.linux-x86_64-3.8/_hashlib.cpython-38-x86_64-linux-gnu.so
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x24.c.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x32.c.o
Aug 11 02:13:16 *** WARNING: renaming "_ssl" since importing it failed: libssl.so.1.1: cannot open shared object file: No such file or directory
Aug 11 02:13:16 *** WARNING: renaming "_hashlib" since importing it failed: libcrypto.so.1.1: cannot open shared object file: No such file or directory
Aug 11 02:13:16 
Aug 11 02:13:16 The following modules found by detect_modules() in setup.py, have been
Aug 11 02:13:16 built by the Makefile instead, as configured by the Setup files:
Aug 11 02:13:16 _abc                  atexit                pwd                
Aug 11 02:13:16 time                                                           
Aug 11 02:13:16 
Aug 11 02:13:16 
Aug 11 02:13:16 Following modules built successfully but were removed because they could not be imported:
Aug 11 02:13:16 _hashlib              _ssl                                     
Aug 11 02:13:16 
```

1 job timed out:

  • pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build

This comment was automatically generated by Dr. CI.

@SplitInfinity removed the request for review from ezyang April 30, 2021 18:27
@github-actions (bot) commented:

Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
Stale pull requests will automatically be closed 30 days after being marked Stale.

@github-actions bot added the Stale label Jun 29, 2021
@SplitInfinity changed the title from "[WIP] Port log_softmax to structured kernel" to "Port log_softmax to structured kernel" Jul 28, 2021
Comment on lines 907 to 908:

```cpp
auto res = host_softmax<LogSoftMaxForwardEpilogue,true>(input, dim, half_to_float);
output.copy_(res);
```
@SplitInfinity (Author):

Ideally, I would refactor host_softmax so that it takes an out parameter, but that function is also used by softmax_cuda, which is not structured. It could be, and the structure (no pun intended) of that kernel is very similar to that of log_softmax, so I think the port would be similar. However, _softmax is not on any of the lists mentioned in #55070.

There are two options:

  1. Port _softmax to structured so that this host_softmax function can be refactored to require an out parameter.
  2. Do not port _softmax to structured. Add an optional out parameter to host_softmax and use that as the output Tensor if one is provided.

I prefer option 1. Let me know your thoughts.
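
For concreteness, option 2 might look roughly like the hypothetical sketch below (the template parameters are simplified from the call site quoted above, and the optional `out` parameter is the addition — this is not existing ATen code):

```cpp
// Hypothetical sketch of option 2: host_softmax accepts an optional
// pre-allocated output. On the structured path the kernel writes into
// `out` directly, so the output.copy_(res) above disappears; otherwise
// it allocates a fresh tensor as before.
template <template <typename, typename, typename> class Epilogue,
          bool is_log_softmax>
Tensor host_softmax(
    const Tensor& input,
    const int64_t dim,
    const bool half_to_float,
    const c10::optional<Tensor>& out = c10::nullopt) {
  Tensor output = out.has_value() ? *out : at::empty_like(input);
  // ... existing kernel launch, writing into `output` ...
  return output;
}
```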

Contributor:

If we're not in a rush, it's better to do the cluster of ports necessary to put us in the better end state.

@SplitInfinity (Author):

I agree, but why is _softmax not on any of the lists in the first place?

Contributor:

It's possible the end users are only reporting the top-level functions, not how they are implemented internally.

```diff
@@ -925,6 +930,5 @@ Tensor softmax_backward_cuda(const Tensor &grad, const Tensor &output, int64_t d
   Tensor tmp = grad * output;
   return host_softmax_backward<SoftMaxBackwardEpilogue,false>(tmp, output, dim, half_to_float);
 }
```

@SplitInfinity (Author):

Oops, I should delete this.

SplitInfinity pushed a commit that referenced this pull request Jul 28, 2021
ghstack-source-id: e7b6bde4096855cae3f30f07c964ffa4130962a8
Pull Request resolved: #57374
```cpp
    !half_to_float,
    "softmax with half to float conversion is not supported on CPU");

auto input_ = input.contiguous();
```
Contributor:

improper use of contiguous in meta function
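
For context: in a structured kernel the meta function also runs for meta tensors, so it should only do shape/dtype inference and input validation; anything that touches data, such as making the input contiguous, belongs in the impl function. A rough illustration of that split (a sketch of the convention, not the exact PyTorch source):

```cpp
// Meta: checks and output declaration only. Calling .contiguous()
// here is improper because a meta tensor has no data to copy.
TORCH_META_FUNC(_log_softmax)
(const Tensor& input, const int64_t dim, const bool half_to_float) {
  maybe_wrap_dim(dim, input.dim());  // validates that `dim` is in range
  auto out_dtype =
      half_to_float ? at::ScalarType::Float : input.scalar_type();
  set_output(input.sizes(), input.options().dtype(out_dtype));
}

// Impl: runs with real storage, so the contiguous() call moves here.
TORCH_IMPL_FUNC(log_softmax_cpu_out)
(const Tensor& input, const int64_t dim, const bool half_to_float,
 const Tensor& output) {
  auto input_ = input.contiguous();
  // ... dispatch to the CPU kernel on input_ ...
}
```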

@SplitInfinity requested a review from ezyang August 4, 2021 15:31
```cpp
if (input.dim() == 0)
  input = input.view(1);

auto input_ = input.contiguous();
```
Contributor:

nb: if we ever wanna penny-pinch refcounts here, use MaybeOwned, e.g., as in https://github.com/pytorch/pytorch/pull/56115/files. This does not block merging this PR.
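
As a small illustration of that suggestion (using c10::MaybeOwned and Tensor::expect_contiguous, the pattern from the linked PR):

```cpp
// Borrow the input when it is already contiguous instead of
// refcount-bumping it via Tensor::contiguous():
c10::MaybeOwned<Tensor> input_ = input.expect_contiguous();
// `input_` dereferences like a pointer; it owns a fresh tensor only
// if a contiguous copy actually had to be made.
const Tensor& in = *input_;
```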

Contributor:

A minor style point; personally, I like underscoring the input parameter and having it un-underscored in the function body; this is mostly based on which variable I'll be referring to a lot, giving it the "better" name. We're very uneven on this everywhere. (This also does not block merging the PR.)
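
In other words (a hypothetical function, purely to illustrate the naming convention):

```cpp
// Underscore the raw parameter; give the "better", un-underscored name
// to the variable the body refers to most.
Tensor log_softmax_cpu(const Tensor& input_, int64_t dim) {
  auto input = input_.contiguous();
  // ... the body uses `input` throughout ...
  return at::log_softmax(input, dim);
}
```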

@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@SplitInfinity merged this pull request in ba60359.

@facebook-github-bot deleted the gh/splitinfinity/138/head branch August 15, 2021 14:17
alanwaketan pushed a commit that referenced this pull request Aug 17, 2021
Summary: Pull Request resolved: #57374

Test Plan: Imported from OSS

Reviewed By: saketh-are

Differential Revision: D30240243

Pulled By: SplitInfinity

fbshipit-source-id: de6617c75d16e26d607a884c25b8752b7b561737