
Port log_softmax to structured kernel #57374

Closed

Conversation

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
@SplitInfinity requested a review from ezyang as a code owner April 30, 2021 18:27
SplitInfinity pushed a commit that referenced this pull request Apr 30, 2021
ghstack-source-id: 79444182a5b85cf603960c64b33d1ba6eb9c8e7f
Pull Request resolved: #57374
@facebook-github-bot (Contributor) commented Apr 30, 2021:

💊 CI failures summary and remediations

As of commit e1165ba (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

```
Aug 11 02:13:16 *** WARNING: renaming "_hashlib...open shared object file: No such file or directory
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut8-p4-perm-x72.c.o
Aug 11 02:13:16 building '_hashlib' extension
Aug 11 02:13:16 gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -fPIC -fPIC -I/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/include -I./Include -I/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/include -I. -I/usr/include/x86_64-linux-gnu -I/usr/local/include -I/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Include -I/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython -c /var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.c -o build/temp.linux-x86_64-3.8/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut8-p4-perm-x80.c.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x8.c.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x16.c.o
Aug 11 02:13:16 gcc -pthread -shared build/temp.linux-x86_64-3.8/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.o -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/usr/lib/x86_64-linux-gnu -L/usr/local/lib -lssl -lcrypto -o build/lib.linux-x86_64-3.8/_hashlib.cpython-38-x86_64-linux-gnu.so
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x24.c.o
Aug 11 02:13:16 [ 53%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/f32-velu/gen/velu-avx2-rr1-lut16-p3-gather-x32.c.o
Aug 11 02:13:16 *** WARNING: renaming "_ssl" since importing it failed: libssl.so.1.1: cannot open shared object file: No such file or directory
Aug 11 02:13:16 *** WARNING: renaming "_hashlib" since importing it failed: libcrypto.so.1.1: cannot open shared object file: No such file or directory
Aug 11 02:13:16 
Aug 11 02:13:16 The following modules found by detect_modules() in setup.py, have been
Aug 11 02:13:16 built by the Makefile instead, as configured by the Setup files:
Aug 11 02:13:16 _abc                  atexit                pwd                
Aug 11 02:13:16 time                                                           
Aug 11 02:13:16 
Aug 11 02:13:16 
Aug 11 02:13:16 Following modules built successfully but were removed because they could not be imported:
Aug 11 02:13:16 _hashlib              _ssl                                     
Aug 11 02:13:16 
```

1 job timed out:

  • pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build

This comment was automatically generated by Dr. CI.

@SplitInfinity removed the request for review from ezyang April 30, 2021 18:27
@github-actions (bot) commented:

Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
Stale pull requests will automatically be closed 30 days after being marked Stale.

@github-actions bot added the Stale label Jun 29, 2021
@SplitInfinity changed the title from "[WIP] Port log_softmax to structured kernel" to "Port log_softmax to structured kernel" Jul 28, 2021
Comment on lines 907 to 908:

```cpp
auto res = host_softmax<LogSoftMaxForwardEpilogue,true>(input, dim, half_to_float);
output.copy_(res);
```
@SplitInfinity (Author):

Ideally, I would refactor host_softmax so that it takes an out parameter, but that function is also used by softmax_cuda, which is not structured. It could be, and the structure (no pun intended) of that kernel is very similar to that of log_softmax, so I think the port would be similar. However, _softmax is not on any of the lists mentioned in #55070.

There are two options:

  1. Port _softmax to structured so that this host_softmax function can be refactored to require an out parameter.
  2. Do not port _softmax to structured. Add an optional out parameter to host_softmax and use that as the output Tensor if one is provided.

I prefer option 1. Let me know your thoughts.
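
For concreteness, option 2 might look roughly like the hypothetical sketch below (the template parameters are simplified from the call site quoted above, and the optional `out` parameter is the addition — this is not existing ATen code):

```cpp
// Hypothetical sketch of option 2: host_softmax accepts an optional
// pre-allocated output. On the structured path the kernel writes into
// `out` directly, so the output.copy_(res) above disappears; otherwise
// it allocates a fresh tensor as before.
template <template <typename, typename, typename> class Epilogue,
          bool is_log_softmax>
Tensor host_softmax(
    const Tensor& input,
    const int64_t dim,
    const bool half_to_float,
    const c10::optional<Tensor>& out = c10::nullopt) {
  Tensor output = out.has_value() ? *out : at::empty_like(input);
  // ... existing kernel launch, writing into `output` ...
  return output;
}
```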

Contributor:

If we're not in a rush, it's better to do the cluster of ports necessary to put us in the better end state.

@SplitInfinity (Author):

I agree, but why is _softmax not on any of the lists in the first place?

Contributor:

It's possible the end users are only reporting the top-level functions, not how they are implemented internally.

```diff
@@ -925,6 +930,5 @@ Tensor softmax_backward_cuda(const Tensor &grad, const Tensor &output, int64_t d
   Tensor tmp = grad * output;
   return host_softmax_backward<SoftMaxBackwardEpilogue,false>(tmp, output, dim, half_to_float);
 }
```

@SplitInfinity (Author):

Oops, I should delete this.

SplitInfinity pushed a commit that referenced this pull request Jul 28, 2021
ghstack-source-id: e7b6bde4096855cae3f30f07c964ffa4130962a8
Pull Request resolved: #57374
```cpp
    !half_to_float,
    "softmax with half to float conversion is not supported on CPU");

auto input_ = input.contiguous();
```
Contributor:

improper use of contiguous in meta function
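
For context: in a structured kernel the meta function also runs for meta tensors, so it should only do shape/dtype inference and input validation; anything that touches data, such as making the input contiguous, belongs in the impl function. A rough illustration of that split (a sketch of the convention, not the exact PyTorch source):

```cpp
// Meta: checks and output declaration only. Calling .contiguous()
// here is improper because a meta tensor has no data to copy.
TORCH_META_FUNC(_log_softmax)
(const Tensor& input, const int64_t dim, const bool half_to_float) {
  maybe_wrap_dim(dim, input.dim());  // validates that `dim` is in range
  auto out_dtype =
      half_to_float ? at::ScalarType::Float : input.scalar_type();
  set_output(input.sizes(), input.options().dtype(out_dtype));
}

// Impl: runs with real storage, so the contiguous() call moves here.
TORCH_IMPL_FUNC(log_softmax_cpu_out)
(const Tensor& input, const int64_t dim, const bool half_to_float,
 const Tensor& output) {
  auto input_ = input.contiguous();
  // ... dispatch to the CPU kernel on input_ ...
}
```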

@SplitInfinity requested a review from ezyang August 4, 2021 15:31
```cpp
if (input.dim() == 0)
  input = input.view(1);

auto input_ = input.contiguous();
```
Contributor:

nb: if we ever wanna penny-pinch refcounts here, use MaybeOwned, e.g., as in https://github.com/pytorch/pytorch/pull/56115/files. This does not block merging this PR.
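
As a small illustration of that suggestion (using c10::MaybeOwned and Tensor::expect_contiguous, the pattern from the linked PR):

```cpp
// Borrow the input when it is already contiguous instead of
// refcount-bumping it via Tensor::contiguous():
c10::MaybeOwned<Tensor> input_ = input.expect_contiguous();
// `input_` dereferences like a pointer; it owns a fresh tensor only
// if a contiguous copy actually had to be made.
const Tensor& in = *input_;
```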

Contributor:

A minor style point; personally, I like underscoring the input parameter and having it un-underscored in the function body; this is mostly based on which variable I'll be referring to a lot, giving it the "better" name. We're very uneven on this everywhere. (This also does not block merging the PR.)
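
In other words (a hypothetical function, purely to illustrate the naming convention):

```cpp
// Underscore the raw parameter; give the "better", un-underscored name
// to the variable the body refers to most.
Tensor log_softmax_cpu(const Tensor& input_, int64_t dim) {
  auto input = input_.contiguous();
  // ... the body uses `input` throughout ...
  return at::log_softmax(input, dim);
}
```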

@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@SplitInfinity merged this pull request in ba60359.

@facebook-github-bot deleted the gh/splitinfinity/138/head branch August 15, 2021 14:17
alanwaketan pushed a commit that referenced this pull request Aug 17, 2021
Summary: Pull Request resolved: #57374

Test Plan: Imported from OSS

Reviewed By: saketh-are

Differential Revision: D30240243

Pulled By: SplitInfinity

fbshipit-source-id: de6617c75d16e26d607a884c25b8752b7b561737