-
Notifications
You must be signed in to change notification settings - Fork 23k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add build support for kineto + rocm #58401
Conversation
💊 CI failures summary and remediationsAs of commit 58e3366 (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patternsThe following CI failures do not appear to be due to upstream breakages: pytorch_linux_xenial_py3_6_gcc5_4_build (1/3)Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)
|
Job | Step | Action |
---|---|---|
clang-format / clang-format | Run clang-format | 🔁 rerun |
This comment was automatically generated by Dr. CI (expand for details).
Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions to the (internal) Dr. CI Users group.
cmake/Dependencies.cmake
Outdated
if(USE_CUDA) | ||
set(LIBKINETO_NOCUPTI OFF CACHE STRING "") | ||
message(STATUS "Using Kineto with CUPTI support") | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part seems wrong if MSVC
is set to true and USE_CUDA
is set to true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, it is a bit convoluted. I tried to make minimal changes to something that clearly only considers one tracing library.
There is also a hidden relation: USE_CUDA && USE_ROCM == False. These never coexist.
So the 1846 guard can only be reached if 1829 is not entered. If MSVC is set then LIBKINETO_NOCUPTI is ON.
At that point the only way into 1838 is if we are a ROCm build. So then the 1846 is always false.
Also, there is no rocm on windows (i.e. msvc).
Happy to refactor it. Suggestions welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I guess here it is a bit confusing to me because
both LIBKINETO_NOCUPTI
and LIBKINETO_NOROCTRACER
are cached strings. therefore users can override it. and the set statement here are noops if user has explicitly set cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the if else logic here seems convoluted.
Can we separate the logic setting
LIBKINETO_NOROCTRACER
and
LIBKINETO_NOROCTRACER
?
Seems the only part that they need to be used together is printing the message "cpu-only".
@walterddr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Indeed, can get rid of the "else" and replace with:
Embrace the double negatives. Seems easier to follow. Agree? |
Or this:
|
Codecov Report
@@ Coverage Diff @@
## master #58401 +/- ##
==========================================
- Coverage 76.47% 76.46% -0.01%
==========================================
Files 1992 1992
Lines 199860 199860
==========================================
- Hits 152840 152832 -8
- Misses 47020 47028 +8 |
cmake/Dependencies.cmake
Outdated
if(LIBKINETO_NOCUPTI AND LIBKINETO_NOROCTRACER) | ||
message(STATUS "Using CPU-only version of Kineto") | ||
else() | ||
set(LIBKINETO_NOCUPTI OFF CACHE STRING "") | ||
message(STATUS "Using Kineto with CUPTI support") | ||
if(USE_ROCM) | ||
if(NOT ENV{ROCM_SOURCE_DIR}) | ||
set(ENV{ROCM_SOURCE_DIR} "/opt/rocm") | ||
endif() | ||
set(LIBKINETO_NOROCTRACER OFF CACHE STRING "") | ||
message(STATUS "Using Kineto with Roctracer support") | ||
endif() | ||
if(USE_CUDA) | ||
set(LIBKINETO_NOCUPTI OFF CACHE STRING "") | ||
message(STATUS "Using Kineto with CUPTI support") | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(LIBKINETO_NOCUPTI AND LIBKINETO_NOROCTRACER) | |
message(STATUS "Using CPU-only version of Kineto") | |
else() | |
set(LIBKINETO_NOCUPTI OFF CACHE STRING "") | |
message(STATUS "Using Kineto with CUPTI support") | |
if(USE_ROCM) | |
if(NOT ENV{ROCM_SOURCE_DIR}) | |
set(ENV{ROCM_SOURCE_DIR} "/opt/rocm") | |
endif() | |
set(LIBKINETO_NOROCTRACER OFF CACHE STRING "") | |
message(STATUS "Using Kineto with Roctracer support") | |
endif() | |
if(USE_CUDA) | |
set(LIBKINETO_NOCUPTI OFF CACHE STRING "") | |
message(STATUS "Using Kineto with CUPTI support") | |
endif() | |
# Only one tracing library can be used. | |
if(NOT LIBKINETO_NOCUPTI) | |
set(LIBKINETO_NOROCTRACER ON CACHE STRING "" FORCE) | |
if(NOT USE_CUDA) | |
message(WARNING "Kineto tracing requested CUPIT but USE_CUDA not set) | |
set(LIBKINETO_NOCUPTI ON CACHE STRING "" FORCE) | |
else() | |
message(STATUS "Using Kineto with CUPTI support") | |
endif() | |
elif(LIBKINETO_NOROCTRACER) | |
set(LIBKINETO_NOCUPTI ON CACHE STRING "" FORCE) | |
if(NOT USE_CUDA) | |
message(WARNING "Kineto tracing requested ROCTRACER but USE_ROCM not set) | |
set(LIBKINETO_NOROCTRACER ON CACHE STRING "" FORCE) | |
else() | |
message(STATUS "Using Kineto with Roctracer support") | |
endif() | |
else() | |
message(STATUS "Using CPU-only version of Kineto") | |
endif() |
I think I have this adequately cleaned up now.
|
@walterddr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@walterddr merged this pull request in e66015d. |
Thank you. |
Summary: Fixes pytorch#58399 CMake changes to allow kineto to build with rocm support. Pull Request resolved: pytorch#58401 Reviewed By: mruberry Differential Revision: D28479807 Pulled By: walterddr fbshipit-source-id: fc01f05b2a5592ee1d1dbd71d2d4f7aec1bd74f7
Summary: Fixes pytorch#58399 CMake changes to allow kineto to build with rocm support. Pull Request resolved: pytorch#58401 Reviewed By: mruberry Differential Revision: D28479807 Pulled By: walterddr fbshipit-source-id: fc01f05b2a5592ee1d1dbd71d2d4f7aec1bd74f7
Summary: Fixes pytorch#58399 CMake changes to allow kineto to build with rocm support. Pull Request resolved: pytorch#58401 Reviewed By: mruberry Differential Revision: D28479807 Pulled By: walterddr fbshipit-source-id: fc01f05b2a5592ee1d1dbd71d2d4f7aec1bd74f7
Summary: Issue #426 This adds a RoctracerActivityInterface and hooks it into ActivityProfiler (when a rocm build is detected). Uses GenericTraceActivity records to avoid muddying up the waters. Build support in pytorch is already upstreamed - pytorch/pytorch#58401 Pull Request resolved: #427 Reviewed By: briancoutinho, robieta Differential Revision: D31096888 Pulled By: gdankel fbshipit-source-id: f6295f7c152e7a972511b657107da75d30a9e108
Fixes #58399
CMake changes to allow kineto to build with rocm support.