This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining
ClosedPublic

Authored by gtbercea on Jan 4 2018, 4:50 AM.

Details

Summary

Different NVIDIA GPUs support different compute capabilities. To enable the inlining of runtime functions and the best performance on different generations of NVIDIA GPUs, a bc library for each compute capability needs to be compiled. The same compiler build will then be usable in conjunction with multiple generations of NVIDIA GPUs.
To differentiate between versions of the same bc lib, the output file name will contain the compute capability ID.
Depends on D14254

Event Timeline

gtbercea created this revision.Jan 4 2018, 4:50 AM

Does the bitcode lib work with upstream Clang trunk?

Please rebase on top of the lastest changes in D14254.

Do we want to rename LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY to LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES (plural) to reflect that the user can pass in a list? In any case, please document in README.rst!

Hahnfeld requested changes to this revision.Jan 28 2018, 2:25 AM
This revision now requires changes to proceed.Jan 28 2018, 2:25 AM
gtbercea updated this revision to Diff 133075.Feb 6 2018, 2:06 PM

Rebase on new master.

Hahnfeld requested changes to this revision.Feb 7 2018, 9:10 AM
Hahnfeld added inline comments.
libomptarget/deviceRTLs/nvptx/CMakeLists.txt
63–64

This should definitely stay a CACHE variable. And I still propose to rename it to _CAPABILITIES...

203

Do we want to use eg 35 as suffix or full sm_35?

This revision now requires changes to proceed.Feb 7 2018, 9:10 AM
gtbercea added inline comments.Feb 7 2018, 10:51 AM
libomptarget/deviceRTLs/nvptx/CMakeLists.txt
63–64

Ah so you want me to keep this line. Sure I'll reinstate it.

Regarding the name ... this would lead to many people having to change their compiler build scripts. I'd like to avoid that.

203

35

gtbercea added inline comments.Feb 7 2018, 10:53 AM
libomptarget/deviceRTLs/nvptx/CMakeLists.txt
203

Actually I don't care which one we use. Depending on what we choose here I'll make clang agree with it :)

gtbercea updated this revision to Diff 133262.Feb 7 2018, 11:16 AM

Reinstate cache variable.

gtbercea marked an inline comment as done.Feb 7 2018, 11:16 AM

On whatever name we decide, please document in README.rst that the user can pass multiple values.

libomptarget/deviceRTLs/nvptx/CMakeLists.txt
63–64

Maybe we can add some compatibility? Upstream only has it since a few days and others will need to change their scripts anyway.

set(default_capabilities 35)
if (DEFINED LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY)
  set(default_capabilities ${LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY})
  libomptarget_warning_say("LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY is deprecated, please use LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES")
endif()
set(LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES ${default_capabilities} CACHE STRING
  "List of CUDA Compute Capabilities to be used to compile the NVPTX device RTL.")
203

I agree: The user will probably never see the name anyway. So let's just keep it compatible.

gtbercea updated this revision to Diff 133291.Feb 7 2018, 1:23 PM

Fix documentation.

gtbercea marked an inline comment as done.Feb 7 2018, 1:25 PM
gtbercea updated this revision to Diff 133419.Feb 8 2018, 7:17 AM
gtbercea marked an inline comment as done.

Fix documentation.

Forgot to add @grokos as a reviewer. Fixed now.

grokos added inline comments.Feb 8 2018, 7:49 AM
README.rst
283–286

Maybe add a comment here that the single-capability option is deprecated and the user should use the next option to define compute capabilities (even if they want to define only one)?

gtbercea updated this revision to Diff 133426.Feb 8 2018, 7:53 AM

Add deprecated message to documentation.

Hahnfeld added inline comments.Feb 8 2018, 7:56 AM
README.rst
283–286

I think we should just remove it: There was no (upstream) release with this variable and the only code checking for it is there for compatibility.

gtbercea marked an inline comment as done.Feb 8 2018, 7:56 AM
gtbercea updated this revision to Diff 133427.Feb 8 2018, 7:58 AM

Remove LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY

gtbercea marked an inline comment as done.Feb 8 2018, 7:59 AM
grokos accepted this revision.Feb 8 2018, 3:14 PM
hintonda removed a subscriber: hintonda.Feb 9 2018, 2:46 PM
Hahnfeld accepted this revision.Feb 12 2018, 3:16 AM

Looks good, thanks.

Generally you can't expect me to reply immediately...

This revision is now accepted and ready to land.Feb 12 2018, 3:16 AM
gtbercea updated this revision to Diff 133870.Feb 12 2018, 8:43 AM

Change name to include full compute capability name not just the number.

This revision was automatically updated to reflect the committed changes.