This is an archive of the discontinued LLVM Phabricator instance.

[nfc][libomptarget] Warp size aware logging by using builtinvar abstraction consistently. Last part of D69423
AbandonedPublic

Authored by JonChesterfield on Oct 27 2019, 8:16 AM.

Details

Summary

[nfc][libomptarget] Warp size aware logging by using builtinvar abstraction consistently. Last part of D69423

Uses functions from support.h to derive values used in logging from WARPSIZE.

Also updates the rest of the deviceRTL to use this existing abstraction everywhere.

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptOct 27 2019, 8:16 AM
  • Use functions from support.h instead of adding to target_impl
JonChesterfield edited the summary of this revision. (Show Details)Oct 27 2019, 8:35 AM
  • Use cuda builtin_var abstraction everywhere
JonChesterfield retitled this revision from [nfc][libomptarget] Warp size aware logging. Last part of D69423 to [nfc][libomptarget] Warp size aware logging by using builtinvar abstraction consistently. Last part of D69423.Oct 27 2019, 8:50 AM
JonChesterfield edited the summary of this revision. (Show Details)
JonChesterfield added a comment.EditedOct 27 2019, 8:55 AM

A subsequent patch will be proposed to move the six functions labelled 'Calls to the NVPTX layer' in support.h into target_impl.

That will allow me to delete the ugliest third of the api shim used by amdgcn to work around not having a cuda compiler. Code like:

#define __CUDA_DEVICE_BUILTIN(FIELD, INTRINSIC)                                \
  __declspec(property(get = __fetch_builtin_##FIELD)) unsigned int FIELD;      \
  static inline __attribute__((always_inline))                                 \
      __attribute__((device)) unsigned int __fetch_builtin_##FIELD(void) {     \
    return INTRINSIC;                                                          \
  }
//...
__CUDA_DEVICE_BUILTIN(x, __nvvm_read_ptx_sreg_tid_x());
//...
define i32 @__nvvm_read_ptx_sreg_tid_x() #2 {
  %id.i = tail call i32 @llvm.amdgcn.workitem.id.x() #3
  ret i32 %id.i
}
JonChesterfield marked an inline comment as done.Oct 27 2019, 9:00 AM
JonChesterfield added inline comments.
openmp/libomptarget/deviceRTLs/nvptx/src/supporti.h
109

LLVM transforms the mod with constant to mask with constant - 1 at O1.

ABataev added inline comments.Oct 27 2019, 11:59 AM
openmp/libomptarget/deviceRTLs/nvptx/src/supporti.h
126

Hmm, why here it is not ttansformed into -1?

JonChesterfield marked an inline comment as done.Oct 27 2019, 12:41 PM
JonChesterfield added inline comments.
openmp/libomptarget/deviceRTLs/nvptx/src/supporti.h
126

The ~. Change only converts the variable access into a function call.

Codegens as (blockDim.x - 1) & -32 which seems reasonable.

JonChesterfield abandoned this revision.Oct 27 2019, 1:29 PM

There's some collateral from this. Support.h is missing include guards and isn't safe to include multiple times. If guards are introduced, the inline functions that are defined in supporti.h generate warnings from Wundefined-inline. If supporti.h is appended to support, it turns out to be missing a number of headers.

I'm going detour to fix up some header dependencies then recreate an equivalent diff to this one.

Fixing the headers is disproportionately expensive and will make a mess of the source dir. Lots of little headers.

Posted an alternate to openmp-dev suggesting we implement functions in source files instead, working around nvcc lacking lto with a unity build.