Page MenuHomePhabricator

[libomptarget] Refactor macros from omptarget-nvptx to inline functions
AbandonedPublic

Authored by JonChesterfield on Aug 27 2019, 4:37 PM.

Details

Summary

[libomptarget] Refactor macros from omptarget-nvptx to inline functions

Removes architecture dependent macros from omptarget-nvptx.h in favour of inline
functions in target-impl.h. Inlines the sole use of #define __SYNCTHREADS_N.

Uses explicit types in preference to the int / unsigned of the cuda API.

An alternative to the inline #ifdef would be:
#if CUDA_VERSION >= 9000
#include "cuda_9/target_impl.h"
#else
#include "tbd/target_impl.h"
#endif

Tested by disassembling libomptarget-nvptx.a with and without. No codegen change.

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptAug 27 2019, 4:37 PM
Herald added a subscriber: dexonsmith. · View Herald Transcript

I'm not sure about the signal to noise of the C style __kmpc_impl namespace. Perhaps C++ namespace __kmpc_impl {} and using namespace __kmpc_impl; in the source would be better?

I would suggest to split this patch into several small patches, one for syncwarp, one for activemask, etc.

openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
78

This is wrong, just call __syncthreads() here.

  • from review - fix bug when compiling with clang
JonChesterfield marked 2 inline comments as done.Aug 27 2019, 5:48 PM

Will split as recommended

openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
78

Ouch. Thanks

ABataev added inline comments.Aug 27 2019, 6:04 PM
openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
86

No need for return here, just function call

JonChesterfield marked an inline comment as done.Aug 27 2019, 6:06 PM
JonChesterfield marked 2 inline comments as done.Aug 27 2019, 6:24 PM
JonChesterfield added inline comments.
openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
86

Yep, thanks

JonChesterfield abandoned this revision.Aug 27 2019, 6:59 PM
JonChesterfield marked an inline comment as done.

Abandoning in favour of split revisions (one per macro)