[OpenMP] Remove compilation warning when using clang to compile bc files.
Needs ReviewPublic

Authored by guansong on Wed, Apr 11, 10:21 AM.

Details

Reviewers
grokos
Summary

Minor printf format correction. NVCC ignore those. Clang will give warning on these if debug is enabled.

Diff Detail

guansong created this revision.Wed, Apr 11, 10:21 AM
guansong retitled this revision from Remove compilation warning when using clang to compile bc files. to [OpenMP] Remove compilation warning when using clang to compile bc files..Wed, Apr 11, 10:28 AM
grokos added inline comments.Thu, Apr 12, 1:33 PM
libomptarget/deviceRTLs/nvptx/src/counter_groupi.h
48

The P64 macro casts its argument to an unsigned long long. Use %llu instead of %lld.

libomptarget/deviceRTLs/nvptx/src/libcall.cu
195

Chunk is defined as unsigned long long, shouldn't the modifier be %llu?

264

Same here, %llu?

libomptarget/deviceRTLs/nvptx/src/loop.cu
304

For consistency with the rest of libomptarget, use the PRId64 macro to print an int64_t:

"dispatch init (static chunk) : num threads = %d, ub = %" PRId64 ","
325

PRId64

340–341

Same here, PRId64 for ub, %llu for chunk.

libomptarget/deviceRTLs/nvptx/src/supporti.h
180–182

size_t can be printed in a portable way with the %zu modifier. For the nvptx RTL this is not a problem, but in general size_t is not guaranteed to be a long int, so usage of %zu is preferred. This is what we use in the rest of libomptarget. Can you use %zu here as well for consistency?

guansong updated this revision to Diff 142451.Fri, Apr 13, 11:53 AM

update as suggested (except PRId64)

update as suggested (except PRId64)

Isn't PRId64 working?

update as suggested (except PRId64)

Isn't PRId64 working?

I did not find any usage of it on the device side. Not confident on using it. We can remove the warning first, and improve the format later?

libomptarget/deviceRTLs/nvptx/src/counter_groupi.h
48

Ok, will also update the other two lld to llu.

libomptarget/deviceRTLs/nvptx/src/libcall.cu
195

sure

264

Yes

libomptarget/deviceRTLs/nvptx/src/loop.cu
304

Not sure what is this. I can not find this PRId64 example.

libomptarget/deviceRTLs/nvptx/src/supporti.h
180–182

sure.

update as suggested (except PRId64)

Isn't PRId64 working?

I did not find any usage of it on the device side. Not confident on using it. We can remove the warning first, and improve the format later?

It's not giving a warning because in CUDA long long int happens to be 64 bits long, so you can use %lld to print an int64_t. This is not guaranteed to be true on every platform, therefore the portable way to print an int64_t is using the PRId64 macro. That's what we do in the base library.

update as suggested (except PRId64)

Isn't PRId64 working?

I did not find any usage of it on the device side. Not confident on using it. We can remove the warning first, and improve the format later?

It's not giving a warning because in CUDA long long int happens to be 64 bits long, so you can use %lld to print an int64_t. This is not guaranteed to be true on every platform, therefore the portable way to print an int64_t is using the PRId64 macro. That's what we do in the base library.

My understanding of PRId64 is just a pretty format for lld, which I did not master yet, maybe you can help me to understand those two macros better.

But not sure why %lld can be wrong with long long int, this is standard c/c++ which cuda is based on?

update as suggested (except PRId64)

Isn't PRId64 working?

I did not find any usage of it on the device side. Not confident on using it. We can remove the warning first, and improve the format later?

It's not giving a warning because in CUDA long long int happens to be 64 bits long, so you can use %lld to print an int64_t. This is not guaranteed to be true on every platform, therefore the portable way to print an int64_t is using the PRId64 macro. That's what we do in the base library.

My understanding of PRId64 is just a pretty format for lld, which I did not master yet, maybe you can help me to understand those two macros better.

But not sure why %lld can be wrong with long long int, this is standard c/c++ which cuda is based on?

long long int is printed via %lld, but the value you are printing here is not a long long int, it's an int64_t. int64_t is printed via PRId64. int64_t is not necessarily equal to long long int.

PRId64 is not a pretty format for long long int, it's a macro that expands to the correct modifier for int64_t. long long int just happens to be 64 bits long in CUDA, but it's not guaranteed to be 64 bits everywhere. You don't get any compiler warnings here because in CUDA int64_t is typedef-ed as long long int and, consequently, PRId64 expands to lld. That doesn't mean that int64_t can be treated as a long long int everywhere.

E.g. some day we may reuse this code for another GPU architecture (we had a discussion a while ago about creating a GPU-agnostic RTL which then nvptx/amdgcn/some_other_gpu will specialize). If int64_t is defined on that other architecture as long int for instance, then %lld will trigger a warning. If, instead, we use the PRId64 macro, it will expand to the correct ld automatically.

I checked again on the format you suggested. To use it, you need #include <inttypes.h> For example the macro was defined like this in my version of the inttypes.h

# if __WORDSIZE == 64
#  define __PRI64_PREFIX        "l"
#  define __PRIPTR_PREFIX       "l"
# else
#  define __PRI64_PREFIX       "ll"
#  define __PRIPTR_PREFIX
# endif

# define PRId8 "d"
# define PRId16 "d"
# define PRId32 "d"
# define PRId64 __PRI64_PREFIX "d"

But the header causes a compilation issue for cuda clang. I am not sure how to solve that problem.

I checked again on the format you suggested. To use it, you need #include <inttypes.h> For example the macro was defined like this in my version of the inttypes.h

# if __WORDSIZE == 64
#  define __PRI64_PREFIX        "l"
#  define __PRIPTR_PREFIX       "l"
# else
#  define __PRI64_PREFIX       "ll"
#  define __PRIPTR_PREFIX
# endif

# define PRId8 "d"
# define PRId16 "d"
# define PRId32 "d"
# define PRId64 __PRI64_PREFIX "d"

But the header causes a compilation issue for cuda clang. I am not sure how to solve that problem.

I mean the header file inttypes.h will causes issue for cuda clang, (it looks like it uses some features cuda does not support.)