Details

Reviewers

Commits

rGad6c26516b86: [OpenMP] Remove compilation warning when using clang to compile bc files.
rOMP330944: [OpenMP] Remove compilation warning when using clang to compile bc files.
rL330944: [OpenMP] Remove compilation warning when using clang to compile bc files.

Summary

Minor printf format corrections. NVCC ignores these mismatches; Clang warns about them if debug is enabled.

Diff Detail

Event Timeline

guansong created this revision. Apr 11 2018, 10:21 AM

guansong retitled this revision from "Remove compilation warning when using clang to compile bc files." to "[OpenMP] Remove compilation warning when using clang to compile bc files.". Apr 11 2018, 10:28 AM

grokos added inline comments. Apr 12 2018, 1:33 PM

libomptarget/deviceRTLs/nvptx/src/counter_groupi.h
48	The `P64` macro casts its argument to an `unsigned long long`. Use `%llu` instead of `%lld`.
libomptarget/deviceRTLs/nvptx/src/libcall.cu
195	Chunk is defined as `unsigned long long`, shouldn't the modifier be `%llu`?
264	Same here, `%llu`?
libomptarget/deviceRTLs/nvptx/src/loop.cu
303–304	For consistency with the rest of libomptarget, use the `PRId64` macro to print an `int64_t`: "dispatch init (static chunk) : num threads = %d, ub = %" PRId64 ","
324–325	`PRId64`
340–341	Same here, `PRId64` for ub, `%llu` for chunk.
libomptarget/deviceRTLs/nvptx/src/supporti.h
180–182	`size_t` can be printed in a portable way with the `%zu` modifier. For the nvptx RTL this is not a problem, but in general `size_t` is not guaranteed to be a long int, so usage of `%zu` is preferred. This is what we use in the rest of libomptarget. Can you use `%zu` here as well for consistency?
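
The two modifiers requested in the inline comments above can be illustrated with a minimal host-side C++ sketch. The helper name `describe_alloc` and its message text are hypothetical and do not come from libomptarget; the sketch only assumes a `size_t` value and an `unsigned long long` value, as described in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of libomptarget): formats a size_t with %zu
// and an unsigned long long with %llu, matching the reviewer's suggestions.
std::string describe_alloc(std::size_t size, unsigned long long chunk) {
  char buf[64];
  // %zu is the portable conversion for size_t; %llu matches unsigned long long.
  int n = std::snprintf(buf, sizeof(buf), "size %zu, chunk %llu", size, chunk);
  // Guard against encoding errors and truncation.
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}
```

Using `%d` for the `size_t` argument or `%lld` for the `unsigned long long` argument here is the kind of mismatch that Clang's format checking reports.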

update as suggested (except PRId64)

In D45528#1067487, @guansong wrote:

update as suggested (except PRId64)

Isn't PRId64 working?

In D45528#1067520, @grokos wrote:

Isn't PRId64 working?

I did not find any usage of it on the device side, so I am not confident about using it. Can we remove the warning first and improve the format later?

libomptarget/deviceRTLs/nvptx/src/counter_groupi.h
48	Ok, will also update the other two lld to llu.
libomptarget/deviceRTLs/nvptx/src/libcall.cu
195	sure
264	Yes
libomptarget/deviceRTLs/nvptx/src/loop.cu
303–304	Not sure what this is. I cannot find this PRId64 example.
libomptarget/deviceRTLs/nvptx/src/supporti.h
180–182	sure.

In D45528#1067615, @guansong wrote:

I did not find any usage of it on the device side, so I am not confident about using it. Can we remove the warning first and improve the format later?

It's not giving a warning because in CUDA long long int happens to be 64 bits long, so you can use %lld to print an int64_t. This is not guaranteed to be true on every platform, therefore the portable way to print an int64_t is using the PRId64 macro. That's what we do in the base library.

In D45528#1067671, @grokos wrote:

It's not giving a warning because in CUDA long long int happens to be 64 bits long, so you can use %lld to print an int64_t. This is not guaranteed to be true on every platform, therefore the portable way to print an int64_t is using the PRId64 macro. That's what we do in the base library.

My understanding is that PRId64 is just a pretty format for lld, which I have not mastered yet. Maybe you can help me understand those two macros better.

But I am not sure why %lld can be wrong with long long int. Isn't this standard C/C++, which CUDA is based on?

In D45528#1068356, @guansong wrote:

My understanding is that PRId64 is just a pretty format for lld, which I have not mastered yet. Maybe you can help me understand those two macros better.

But I am not sure why %lld can be wrong with long long int. Isn't this standard C/C++, which CUDA is based on?

long long int is printed via %lld, but the value you are printing here is not a long long int, it's an int64_t. int64_t is printed via PRId64. int64_t is not necessarily equal to long long int.

PRId64 is not a pretty format for long long int, it's a macro that expands to the correct modifier for int64_t. long long int just happens to be 64 bits long in CUDA, but it's not guaranteed to be 64 bits everywhere. You don't get any compiler warnings here because in CUDA int64_t is typedef-ed as long long int and, consequently, PRId64 expands to lld. That doesn't mean that int64_t can be treated as a long long int everywhere.

E.g. some day we may reuse this code for another GPU architecture (we had a discussion a while ago about creating a GPU-agnostic RTL which then nvptx/amdgcn/some_other_gpu will specialize). If int64_t is defined on that other architecture as long int for instance, then %lld will trigger a warning. If, instead, we use the PRId64 macro, it will expand to the correct ld automatically.
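
The distinction drawn above between `int64_t` and `long long int` can be sketched in host-side C++ as follows. The names `kInt64IsLongLong` and `format_ub` are hypothetical and exist only for illustration. The constant reports which case applies on the platform compiling the sketch, and the format string relies on `PRId64` instead of a hard-coded `%lld` or `%ld`.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <type_traits>

// True where int64_t is a typedef for long long (as the thread reports for
// CUDA); false where it is another type, e.g. long on typical 64-bit Linux.
constexpr bool kInt64IsLongLong = std::is_same<std::int64_t, long long>::value;

// Hypothetical helper: prints an int64_t portably. PRId64 expands to the
// conversion that matches int64_t on the current platform.
std::string format_ub(std::int64_t ub) {
  char buf[48];
  int n = std::snprintf(buf, sizeof(buf), "ub = %" PRId64, ub);
  // Guard against encoding errors and truncation.
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}
```

The output is the same whichever underlying type `int64_t` maps to, which is why the macro avoids format warnings when the same code is compiled for a platform with a different typedef.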

I checked the format you suggested again. To use it, you need #include <inttypes.h>. For example, the macro is defined like this in my version of inttypes.h:

# if __WORDSIZE == 64
#  define __PRI64_PREFIX        "l"
#  define __PRIPTR_PREFIX       "l"
# else
#  define __PRI64_PREFIX       "ll"
#  define __PRIPTR_PREFIX
# endif

# define PRId8 "d"
# define PRId16 "d"
# define PRId32 "d"
# define PRId64 __PRI64_PREFIX "d"

But the header causes a compilation issue for cuda clang. I am not sure how to solve that problem.

I mean the header file inttypes.h causes issues for CUDA clang (it looks like it uses some features CUDA does not support).

With the include placed in the proper position, I can use PRId64 now.

Use PRIu64 for chunk
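
How `PRId64` and `PRIu64` combine with the surrounding literal can be sketched in host-side C++. The message is shortened from the `dispatch init (dyn)` format in this revision; the names `kDynFormat` and `format_dispatch` are hypothetical, and the sketch assumes the chunk value is a `uint64_t`.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Adjacent string literals are concatenated by the compiler, so each macro
// splices its length modifier and conversion letter after the bare '%'.
static const char kDynFormat[] =
    "num threads = %d, ub = %" PRId64 ", chunk %" PRIu64;

// Hypothetical helper: formats the three values with the combined string.
std::string format_dispatch(int num_threads, std::int64_t ub,
                            std::uint64_t chunk) {
  char buf[128];
  int n = std::snprintf(buf, sizeof(buf), kDynFormat, num_threads, ub, chunk);
  // Guard against encoding errors and truncation.
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf);
}
```

In C++11 and later, keep a space between the closing quote and the macro name (`"%" PRId64`), because `"%"PRId64` is parsed as a user-defined literal suffix.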

Looks good.

This revision is now accepted and ready to land. Apr 25 2018, 10:52 PM

Closed by commit rL330944: [OpenMP] Remove compilation warning when using clang to compile bc files. (authored by guansong). Apr 26 2018, 7:10 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. Apr 26 2018, 7:10 AM

Diff 144057

libomptarget/deviceRTLs/nvptx/src/counter_groupi.h

Show All 39 Lines	INLINE Counter omptarget_nvptx_CounterGroup::Next() {
PRINT(LD_SYNCD, "next event counter 0x%llx with val %lld->%lld\n",		PRINT(LD_SYNCD, "next event counter 0x%llx with val %lld->%lld\n",
P64(&v_event), P64(oldVal), P64(oldVal + 1));		P64(&v_event), P64(oldVal), P64(oldVal + 1));

return oldVal;		return oldVal;
}		}

// set priv to n, to be used in later waitOrRelease		// set priv to n, to be used in later waitOrRelease
INLINE void omptarget_nvptx_CounterGroup::Complete(Counter &priv, Counter n) {		INLINE void omptarget_nvptx_CounterGroup::Complete(Counter &priv, Counter n) {
PRINT(LD_SYNCD, "complete priv counter 0x%llx with val %lld->%lld (+%d)\n",		PRINT(LD_SYNCD, "complete priv counter 0x%llx with val %llu->%llu (+%llu)\n",
P64(&priv), P64(priv), P64(priv + n), n);		P64(&priv), P64(priv), P64(priv + n), n);
priv += n;		priv += n;
}		}

INLINE void omptarget_nvptx_CounterGroup::Release(Counter priv,		INLINE void omptarget_nvptx_CounterGroup::Release(Counter priv,
Counter current_event_value) {		Counter current_event_value) {
if (priv - 1 == current_event_value) {		if (priv - 1 == current_event_value) {
PRINT(LD_SYNCD, "Release start counter 0x%llx with val %lld->%lld\n",		PRINT(LD_SYNCD, "Release start counter 0x%llx with val %lld->%lld\n",
Show All 26 Lines

libomptarget/deviceRTLs/nvptx/src/libcall.cu

Show First 20 Lines • Show All 49 Lines • ▼ Show 20 Lines
EXTERN int omp_get_max_threads(void) {		EXTERN int omp_get_max_threads(void) {
omptarget_nvptx_TaskDescr *currTaskDescr = getMyTopTaskDescriptor();		omptarget_nvptx_TaskDescr *currTaskDescr = getMyTopTaskDescriptor();
int rc = 1; // default is 1 thread avail		int rc = 1; // default is 1 thread avail
if (!currTaskDescr->InParallelRegion()) {		if (!currTaskDescr->InParallelRegion()) {
// not currently in a parallel region... all are available		// not currently in a parallel region... all are available
rc = GetNumberOfProcsInTeam();		rc = GetNumberOfProcsInTeam();
ASSERT0(LT_FUSSY, rc >= 0, "bad number of threads");		ASSERT0(LT_FUSSY, rc >= 0, "bad number of threads");
}		}
PRINT(LD_IO, "call omp_get_max_threads() return %\n", rc);		PRINT(LD_IO, "call omp_get_max_threads() return %d\n", rc);
return rc;		return rc;
}		}

EXTERN int omp_get_thread_limit(void) {		EXTERN int omp_get_thread_limit(void) {
// per contention group.. meaning threads in current team		// per contention group.. meaning threads in current team
omptarget_nvptx_TaskDescr *currTaskDescr = getMyTopTaskDescriptor();		omptarget_nvptx_TaskDescr *currTaskDescr = getMyTopTaskDescriptor();
int rc = currTaskDescr->ThreadLimit();		int rc = currTaskDescr->ThreadLimit();
PRINT(LD_IO, "call omp_get_thread_limit() return %d\n", rc);		PRINT(LD_IO, "call omp_get_thread_limit() return %d\n", rc);
▲ Show 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	if (level <= totLevel) {
ASSERT0(LT_FUSSY, currTaskDescr,		ASSERT0(LT_FUSSY, currTaskDescr,
"do not expect fct to be called in a non-active thread");		"do not expect fct to be called in a non-active thread");
do {		do {
if (DON(LD_IOD)) {		if (DON(LD_IOD)) {
// print current state		// print current state
omp_sched_t sched = currTaskDescr->GetRuntimeSched();		omp_sched_t sched = currTaskDescr->GetRuntimeSched();
PRINT(LD_ALL,		PRINT(LD_ALL,
"task descr %s %d: %s, in par %d, dyn %d, rt sched %d,"		"task descr %s %d: %s, in par %d, dyn %d, rt sched %d,"
" chunk %lld; tid %d, tnum %d, nthreads %d\n",		" chunk %" PRIu64 "; tid %d, tnum %d, nthreads %d\n",
"ancestor", steps,		"ancestor", steps,
(currTaskDescr->IsParallelConstruct() ? "par" : "task"),		(currTaskDescr->IsParallelConstruct() ? "par" : "task"),
currTaskDescr->InParallelRegion(), currTaskDescr->IsDynamic(),		currTaskDescr->InParallelRegion(), currTaskDescr->IsDynamic(),
sched, currTaskDescr->RuntimeChunkSize(),		sched, currTaskDescr->RuntimeChunkSize(),
currTaskDescr->ThreadId(), currTaskDescr->ThreadsInTeam(),		currTaskDescr->ThreadId(), currTaskDescr->ThreadsInTeam(),
currTaskDescr->NThreads());		currTaskDescr->NThreads());
}		}

▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines

EXTERN void omp_set_schedule(omp_sched_t kind, int modifier) {		EXTERN void omp_set_schedule(omp_sched_t kind, int modifier) {
PRINT(LD_IO, "call omp_set_schedule(sched %d, modif %d)\n", (int)kind,		PRINT(LD_IO, "call omp_set_schedule(sched %d, modif %d)\n", (int)kind,
modifier);		modifier);
if (kind >= omp_sched_static && kind < omp_sched_auto) {		if (kind >= omp_sched_static && kind < omp_sched_auto) {
omptarget_nvptx_TaskDescr *currTaskDescr = getMyTopTaskDescriptor();		omptarget_nvptx_TaskDescr *currTaskDescr = getMyTopTaskDescriptor();
currTaskDescr->SetRuntimeSched(kind);		currTaskDescr->SetRuntimeSched(kind);
currTaskDescr->RuntimeChunkSize() = modifier;		currTaskDescr->RuntimeChunkSize() = modifier;
PRINT(LD_IOD, "omp_set_schedule did set sched %d & modif %d\n",		PRINT(LD_IOD, "omp_set_schedule did set sched %d & modif %" PRIu64 "\n",
(int)currTaskDescr->GetRuntimeSched(),		(int)currTaskDescr->GetRuntimeSched(),
currTaskDescr->RuntimeChunkSize());		currTaskDescr->RuntimeChunkSize());
}		}
}		}

EXTERN omp_proc_bind_t omp_get_proc_bind(void) {		EXTERN omp_proc_bind_t omp_get_proc_bind(void) {
PRINT0(LD_IO, "call omp_get_proc_bin() is true, regardless on state\n");		PRINT0(LD_IO, "call omp_get_proc_bin() is true, regardless on state\n");
return omp_proc_bind_true;		return omp_proc_bind_true;
▲ Show 20 Lines • Show All 190 Lines • Show Last 20 Lines

libomptarget/deviceRTLs/nvptx/src/loop.cu

Show First 20 Lines • Show All 234 Lines • ▼ Show 20 Lines	INLINE static void dispatch_init(kmp_sched_t schedule, T lb, T ub, ST st,
* various dynamic cases. (In paritcular, whether or not a stealing scheme		* various dynamic cases. (In paritcular, whether or not a stealing scheme
* is legal).		* is legal).
*/		*/
schedule = SCHEDULE_WITHOUT_MODIFIERS(schedule);		schedule = SCHEDULE_WITHOUT_MODIFIERS(schedule);

// Process schedule.		// Process schedule.
if (tnum == 1 || tripCount <= 1 || OrderedSchedule(schedule)) {		if (tnum == 1 || tripCount <= 1 || OrderedSchedule(schedule)) {
PRINT(LD_LOOP,		PRINT(LD_LOOP,
"go sequential as tnum=%d, trip count %lld, ordered sched=%d\n",		"go sequential as tnum=%ld, trip count %lld, ordered sched=%d\n",
tnum, P64(tripCount), schedule);		(long)tnum, P64(tripCount), schedule);
schedule = kmp_sched_static_chunk;		schedule = kmp_sched_static_chunk;
chunk = tripCount; // one thread gets the whole loop		chunk = tripCount; // one thread gets the whole loop

} else if (schedule == kmp_sched_runtime) {		} else if (schedule == kmp_sched_runtime) {
// process runtime		// process runtime
omp_sched_t rtSched = currTaskDescr->GetRuntimeSched();		omp_sched_t rtSched = currTaskDescr->GetRuntimeSched();
chunk = currTaskDescr->RuntimeChunkSize();		chunk = currTaskDescr->RuntimeChunkSize();
switch (rtSched) {		switch (rtSched) {
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	if (schedule == kmp_sched_static_chunk) {
ST stride;		ST stride;
T threadId = GetOmpThreadId(tid, isSPMDMode(), isRuntimeUninitialized());		T threadId = GetOmpThreadId(tid, isSPMDMode(), isRuntimeUninitialized());
int lastiter = 0;		int lastiter = 0;
ForStaticChunk(lastiter, lb, ub, stride, chunk, threadId, tnum);		ForStaticChunk(lastiter, lb, ub, stride, chunk, threadId, tnum);
// save computed params		// save computed params
omptarget_nvptx_threadPrivateContext->Chunk(tid) = chunk;		omptarget_nvptx_threadPrivateContext->Chunk(tid) = chunk;
omptarget_nvptx_threadPrivateContext->NextLowerBound(tid) = lb;		omptarget_nvptx_threadPrivateContext->NextLowerBound(tid) = lb;
omptarget_nvptx_threadPrivateContext->Stride(tid) = stride;		omptarget_nvptx_threadPrivateContext->Stride(tid) = stride;
PRINT(LD_LOOP,		PRINT(LD_LOOP,
"dispatch init (static chunk) : num threads = %d, ub = %lld,"		"dispatch init (static chunk) : num threads = %d, ub = %" PRId64 ","
"next lower bound = %lld, stride = %lld\n",		"next lower bound = %llu, stride = %llu\n",
GetNumberOfOmpThreads(tid, isSPMDMode(), isRuntimeUninitialized()),		GetNumberOfOmpThreads(tid, isSPMDMode(), isRuntimeUninitialized()),
omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid),		omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid),
omptarget_nvptx_threadPrivateContext->NextLowerBound(tid),		omptarget_nvptx_threadPrivateContext->NextLowerBound(tid),
omptarget_nvptx_threadPrivateContext->Stride(tid));		omptarget_nvptx_threadPrivateContext->Stride(tid));

} else if (schedule == kmp_sched_static_nochunk) {		} else if (schedule == kmp_sched_static_nochunk) {
ASSERT0(LT_FUSSY, chunk == 0, "bad chunk value");		ASSERT0(LT_FUSSY, chunk == 0, "bad chunk value");
// save ub		// save ub
omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid) = ub;		omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid) = ub;
// compute static chunk		// compute static chunk
ST stride;		ST stride;
T threadId = GetOmpThreadId(tid, isSPMDMode(), isRuntimeUninitialized());		T threadId = GetOmpThreadId(tid, isSPMDMode(), isRuntimeUninitialized());
int lastiter = 0;		int lastiter = 0;
ForStaticNoChunk(lastiter, lb, ub, stride, chunk, threadId, tnum);		ForStaticNoChunk(lastiter, lb, ub, stride, chunk, threadId, tnum);
// save computed params		// save computed params
omptarget_nvptx_threadPrivateContext->Chunk(tid) = chunk;		omptarget_nvptx_threadPrivateContext->Chunk(tid) = chunk;
omptarget_nvptx_threadPrivateContext->NextLowerBound(tid) = lb;		omptarget_nvptx_threadPrivateContext->NextLowerBound(tid) = lb;
omptarget_nvptx_threadPrivateContext->Stride(tid) = stride;		omptarget_nvptx_threadPrivateContext->Stride(tid) = stride;
PRINT(LD_LOOP,		PRINT(LD_LOOP,
"dispatch init (static nochunk) : num threads = %d, ub = %lld,"		"dispatch init (static nochunk) : num threads = %d, ub = %" PRId64 ","
"next lower bound = %lld, stride = %lld\n",		"next lower bound = %llu, stride = %llu\n",
GetNumberOfOmpThreads(tid, isSPMDMode(), isRuntimeUninitialized()),		GetNumberOfOmpThreads(tid, isSPMDMode(), isRuntimeUninitialized()),
omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid),		omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid),
omptarget_nvptx_threadPrivateContext->NextLowerBound(tid),		omptarget_nvptx_threadPrivateContext->NextLowerBound(tid),
omptarget_nvptx_threadPrivateContext->Stride(tid));		omptarget_nvptx_threadPrivateContext->Stride(tid));

} else if (schedule == kmp_sched_dynamic || schedule == kmp_sched_guided) {		} else if (schedule == kmp_sched_dynamic || schedule == kmp_sched_guided) {
if (chunk < 1)		if (chunk < 1)
chunk = 1;		chunk = 1;
Counter eventNum = ((tripCount - 1) / chunk) + 1; // number of chunks		Counter eventNum = ((tripCount - 1) / chunk) + 1; // number of chunks
// but each thread (but one) must discover that it is last		// but each thread (but one) must discover that it is last
eventNum += tnum;		eventNum += tnum;
omptarget_nvptx_threadPrivateContext->Chunk(tid) = chunk;		omptarget_nvptx_threadPrivateContext->Chunk(tid) = chunk;
omptarget_nvptx_threadPrivateContext->EventsNumber(tid) = eventNum;		omptarget_nvptx_threadPrivateContext->EventsNumber(tid) = eventNum;
PRINT(LD_LOOP,		PRINT(LD_LOOP,
"dispatch init (dyn) : num threads = %d, ub = %lld, chunk %lld, "		"dispatch init (dyn) : num threads = %d, ub = %" PRId64 ", chunk %" PRIu64 ", "
"events number = %lld\n",		"events number = %llu\n",
GetNumberOfOmpThreads(tid, isSPMDMode(), isRuntimeUninitialized()),		GetNumberOfOmpThreads(tid, isSPMDMode(), isRuntimeUninitialized()),
omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid),		omptarget_nvptx_threadPrivateContext->LoopUpperBound(tid),
omptarget_nvptx_threadPrivateContext->Chunk(tid),		omptarget_nvptx_threadPrivateContext->Chunk(tid),
omptarget_nvptx_threadPrivateContext->EventsNumber(tid));		omptarget_nvptx_threadPrivateContext->EventsNumber(tid));
}		}
}		}

////////////////////////////////////////////////////////////////////////////////		////////////////////////////////////////////////////////////////////////////////
▲ Show 20 Lines • Show All 422 Lines • Show Last 20 Lines

libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h

	Show All 13 Lines

	#ifndef __OMPTARGET_NVPTX_H			#ifndef __OMPTARGET_NVPTX_H
	#define __OMPTARGET_NVPTX_H			#define __OMPTARGET_NVPTX_H

	// std includes			// std includes
	#include <stdint.h>			#include <stdint.h>
	#include <stdlib.h>			#include <stdlib.h>

				#include <inttypes.h>

	// cuda includes			// cuda includes
	#include <cuda.h>			#include <cuda.h>
	#include <math.h>			#include <math.h>

	// local includes			// local includes
	#include "counter_group.h"			#include "counter_group.h"
	#include "debug.h" // debug			#include "debug.h" // debug
	#include "interface.h" // interfaces with omp, compiler, and user			#include "interface.h" // interfaces with omp, compiler, and user
	▲ Show 20 Lines • Show All 386 Lines • Show Last 20 Lines

libomptarget/deviceRTLs/nvptx/src/supporti.h

Show First 20 Lines • Show All 171 Lines • ▼ Show 20 Lines	INLINE unsigned long PadBytes(unsigned long size,
ASSERT(LT_FUSSY, (alignment & (alignment - 1)) == 0,		ASSERT(LT_FUSSY, (alignment & (alignment - 1)) == 0,
"alignment %ld is not a power of 2\n", alignment);		"alignment %ld is not a power of 2\n", alignment);
return (~(unsigned long)size + 1) & (alignment - 1);		return (~(unsigned long)size + 1) & (alignment - 1);
}		}

INLINE void *SafeMalloc(size_t size, const char *msg) // check if success		INLINE void *SafeMalloc(size_t size, const char *msg) // check if success
{		{
void *ptr = malloc(size);		void *ptr = malloc(size);
PRINT(LD_MEM, "malloc data of size %d for %s: 0x%llx\n", size, msg, P64(ptr));		PRINT(LD_MEM, "malloc data of size %zu for %s: 0x%llx\n", size, msg, P64(ptr));
ASSERT(LT_SAFETY, ptr, "failed to allocate %d bytes for %s\n", size, msg);		ASSERT(LT_SAFETY, ptr, "failed to allocate %zu bytes for %s\n", size, msg);
return ptr;		return ptr;
}		}

INLINE void *SafeFree(void *ptr, const char *msg) {		INLINE void *SafeFree(void *ptr, const char *msg) {
PRINT(LD_MEM, "free data ptr 0x%llx for %s\n", P64(ptr), msg);		PRINT(LD_MEM, "free data ptr 0x%llx for %s\n", P64(ptr), msg);
free(ptr);		free(ptr);
return NULL;		return NULL;
}		}

Show All 26 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Remove compilation warning when using clang to compile bc files.
ClosedPublic
