This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] CUDA has no device-side library builtins.
ClosedPublic

Authored by tra on Jan 19 2018, 2:40 PM.

Details

Summary

We should (almost) never consider a device-side declaration to match a
builtin. If we do, the un-inlined device-side functions provided by
CUDA headers that ship with clang may be ignored. We may end up emitting
as a call to a llvm intrinsic which would typically be lowered as
an external library call. This results in a back-end error because NVPTX
back-end does not support it.

Event Timeline

tra created this revision.Jan 19 2018, 2:40 PM

How does this affect e.g. calling memcpy()? There isn't a standard library implementation of this on nvptx, but we do want calls to memcpy() to be lowered to llvm.memcpy so that they can be optimized.

tra added a comment.Jan 22 2018, 2:55 PM

How does this affect e.g. calling memcpy()? There isn't a standard library implementation of this on nvptx, but we do want calls to memcpy() to be lowered to llvm.memcpy so that they can be optimized.

We implement memcpy as a call to __builtin_memcpy() which gets code-gen-ed as usual. NVPTX also lowers all memcpy/memset/memmove as loads/stores, so those don't need external library. This behavior is not affected by this patch.

This patch's goal is to prevent clang codegen-ing its idea of the library builtin function while ignoring the implementation we've provided in the headers for device side.

Original issue I had was triggered by code roughly similar to this:

extern "C" __device__ int logf(float a) { return __nv_logf(a); }
__global__ void kernel() { logf(0.0f); }

In the AST, the kernel was calling the logf functions above However, when clang generated code, it considered that logf is a library builtin with known semantics and happily codegen'ed a call to @llvm.log.f32, which NVPTX back-end has no way to lower. The patch adds a safety net in clang so it does not generate code for builtins which we have disabled (or can't handle) in NVPTX.

jlebar accepted this revision.Jan 22 2018, 4:12 PM

Got it, thanks for the explanation.

This revision is now accepted and ready to land.Jan 22 2018, 4:12 PM
This revision was automatically updated to reflect the committed changes.