Page MenuHomePhabricator

Sema: Accept pointers to any address space for builtin functions

Authored by tstellarAMD on Mar 5 2015, 7:28 AM.



As long as they don't have an address space explicitly defined.

This allows builtins with pointer arguments to be used with OpenCL.

Diff Detail


Event Timeline

tstellarAMD retitled this revision from to Sema: Accept pointers to any address space for builtin functions.
tstellarAMD updated this object.
tstellarAMD edited the test plan for this revision. (Show Details)
tstellarAMD set the repository for this revision to rL LLVM.
tstellarAMD added a subscriber: Unknown Object (MLST).
tstellarAMD edited subscribers, added: Unknown Object (MLST); removed: Unknown Object (MLST).

Shall TODO/comment be added somewhere saying that handling address space number of pointers in builtins is still to be implemented?

Also what the number should be for OpenCL? Should it be taken from LangAS::ID enum?


Not sure if this cast might create a problem in some OpenCL-GPU architectures, because spec generally disallows conversion between constant and any other address spaces (see OpenCL C v2.0 s6.5.5).

I feel like a better way to handle this would be to create separate builtins overloads for constant and generic address space in OpenCL v2.0 and for all address spaces in OpenCL <v2.0. But this seems more work to me.

Shall TODO/comment be added somewhere saying that handling address space number of pointers in builtins is still to be implemented?

This appears to be implemented already, but there are no builtins using it. What does need to be done is add a way to
specify a named address space, like constant or local, because right now the builtins only accept integer address spaces.


I'm not sure about the cast issue. I do see that for memcpy addrspacecast IR instructions are emitted.

I'm open to adding separate overloads, I just wanted to try a generic solution first.

Anastasia added inline comments.Mar 5 2015, 9:43 AM

Could you please look into this. The solution doesn't seem generic to me, because in OpenCL conversions from constant to other address spaces are not allowed. Which means this would most likely be a problem to handle correctly on the later compiler steps. In Clang however, we can avoid this issue by creating different overloads for such builtins and calling directly the right overload instead of having one overload and adding conversions to match it.

Anastasia added inline comments.Mar 5 2015, 10:09 AM

not sure how to align this with C, but for OpenCL code we would need something like:

if address space of ptr not given, create multiple signatures of the corresponding builtin.

For example for this builtin:

BUILTIN(builtin_test, "vcC*" , "nc"), Clang would have to add the following to the list of known declarations in CL2.0:

void builtin_test(generic const char* ptr);
void builtin_test(constant const char* ptr);

and for all earlier OpenCL versions:

void builtin_test(local const char* ptr);
void builtin_test(global const char* ptr);
void builtin_test(private const char* ptr);
void builtin_test(constant const char* ptr);

However, your fix might be acceptable for C.

Here is an updated patch with an entirely different approach. Instead of inserting addressspace casts, this patch rewrites the function declaration to match the address space of the call arguments.

theraven edited edge metadata.Mar 18 2015, 1:03 AM

Much better approach, a few comments inline.


We need something similar for C. We're already using (in an out-of-tree target) the specified-address-space notation for some things that only apply in one address space, but this causes some issues. For example, we want all of the atomics builtins to support pointers in multiple address spaces, but will need to emit very different machine code (and therefore, different IR) for atomics in different address spaces.

Address space casts have well-defined semantics on our architecture and are not always permitted for arbitrary pointers, so just casting to a different AS would work.


Why not a SmallVector? I don't think this will ever have more than 4 entries (or is it 6 for the atomic compare and exchange?). A SmallVector<QualType, 8> should avoid any heap allocation.


Why not use FT->param_types()?


This check looks expensive. Have you profiled this on large codebases with a lot of builtin calls? I'd imagine that adding a DenseMap cache would be faster.


Is it worth having a fast path for builtins that don't have any pointer params?


I think the fast-path for builtins without pointers would help here. What keys/data were you thinking would go in the DenseMap?

Anastasia added inline comments.Mar 18 2015, 10:20 AM

Should address space of return type also be taken into account?


typo: arguemnts -> arguments


remove empty line


You need "expected-no-diagnostics" only once in the file (for example after RUN line).


remove empty line

tstellarAMD edited edge metadata.

Hi, here is an updated patch. I've addressed all comments except I did not add a DensMap to rewriteBuiltinFunctionDecl(), I wasn't sure exactly how to use this to speed up the function. I can still add this if people think it will help. I did some profiling of a function with 100,000 __builtin_memcpy() calls and this patch doesn't seem to increase compile time.

I also did not add handling for return types. I wasn't able to figure out how to get the actual call expression which had the return type used in the code. I added a TODO for this.

theraven accepted this revision.Mar 20 2015, 12:01 AM
theraven edited edge metadata.

Looks good to me! Thanks for working on this.

This revision is now accepted and ready to land.Mar 20 2015, 12:01 AM