This is an archive of the discontinued LLVM Phabricator instance.

CodeGen: Cast alloca to expected address space
ClosedPublic

Authored by yaxunl on Apr 19 2017, 3:00 PM.

Details

Summary

Alloca always returns a pointer in alloca address space, which may
be different from the type defined by the language. For example,
in C++ the auto variables are in the default address space. Therefore
cast alloca to the expected address space when necessary.

Depends on https://reviews.llvm.org/D32977

Diff Detail

Event Timeline

yaxunl created this revision.Apr 19 2017, 3:00 PM
t-tye added inline comments.Apr 19 2017, 3:56 PM
lib/CodeGen/CGDecl.cpp
1105–1119

Is any assert done to ensure that it is legal to address space cast from variable address space to expected address space? Presumably the language, by definition, will only be causing legal casts. For example from alloca address space to generic (which includes the alloca address space).

For OpenCL, can you explain how the local variable can have the constant address space and use an alloca for allocation? Wouldn't a constant address space mean it was static and so should not be using alloca? And if it is using an alloca, how can it then be accessed as if it was in constant address space?

lib/CodeGen/CodeGenTypes.h
200 ↗(On Diff #95828)

Should the name reflect that the type returned is not the variable type, but a pointer to the variable type? For example, getVariablePointerType or getPointerToVariableType.

yaxunl added inline comments.Apr 19 2017, 9:38 PM
lib/CodeGen/CGDecl.cpp
1105–1119

If the auto var has address space qualifier specified through __attribute__((address_space(n))), there is not much we can check in clang since it is target dependent. We will just emit address space cast when necessary and let the backend check the validity of the address space cast.

Otherwise, for OpenCL, we can assert the expected address space is default (for OpenCL default address space in AST represents private address space in source language) or constant. For other languages we can assert the expected address space qualifier is default (no address space qualifier). It is not convenient to further check whether the emitted LLVM address space cast instruction is valid since it requires target specific information, therefore such check is better deferred to the backend.

For OpenCL, currently automatic variable in constant address space is emitted in private address space. For example, currently Clang does not diagnose the following code

void f(global int* a) {
  global int* constant p = a;
}

Instead, it emits alloca for p, essentially treats it as global int* const p. This seems to be a bug to me (or maybe we can call it a feature? since there seems no better way to translate this to LLVM IR, or simply diagnose this as an error). However, this is better addressed by another patch.

lib/CodeGen/CodeGenTypes.h
200 ↗(On Diff #95828)

I think getPointerToVariableType is better. I will change it.

rjmccall added inline comments.Apr 19 2017, 10:40 PM
lib/CodeGen/CGDecl.cpp
1118

This is a lot of work to be doing in a pretty common routine for the benefit of one unusual target. Maybe there's some way to fast-path it, like you can cache the AST address space of the stack on the CGM (assuming it's always got its own AST address space, I guess) and just check whether Ty.getAddressSpace() is different from that.

Also, I feel like general routines in IRGen should never be calling CreateAddrSpaceCast; they should always be calling getTargetCodeGenInfo().performAddrSpaceCast instead, just in case the target has some more specific idea about how the conversion should be done. I know we're not consistent about that right now, but it's something to aim for.

Also, I agree with Tony Tye: if it's an intended feature that you can declare locals in the constant address space, you need to figure out what the allocation semantics are for that and emit the appropriate code rather than just hand-waving it here after the fact.

lib/CodeGen/CodeGenTypes.h
200 ↗(On Diff #95828)

The only time you use this function is to immediately call getAddressSpace on it. Seems to me you should just call getTargetAddressSpace(D.getType()).

t-tye added a subscriber: b-sumner.Apr 20 2017, 8:42 AM
yaxunl added inline comments.
lib/CodeGen/CGDecl.cpp
1105–1119

Hi Anastasia,

Any comments about the automatic variable in constant address space? Thanks.

Anastasia added inline comments.Apr 24 2017, 7:11 AM
lib/CodeGen/CGDecl.cpp
1105–1119

From the spec s6.5.3 it feels like we should follow the same implementation path in Clang for constant AS inside kernel function as local AS. Because constant AS objects are essentially global objects.

Although, we didn't have any issues up to now because const just does the trick of optimising the objects out eventually. I am not clear if this creates any issue now with your allocation change. It feels though that it should probably work fine just as is?

Anastasia added inline comments.Apr 24 2017, 7:25 AM
lib/CodeGen/CGDecl.cpp
1115

Do you need to check this? I thought your change doesn't affect OpenCL because alloca AS == private AS in OpenCL and constant AS local objects are treated as private const objects anyways.

test/CodeGen/address-space.c
3

CHeCK -> CHECK

test/CodeGenCXX/amdgcn-automatic-variable.cpp
26

I am wondering if all these different test cases are really needed. Are we testing any different program paths from the patch?

Also would it make sense to add a test case with an object in an AS different to 0 (i.e. with __attribute__((address_space(n))))

rjmccall added inline comments.Apr 24 2017, 11:29 AM
lib/CodeGen/CGDecl.cpp
1105–1119

If these __constant locals are required to be const (or are implicitly const?) and have constant initializers, it seems to me the implementation obviously intended by the spec is that you allocate them statically in the constant address space. It's likely that just implicitly making the variable static in Sema, the same way you make it implicitly const, will make IRGen and various other parts of the compiler just do the right thing.

yaxunl added inline comments.Apr 24 2017, 10:07 PM
lib/CodeGen/CGDecl.cpp
1105–1119

My patch does not change the current behaviour of Clang regarding function-scope variable in constant address space. Basically there is no issue if there is no address taken. However, if there is address taken, there will be assertion due to casting private pointer to constant address space. I think probably it is better to emit function-scope variable in constant address space as global variable in constant address space instead of alloca, as John suggested.

1105–1119

I agree function-scope variable in constant address space should be emitted as global variable in constant address space instead of alloca. However, in OpenCL 1.2 section 6.8, "The static storage-class specifier can only be used for non-kernel functions and global variables declared in program scope." Currently Clang diagnoses function-scope variables with static storage class for OpenCL 1.2. Therefore we cannot make function-scope variable in constant address space have static storage class. However, I think we can still emit function-scope variable in constant address space as global variable in CodeGen.

rjmccall added inline comments.Apr 24 2017, 10:31 PM
lib/CodeGen/CGDecl.cpp
1105–1119

There's precedent for treating something as implicitly static in the AST. thread_local variables in C++ are considered implicitly static.

You'll need to change VarDecl::hasLocalStorage() and isStaticLocal(), and you should look around for other uses of getTSCSpec() that look like they're implementing the same logic.

Anastasia added inline comments.Apr 25 2017, 7:40 AM
lib/CodeGen/CGDecl.cpp
1105–1119

I belive we already do similar logic for stack variables with __local AS. Because they are emmited as static globals. There should be no contradictions with spec to do the same for __constant AS objects. They are essentially intended to be located in a "special" memory sections and not on a stack. Also constant AS local variable can only be declared in the kernel functions.

yaxunl edited the summary of this revision. (Show Details)May 8 2017, 1:20 PM
yaxunl added inline comments.
lib/CodeGen/CGDecl.cpp
1105–1119

I created another patch for this issue: https://reviews.llvm.org/D32977

t-tye added inline comments.May 8 2017, 2:12 PM
lib/CodeGen/CGDecl.cpp
1115

Given patch D32977 that makes function local variables marked with the constant address space treated as static, will they ever be processed by this function now? If so then the extra test for opencl_constant would not be needed.

yaxunl marked 18 inline comments as done.May 8 2017, 3:13 PM
yaxunl added inline comments.
lib/CodeGen/CGDecl.cpp
1115

Will remove this.

test/CodeGenCXX/amdgcn-automatic-variable.cpp
26

I think at least I should cover the typical use cases of auto var.

I will add a test for attribute((address_space(n))))

yaxunl marked 2 inline comments as done.May 8 2017, 9:04 PM
yaxunl added inline comments.
test/CodeGenCXX/amdgcn-automatic-variable.cpp
26

Sorry. I just checked that C++ does not allow attribute((address_space(n))) on automatic var.

yaxunl updated this revision to Diff 98314.May 9 2017, 9:53 AM

Revised by reviewers' comments.

rjmccall added inline comments.May 10 2017, 12:00 PM
lib/CodeGen/TargetInfo.cpp
7291

How about, instead of introducing a second method, we just change performAddrSpaceCast to take two AST address spaces and a flag indicating whether the address is known to be non-null? Does your target have an AST-level address space for the stack?

yaxunl added inline comments.May 11 2017, 8:06 AM
lib/CodeGen/TargetInfo.cpp
7291

In both AST and LLVM, the destination pointee type of address space cast may be different from the source pointee type (e.g. addrspacecast i32 addrspace(1)* %a to i8 addrspace(4)* is valid LLVM instruction), so performAddrSpaceCast needs to know the destination QualType, not just the address space.

rjmccall added inline comments.May 11 2017, 10:48 AM
lib/CodeGen/TargetInfo.cpp
7291

Is there any harm to generating separate cast instructions when this happens?

yaxunl added inline comments.May 11 2017, 2:01 PM
lib/CodeGen/TargetInfo.cpp
7291

They should be equivalent but the LLVM IR will be more verbose and take more space, and LLVM passes will take more time since previously they only need to process one instruction but now they need to process two instructions.

The original representation is more concise, why should we change that to emit two instructions instead of one instruction?

rjmccall added inline comments.May 11 2017, 4:32 PM
lib/CodeGen/TargetInfo.cpp
7291

It's a cleaner API.

But fine, whatever, feel free to pass down an optional llvm::PointerType* for the destination type.

yaxunl updated this revision to Diff 99014.May 15 2017, 9:22 AM

Removed adjustAddrSpaceForAutoVar and use performAddrSpaceCast instead.

t-tye added inline comments.May 15 2017, 11:17 AM
lib/CodeGen/CGDecl.cpp
1112

Should allowing specifying an address space on a function local automatic variable in OpenCL be allowed? It seems generating LLVM IR that allocates the variable in the alloca address space then address space casting the pointer to some other address space is not what such a language feature is requesting.

It seems it is really requesting that the variable is allocated in a different address space. That could be indicated by putting a different address space on the alloca itself, but the builder only allows an alloca to use the default alloca address space. No target supports this ability.

So maybe the best solution is to make OpenCL be the same as C and not allow an address space qualifier on function scope automatic variables.

If that is done then this assert will only allow the language default address space.

yaxunl marked 2 inline comments as done.May 15 2017, 11:23 AM
yaxunl added inline comments.
lib/CodeGen/CGDecl.cpp
1112

Agree. I think that is why C++ forbids __attribute__((address_space(n)) on automatic variables. I will make it apply to all languages.

rjmccall added inline comments.May 15 2017, 12:40 PM
lib/CodeGen/CGDecl.cpp
1112

Yeah, the embedded C specification (ISO/IEC TR 18037) says "an address space name cannot be used to qualify an object that has automatic storage duration", so this restriction should be applied regardless of language mode. We can deal with the possibility of auto variables in other address spaces when someone wants to add that capability.

(OpenCL's __constant is an acceptable exception because it implicitly makes the variable static.)

1116

Hmm. I would prefer if we could avoid doing this in the fast path. Suggestion:

  • Please add a variable (ASTAllocaAddressSpace) to CodeGenTypeCache that stores the AST address space of the stack. You can add a function to CodeGenTargetInfo to initialize this; in most targets, it will just return 0, but your target can make it ~0U or something, anything works as long as it doesn't incorrectly match an actual target address space. (But you should really consider adding an actual target-specific address space for the stack! It'll probably be useful in builtin code or something eventually.)
  • Please add an assert that T.getAddressSpace() == 0, and then make this call conditional on whether ASTAllocaAddressSpace is non-zero.

You can just write address.getElementType() instead of address.getPointer()->getType()->getPointerElementType().

Are you intentionally leaving emission.Addr as the unconverted pointer? That will probably mess everything else downstream that uses 'emission', and the only reason you're not seeing that is that OpenCL probably doesn't actually have anything downstream that uses 'emission' — destructors or whatever else.

Please add an inline comment like

, /*non-null*/ true);
lib/CodeGen/TargetInfo.cpp
7294

Spurious newline?

yaxunl updated this revision to Diff 99056.May 15 2017, 1:22 PM
yaxunl marked an inline comment as done.

diagnose automatic var with invalid addr space for OpenCL.

rjmccall added inline comments.May 15 2017, 1:44 PM
lib/Sema/SemaDecl.cpp
7210 ↗(On Diff #99056)

err_opencl_function_variable seems like a better diagnostic, at least for opencl_global. You can fall back on this more general diagnostic for other address spaces.

I think you can just assert here that T.getAddressSpace() != LangAS::opencl_constant instead of checking it.

Thinking about it, it feels like opencl_local should be handled the same way: they shouldn't get here because they don't really have automatic storage duration. They're much more like a thread-local than an actual local. However, we can tackle that some other time.

yaxunl updated this revision to Diff 99061.May 15 2017, 2:21 PM

Fix diagnostic for global addr

rjmccall edited edge metadata.May 15 2017, 2:37 PM

Looking good. We had some overlapping patches and comments there, so I want to make sure you didn't miss my comment on EmitAutoVarAlloca.

lib/Sema/SemaDecl.cpp
7205 ↗(On Diff #99061)

LLVM code style doesn't put a space after assert.

yaxunl updated this revision to Diff 99094.May 15 2017, 7:21 PM

Fix format.

yaxunl added inline comments.May 15 2017, 7:37 PM
lib/CodeGen/CGDecl.cpp
1116

Sorry I missed this comment. I will do the changes.

About emission.Addr, I leave the unconverted pointer intentionally. I need this to work for both C++ and OpenCL. I checked the usage of emission.Addr and they seem to be only load or store, so I think it is safe to use the unconverted pointer. Now you mention destructor. I think I need to change that and I am going to add a lit test also.

yaxunl updated this revision to Diff 99099.May 15 2017, 9:22 PM

Revised by John's comments.

rjmccall added inline comments.May 16 2017, 12:11 AM
lib/CodeGen/CGDecl.cpp
1117

A lot of this line can be address.getElementType().

lib/CodeGen/CodeGenTypes.cpp
95

Did you intend to change this file?

lib/CodeGen/TargetInfo.cpp
7296

Can we rename LangAS::Count to something more meaningful like LangAS::FirstTargetAddressSpace? I think this would clarify a lot of code.

Is the OpenCL special case in ASTContext::getTargetAddressSpace still correct with this patch? A pointer in LangAS::Default should be lowered as a generic pointer instead of a pointer into the alloca address space, right?

yaxunl marked 13 inline comments as done.May 16 2017, 7:48 AM
yaxunl added inline comments.
lib/CodeGen/TargetInfo.cpp
7296

Will do.

The OpenCL special case in ASTContext::getTargetAddressSpace is still correct with this patch. In OpenCL private address space is still represented in AST by LangAS::Default, so a pointer in LangAS::Default should be lowered as a pointer to alloca address space.

yaxunl updated this revision to Diff 99149.May 16 2017, 8:07 AM
yaxunl marked an inline comment as done.

Revised by John's comments.

rjmccall added inline comments.May 16 2017, 11:06 AM
lib/CodeGen/TargetInfo.cpp
7296

Then I don't understand. If __private is always the alloca address space, why does IRGen have to add addrspace casts after allocating local variables in it? IRGen's invariant is that the value stored in LocalDeclMap for a variable of AST type T is a value of type [[T]] addrspace([[AS]]) *, where [[T]] is the lowered IR type for T and [[AS]] is the target address space of the variable. For auto variables, AS is always LangAS::Default, which according to you means that [[AS]] is always LLVM's notion of the alloca address space, which is exactly the type produced by alloca. So why is there ever a cast?

My understanding was that, on your target, private is (at least sometimes) a 64-bit extension of the alloca address space, which is 32-bit. But that makes private a different address space from the alloca address space, so the special case in getTargetAddressSpace is still wrong, because that special case means that a pointer-to-__private type will be lowered to be a 32-bit pointer type.

Also, I'm not sure why the normal AddrSpaceMap mechanism isn't sufficient for mapping LangAS::Default.

yaxunl added inline comments.May 16 2017, 11:28 AM
lib/CodeGen/TargetInfo.cpp
7296

This work is mostly for C++.

For OpenCL, the default address space is mapped to alloca address space (5), there is no address space cast inserted.

For C++, the default address space is mapped to generic address space (0), therefore the address space cast is needed.

rjmccall added inline comments.May 16 2017, 11:29 AM
lib/CodeGen/TargetInfo.cpp
7296

Wait, I think I might understand. You're saying that, when you're compiling OpenCL, you want __private to just be the normal 32-bit address space, but when you're compiling other languages, you want LangAS::Default to be this 64-bit extension, probably because those other languages don't break things down in address spaces as consistently as OpenCL and so it's more important for LangAS::Default to be able to reach across address spaces.

I do not feel that it's correct for this to be hard-coded in ASTContext; it really seems like a target-specific policy that should just be reflected in AddrSpaceMap. That does mean that you'll have to find some way of informing your TargetInfo that it's building OpenCL. The normal design for something like that would be a target option that the driver would automatically set when it recognized that it was building OpenCL; or you could change getAddressSpaceMap to take a LangOptions.

Anastasia added inline comments.May 16 2017, 12:08 PM
lib/CodeGen/CodeGenTypes.cpp
95

Why this change?

lib/CodeGen/TargetInfo.cpp
411

Seems like some parameters are unused.

lib/Sema/SemaDecl.cpp
7206 ↗(On Diff #99149)

I think it was nicer that all OpenCL related changes were grouped in one spot under the same if statement. Even though it isn't quite done everywhere. But this change makes them even more scattered.

test/CodeGenCXX/amdgcn-automatic-variable.cpp
26

I thought address space was undocumented extension. So essentially you can't have a cast to any address space but a default here? Would it work for the C instead to add attribute((address_space(n)))?

test/SemaOpenCL/storageclass-cl20.cl
16 ↗(On Diff #99149)

I am guessing generic should be rejected here as well... although it should only apply to pointers. But it seems we don't handle this properly.

rjmccall added inline comments.May 16 2017, 12:17 PM
lib/CodeGen/TargetInfo.cpp
411

This is just the default implementation. The idea is that targets that need to do something more complex on a particular conversion — e.g. to make sure that null pointers are translated correctly when they have different bit-patterns — can easily do so.

yaxunl marked 14 inline comments as done.May 16 2017, 2:31 PM
yaxunl added inline comments.
lib/CodeGen/CodeGenTypes.cpp
95

Will undo it.

lib/CodeGen/TargetInfo.cpp
7294

will remove it.

lib/Sema/SemaDecl.cpp
7206 ↗(On Diff #99149)

Will move them down.

test/CodeGenCXX/amdgcn-automatic-variable.cpp
26

As John pointed out, this due to the embedded C specification (ISO/IEC TR 18037). Clang applies it to all languages except OpenCL.

test/SemaOpenCL/storageclass-cl20.cl
16 ↗(On Diff #99149)

will add a line for generic.

yaxunl updated this revision to Diff 99204.May 16 2017, 2:34 PM
yaxunl marked 4 inline comments as done.

Removed special handling of OpenCL from getTargetAddressSpace.

yaxunl added inline comments.May 16 2017, 2:36 PM
lib/CodeGen/TargetInfo.cpp
7294

Sorry this line is needed to separate it from the next function.

rjmccall accepted this revision.May 16 2017, 5:41 PM

Looks great, thank you for being patient.

This revision is now accepted and ready to land.May 16 2017, 5:41 PM

Any further comments? Thanks.

This revision was automatically updated to reflect the committed changes.
tstellar added inline comments.
cfe/trunk/lib/Basic/Targets.cpp
2037–2038 ↗(On Diff #99479)

Shouldn't this be 0 ?