This is an archive of the discontinued LLVM Phabricator instance.

Allow creating llvm::Function in non-zero address spaces
ClosedPublic

Authored by arichardson on May 30 2018, 9:34 AM.

Details

Summary

Most users won't have to worry about this as all of the
'getOrInsertFunction' functions on Module will default to the program
address space.

An overload has been added to Function::Create to abstract away the
details for most callers.

This is based on https://reviews.llvm.org/D37054 but without the changes to
make passing a Module to Function::Create() mandatory. I have also added
some more tests and fixed the LLParser to accept call instructions for
types in the program address space.

Diff Detail

Repository
rL LLVM

Event Timeline

arichardson created this revision.May 30 2018, 9:34 AM
bjope added a subscriber: bjope.May 31 2018, 1:26 PM
nhaehnle removed a subscriber: nhaehnle.Jun 4 2018, 7:19 AM
arichardson added a reviewer: bjope.

Ping?

It would be really good if something like this could land soon as I'm currently in the process of doing a merge with upstream LLVM in our fork.
I just ran into the problem that always_inline wasn't working at -O0 for us due to functions being in AS0 but the calls in AS200 and the alwaysInliner only handles direct calls without an addressspacecast. This would be fixed if we could have functions in AS200.

Ping? I've applied this to our fork and it has improved inlining and helps us avoid bugs due to using the wrong address space. I would very much like to have this upstream.

bjope added inline comments.Jul 2 2018, 4:38 PM
lib/AsmParser/LLParser.cpp
1272 ↗(On Diff #149141)

This only works if we already have parsed the DataLayout string.

If we need to rely on having parsed the DataLayout string (or even relying on that there is one in the LL file), why should we then need to decorate functions with addrspace attributes instead of simply using the ProgramAddressSpace from the DataLayout?

Wasn't the usage of addrspace attributes on functions a way to avoid the dependecy to a potentially not yet parsed DataLayout inside LLParser (see RFC discussion here http://lists.llvm.org/pipermail/llvm-dev/2017-July/115388.html).

I think there are at least three alternatives:

  1. Make the DataLayout string with 'P' mandatory (early in LL file) when using non-zero program addrspace. Then maybe the simplest thing is to use M->getDataLayout().getProgramAddressSpace() in ParseFunctionHeader (line 5064) and skip the explicit addrspace attributes on function define/declare.
  1. Skip verifying (during parsing) that the addrspace matches for forward declaration of a function (at least when originating from a call). The define/declare then needs to replace/fixup the Function to use the correct address space. Although I'm not sure exactly how to do this.
  1. Make it possible to indicate addrspace of a function in the call statement. I guess we would need to come up with a new syntax for that.

I'm a little worried that the need to add addrspace attributes on function declarations will make it a bit more cumbersome to write test cases. That is ofcourse not a major problem. Alternative 3 will make it even more complicated. Alternative 1 is probably the simplest, and even if it does not address everything from the RFC, at least it will move us forward a little bit.

bjope added inline comments.Jul 3 2018, 9:07 AM
lib/AsmParser/LLParser.cpp
1272 ↗(On Diff #149141)

We can add at the following as a fourth altenative (as a variant of 1):

  1. Make the DataLayout string with 'P' mandatory (early in LL file or on the command line) when using non-zero program addrspace (at least until we find a better solution). But also require that addrspace is explicitly set on function definition/declarations. I think that makes alternative 4 what we currently have in this patch.
bjope added inline comments.Jul 4 2018, 5:21 AM
lib/AsmParser/LLParser.cpp
1256 ↗(On Diff #149141)

My comments in checkValidVariableType are related to a test case that looked something like this:

; RUN: llc -o /dev/null %s

define void @main() {
entry:
  %0 = tail call i32 @llvm.phx.aload(i16* null)
  ...
}

declare i32 @llvm.phx.aload(i16* nocapture) addrspace(40)

And when creating the fwd ref for @llvm.phx.aload.i32.p21i16 when parsing the call it will be given the default addrspace zero. As far as I can see that will happen here even if we have specified "Px" in the datalayout string. So I'm getting errors when parsing the declare since it will find the function from the forward reference having a different address space.

If I understand correctly this special handling of IsCall in checkValidVariableType would have accepted the missing addrspace notation in the call if it wasn't a fwd ref (e.g. if I had declared @llvm.phx.aload.i32.p21i16 before @main). But with the order above we create the function with addrspace zero here, and then when we get to the declare we get into trouble.

So even with alterntative (1) and (4) from the discussion in checkValidVariableType I think we would need to use the ProgramAddressSpace here when creating the Function. Or am I missing something in my analysis?

arichardson added inline comments.Jul 8 2018, 1:16 PM
lib/AsmParser/LLParser.cpp
1256 ↗(On Diff #149141)

Yes, that's exactly the issue that I ran into after applying the patch to our fork. I've currently added a horrible hack to always allow address space 200 declarations, but I'm trying to find a better solution.

I am tempted to use option 2 since we were previously using an addrspacecast in every function call and that broke stuff like always_inline at -O0. A new syntax for the addressspace could probably avoid that but I'm not sure if that is the best solution.

I wonder if just setting the address space for foward declaration address space to -1 and fixing it up once we have the data layout would work. I'll try this approach and see if it works.

Chose option 3) to handle forward-declarations.
I tried updating the address space later but that caused lots of problems.
Instead the call/invoke instruction can now accept an optional addrspace(N)
qualifier to indicate the address space of the called function.

arsenm added inline comments.Jul 9 2018, 9:07 AM
include/llvm/IR/Function.h
155–156 ↗(On Diff #154614)

Creating this looks like an entirely unrelated change that should be a separate patch

bjope added inline comments.Jul 9 2018, 10:56 AM
test/Bitcode/function-address-space-fwd-decl.ll
6 ↗(On Diff #154614)

Please call this @bar (or something) instead of @llvm.phx.aload.
(I should probably have done that in the example I gave you earlier, because I prefer if this doesn't match with the name of an intrinsic in our out-of-tree target).

Remove unrelated change and don't use llvm.phx.aload()

bjope added a comment.Jul 11 2018, 4:19 AM

This seems to work quite nice for me now! Nice!

I added a few inline nit comments. Apart from that it looks good to me.
Although, I don't know much about the bitcode reader/writer. Anyone else who can help out reviewing that part?

PS. I still have a few worries about how well this plays together with instrprof, we have had some hacks in that area earlier. Methods like appendToGlobalArray (in ModuleUtils.cpp) still uses PointerType::getUnqual(FnTy) to strip off the addrspace from function pointers (and afaik the size of a pointer can be different in different adress spaces).
Although, I don't think we can solve all problems right away. This patch at least gives us the power to set addrspace on functions. And that should help out if I for example want to reproduce instrprof problems on trunk (using an "experimental target" with non-zero program addrspace). So I think we can leave that for later.

lib/AsmParser/LLParser.cpp
1333 ↗(On Diff #154653)

nit: End code comment with a punctuation.

1521 ↗(On Diff #154653)

Fwiw, I created a ParseOptionalProgramAddrSpace method when adapting this to our out-of-tree target. And then I use that method when parsing the optional addrspace string when we know that it is a program address space (in define/declare/call/invoke).
One still needs to be explicit with addrspace when dealing with function pointers (in casts etc), but that is how we have been doing it in the past. And the use of datalayout ofcourse means that I need to make sure the datalayout is specified early in the ll-file.
Anyway, the above gave me the possibility to override the default value 0 (e.g. by using M->getDataLayout().getProgramAddressSpace()) for our target. We will at least use that as a temporary solution until we have adapted all our frontends (and test cases) to use the new syntax.

lib/IR/Function.cpp
224 ↗(On Diff #154653)

nit: punctuation

arichardson marked 2 inline comments as done.Jul 12 2018, 2:38 PM

This seems to work quite nice for me now! Nice!

I added a few inline nit comments. Apart from that it looks good to me.
Although, I don't know much about the bitcode reader/writer. Anyone else who can help out reviewing that part?

PS. I still have a few worries about how well this plays together with instrprof, we have had some hacks in that area earlier. Methods like appendToGlobalArray (in ModuleUtils.cpp) still uses PointerType::getUnqual(FnTy) to strip off the addrspace from function pointers (and afaik the size of a pointer can be different in different adress spaces).
Although, I don't think we can solve all problems right away. This patch at least gives us the power to set addrspace on functions. And that should help out if I for example want to reproduce instrprof problems on trunk (using an "experimental target" with non-zero program addrspace). So I think we can leave that for later.

In our fork I've deleted all uses of PointerType::getUnqual() from clang, and removed those in LLVM that have caused issues. I'm hoping to slowly upstream those patches and eventually remove PointerType::getUnqual().
I agree that those problems should be addressed by a later patch.

lib/AsmParser/LLParser.cpp
1521 ↗(On Diff #154653)

Thanks, that's a great suggestion! It should allow us to keep our existing tests without adding addrspace(200) to every call instr. I'll update the patch to use that instead.

arichardson marked 2 inline comments as done.

Add parseOptionalProgrammAddrSpace() as suggested
Update tests

uabelho added a subscriber: uabelho.Aug 5 2018, 9:56 PM
bjope accepted this revision.Aug 22 2018, 10:42 AM

LGTM!

As mentioned earlier there still are some PointerType::getUnqual calls (mainly related to sanitizers and profiling?) that might need attention in the future. I don't see that as a stopper for this patch, specially since this shouldn't impact targets where function address space is zero (except that there will be an extra field for function address space in the bitcode). And I think it will be easier to do future fixes related to program address space when we have this basic framework in place.

This revision is now accepted and ready to land.Aug 22 2018, 10:42 AM
This revision was automatically updated to reflect the committed changes.