Require address space to be specified when creating functions (2/3)
AbandonedPublic

Authored by dylanmckay on Aug 23 2017, 1:57 AM.

Details

Summary

This makes it necessary to specify address spaces when creating new
functions.

Most users won't have to worry about this as all of the
'getOrInsertFunction' functions on Module will default to the program
address space.

An overload has been added to Function::Create to abstract away the
details for most callers.

Full patchset

  • Add program address space to data layout D37052
  • Require address space to be specified when creating functions D37054
  • [clang] Require address space to be specified when creating functions D37057

Diff Detail

dylanmckay created this revision.Aug 23 2017, 1:57 AM
dylanmckay retitled this revision from Require address space to be specified when creating functions to Require address space to be specified when creating functions (3/4).

Fix a bug where we would attempt to bitcast all AVR function pointers

Don't check address space when looking up global values by name

If we are just looking up @foobar, we shouldn't have to specify the address space.

I believe I also need to update the docs to mention the new bitcode field

  • Remove the switch table stuff for a later patch
  • Rebased on top of trunk
dylanmckay retitled this revision from Require address space to be specified when creating functions (3/4) to Require address space to be specified when creating functions (2/3).Dec 6 2017, 10:12 PM
dylanmckay edited the summary of this revision. (Show Details)
dylanmckay updated this revision to Diff 126268.Dec 9 2017, 1:11 AM

Move GlobalValue::getAddressSpace() from previous patch into here

I've applied your patches to my downstream repo, trying to figure out how much of our own patches for having different address space for functions that can be removed and still getting existing test cases to work when only having your patches.
Lots of new things happens now when there are addrspace on all function definitions. Our old implementation kind of added the addrspace information when fetching function pointers from the symbol table instead of adding addrspace when creating the functions. So I still have a couple of issues to analyse. I think that I at least tracked down one problem with calls to a function given a local function pointer variable (see my inline comment in LLParser.cpp).

There are no testcases included here with ll-files including calls/bitcasts etc with non-zero addrspace, so it is a little bit tricky to understand how far I'm supposed to get by only using your set of patches.
Are you testing this with some other out-of-tree target?

lib/AsmParser/LLParser.cpp
2691

I think we need the same patch as in GetGlobalVal here (line 1228).
I have a test case with function pointers that did not work otherwise. Here is an extract of the IR:

define void @f2(i16 %n, void (i16) addrspace(40)* %f) addrspace(40) #0 !dbg !16 {
entry:
  %n.addr = alloca i16, align 1
  %f.addr = alloca void (i16) addrspace(40)*, align 1
  %gap = alloca i16, align 1
  store i16 %n, i16* %n.addr, align 1, !tbaa !9
  store void (i16) addrspace(40)* %f, void (i16) addrspace(40)** %f.addr, align 1, !tbaa !17
  %0 = bitcast i16* %gap to i8*, !dbg !19
  call void @llvm.lifetime.start.p0i8(i64 1, i8* %0) #3, !dbg !19
  %1 = load i16, i16* %n.addr, align 1, !dbg !20, !tbaa !9
  %div = sdiv i16 %1, 2, !dbg !21
  store i16 %div, i16* %gap, align 1, !dbg !22, !tbaa !9
  br label %for.cond, !dbg !23

for.cond:                                         ; preds = %for.inc, %entry
  %2 = load i16, i16* %gap, align 1, !dbg !24, !tbaa !9
  %cmp = icmp sgt i16 %2, 0, !dbg !25
  br i1 %cmp, label %for.body, label %for.end, !dbg !26

for.body:                                         ; preds = %for.cond
  %3 = load void (i16) addrspace(40)*, void (i16) addrspace(40)** %f.addr, align 1, !dbg !27, !tbaa !17
  call void %3(i16 8), !dbg !28
  br label %for.inc, !dbg !29

And I got this error (unless I added the same patch as in GetGlobalVal):

opt: ./func_ptr.ll:43:13: error: '%3' defined with type 'void (i16) addrspace(40)*'
  call void %3(i16 8), !dbg !28
            ^
bjope added a comment.Feb 8 2018, 3:25 AM

I've applied your patches to my downstream repo, trying to figure out how much of our own patches for having different address space for functions that can be removed and still getting existing test cases to work when only having your patches.
Lots of new things happens now when there are addrspace on all function definitions. Our old implementation kind of added the addrspace information when fetching function pointers from the symbol table instead of adding addrspace when creating the functions. So I still have a couple of issues to analyse. I think that I at least tracked down one problem with calls to a function given a local function pointer variable (see my inline comment in LLParser.cpp).

There are no testcases included here with ll-files including calls/bitcasts etc with non-zero addrspace, so it is a little bit tricky to understand how far I'm supposed to get by only using your set of patches.
Are you testing this with some other out-of-tree target?

ping!

bjope added a comment.Feb 20 2018, 1:14 AM

I've applied your patches to my downstream repo, trying to figure out how much of our own patches for having different address space for functions that can be removed and still getting existing test cases to work when only having your patches.
Lots of new things happens now when there are addrspace on all function definitions. Our old implementation kind of added the addrspace information when fetching function pointers from the symbol table instead of adding addrspace when creating the functions. So I still have a couple of issues to analyse. I think that I at least tracked down one problem with calls to a function given a local function pointer variable (see my inline comment in LLParser.cpp).

There are no testcases included here with ll-files including calls/bitcasts etc with non-zero addrspace, so it is a little bit tricky to understand how far I'm supposed to get by only using your set of patches.
Are you testing this with some other out-of-tree target?

ping!

I see that part 1/3 has been landed now. As I mentioned above this second part at least breaks things for our out-of-tree target.
I've not analysed this any more since I discovered the problems in LLParser in December. I'm still kind of waiting for some response to understand more about the status of this patch (is it tested? is it supposed to work? do you have some test cases?).

Looks good once the IR parser accepts calls to non-AS0 pointers.

lib/AsmParser/LLParser.cpp
1254

I believe loading a global in a different address space should still be an error, only calling is fine.

2691

Yes, we run into the same issue with CHERI and in our fork I added a hack to simply accept any pointer in AS200.

However, I believe this should only be okay for call instructions since we really don't want any implicit conversions from AS200 to AS0 pointers.

nhaehnle removed a subscriber: nhaehnle.Feb 21 2018, 9:16 AM

@bjope does D43645 fix the LLParser issues for you?

bjope added a comment.EditedFeb 23 2018, 7:55 AM

@bjope does D43645 fix the LLParser issues for you?

Thanks for the feedback @arichardson.
The D43645 solves some problems. I will do some more analysis of the problems I see in our local test suite (and hopefully I can create similar patches).

One problem I've seen is related to globals with forward reference of a function pointer (although I've not tested it with a clean top-of-tree yet):

%fun1 = type i16 (i16)
%funptr1 = type %fun1 addrspace(40)*

@fun_ptr = global %funptr1 @fun

define i16 @fun(i16 %arg) addrspace(40) {
  ret i16 %arg
}

I think that we should replace

PointerType *PFT = PointerType::getUnqual(FT);

by

PointerType *PFT = PointerType::get(FT, AddrSpace);

in LLParser::ParseFunctionHeader. That seems to solve the above problem at least.

Another problem I have is related to the declare construct. The test case is doing something like this:

define %int4 @fn1 () addrspace(40) {
...
%_tmp13 = call %int4 (%ptr6, %int4) bitcast ( %int4 (...)addrspace(40)* @fn2 to %int4 (%ptr6, %int4)addrspace(40)*)(%ptr6 %_tmp9, %int4 %_tmp12)
...
}

declare %int4 @fn2 (...);

and I get

error: constant expression type mismatch
%_tmp13 = call %int4 (%ptr6, %int4) bitcast ( %int4 (...)addrspace(40)* @fn2 to %int4 (%ptr6, %int4)addrspace(40)*)(%ptr6 %_tmp9, %int4 %_tmp12)
                                    ^

Maybe there should be an addrspace attribute on the declare statement? At the moment it does not help, because LLParser does not parse it.

Thanks for working on this, it will make life a lot easier for us when it lands.

lib/AsmParser/LLParser.cpp
5051

As @bjope says this should use PointerType::get(FT, AddrSpace)

In our fork (https://github.com/CTSRD-CHERI/llvm / https://github.com/CTSRD-CHERI/clang) I have removed all uses of PointerType::getUnqual() from clang and most of them from LLVM since they almost always cause issues for us.

There are no testcases included here with ll-files including calls/bitcasts etc with non-zero addrspace, so it is a little bit tricky to understand how far I'm supposed to get by only using your set of patches.

I've not analysed this any more since I discovered the problems in LLParser in December. I'm still kind of waiting for some response to understand more about the status of this patch (is it tested? is it supposed to work? do you have some test cases?).

I will work on adding some more test cases @bjope, sorry for being so slow with this!

@arichardson

Thanks a lot for looking into and submitting D43645! I will happily review it

@bjope

One problem I've seen is related to globals with forward reference of a function pointer (although I've not tested it with a clean top-of-tree yet)
..
Maybe there should be an addrspace attribute on the declare statement? At the moment it does not help, because LLParser does not parse it.

Your writeup and suggested change SGTM, I will fix the patch up and resubmit

Gentle rebase

arichardson added inline comments.Feb 27 2018, 5:38 AM
docs/LangRef.rst
772

Just wondering if it would be better to put functions without an explicit address space in the program address space?
I think this may be more consistent since creating a new FunctionType without an address space will put it in the program address space and not address space zero.
I am not sure if there are any backwards compatibility constraints here since IR generated by an older version will never have the program address space in the datalayout string so they should also end up in address space zero.

Putting everything in the program address space should make it easier for us to keep the existing IR testcases once we expect all functions to be in address space 200.

bjope added inline comments.Feb 27 2018, 9:50 AM
docs/LangRef.rst
772

That sounds interesting. I'll probably try to do that for our out-of-tree target. Because we currently have two different frontends (one based on clang). With this commit the clang frontend will start to add explicit address spaces on functions, but it will take some time for me to adapt the other frontend to do the same in all situations.

The result would be quite similar to what we have out-of-tree today. We have patched both frontends to set a non-zero program address space in the DataLayout. But we do not set the address space explicitly on function declarations.

Without having the AS explicit when parsing the LL file we need to parse the DataLayout string first, to know about the targets default program address space when parsing the functions declarations. I think that is one problem those explicit address space annotations was supposed to help out with. So I'm not sure if it is feasible for the upstream LLVM trunk. However, normally the DataLayout is in the beginning of the .ll-file (no idea about .bc).

dylanmckay updated this revision to Diff 136733.Mar 2 2018, 6:27 AM
dylanmckay marked 4 inline comments as done.

Rebase and remove cherry-picked function now in trunk

LGTM but I guess someone else should approve

lib/AsmParser/LLParser.cpp
1244

This line can be removed since isValidVariableType() does that check too

bjope added a comment.Mar 2 2018, 10:10 AM

Another problem I have is related to the declare construct. The test case is doing something like this:

define %int4 @fn1 () addrspace(40) {
...
%_tmp13 = call %int4 (%ptr6, %int4) bitcast ( %int4 (...)addrspace(40)* @fn2 to %int4 (%ptr6, %int4)addrspace(40)*)(%ptr6 %_tmp9, %int4 %_tmp12)
...
}

declare %int4 @fn2 (...);

and I get

error: constant expression type mismatch
%_tmp13 = call %int4 (%ptr6, %int4) bitcast ( %int4 (...)addrspace(40)* @fn2 to %int4 (%ptr6, %int4)addrspace(40)*)(%ptr6 %_tmp9, %int4 %_tmp12)
                                    ^

Maybe there should be an addrspace attribute on the declare statement? At the moment it does not help, because LLParser does not parse it.

What about this? How is it supposed to work? Is it supposed to work? Or is it a question for the future?
(for me it will be a question for the present, because I think our out-of-tree tests starts to fail if I merge this patch...)

I still think there are lots of test cases that should be added to verify scenarios involving functions with non-zero address space.
The only new test case, test/Bitcode/function-nonzero-address-spaces.ll, verifies that it is possible to have a function declaration with the address space attribute.
Maybe that is all that is expected to work with this patch, or should I expect that I can have a more complete test case where I also call that function, or saves the function pointer in a variable, etc?

arichardson added inline comments.Apr 6 2018, 6:42 AM
test/Bitcode/function-nonzero-address-spaces.ll
6

Could you extend this test to also check that declare void @foo() addrspace(3) works?

I've extracted a patch without the Module* -> Module& changes in D47541. It would be really good if something like this could land soon since I just ran into the problem that always_inline wasn't working at -O0 for us due to functions being in AS0 but the calls in AS200 and the alwaysInliner only handles direct calls without an addressspacecast.

dylanmckay abandoned this revision.Jul 14 2018, 8:55 AM
dylanmckay added inline comments.
lib/AsmParser/LLParser.cpp
5051

It looks like there are quite a few getUnqual calls around the codebase which are likely to lose information.

A quick grep shows <~80

I wonder if it's worth refactoring the getUnqual method to make it harder to misuse.

Ignore the review-related comments, they were leftover draft messages from a while back.