Page MenuHomePhabricator

[AIX][XCOFF]emit extern linkage for the llvm intrinsic symbol
Needs ReviewPublic

Authored by DiggerLin on Apr 27 2020, 7:47 AM.

Details

Summary
  1. when we call memset, memcopy,memmove etc(this are llvm intrinsic function) in the c source code. the llvm will generate IR

like call call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
for c source code
bash> cat test_memset.call

struct S{
 int a;
 int b;
};
extern struct  S s;
void bar() {
  memset(&s, s.b, s.b);
}

like

%struct.S = type { i32, i32 }
@s = external global %struct.S, align 4
; Function Attrs: noinline nounwind optnone
define void @bar() #0 {
entry:
  %0 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  %1 = trunc i32 %0 to i8
  %2 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
  call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
  ret void
}
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #1

If we want to let the aix as assembly compile pass without -u
it need to has following assembly code.
.extern .memset
(we do not output extern linkage for llvm instrinsic function.
even if we output the extern linkage for llvm intrinsic function, we should not out .extern llvm.memset.p0i8.i32,
instead of we should emit .extern memset)

  1. for other llvm buildin function floatdidf . even if we do not call these function floatdidf in the c source code(the generated IR also do not the call __floatdidf . the function call

was generated in the LLVM optimized.
the function is not in the functions list of Module, but we still need to emit extern .__floatdidf

The solution for it as :
We record all the lllvm intrinsic extern symbol when transformCallee(), and emit all these symbol in the AsmPrinter::doFinalization(Module &M)

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jasonliu added inline comments.May 5 2020, 8:25 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5319

If there exists a user-declared function whose name is the same as the
ExternalSymbol's, then we pick up the user-declared version.

How do we handle this scenario under the new logic? Are we still able to pick up the user-declared version without generating .extern for that function?

jasonliu added inline comments.May 5 2020, 8:57 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5319

Please consider the case in llvm/test/CodeGen/PowerPC/aix-user-defined-memcpy.ll

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1756

One issue with auto here is that it's not obvious that we have a pointer type (and even knowing how Sym is used does not rule out the possibility that it is merely convertible to the pointer type). Using auto * might not have caused the comment to be raised, but since the comment has been raised, please get rid of the auto.

DiggerLin updated this revision to Diff 262155.May 5 2020, 10:33 AM
DiggerLin marked an inline comment as done.

address comment

jhenderson added inline comments.Wed, May 6, 12:12 AM
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1756

By the way, there's a related style guide section on usage of auto: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable

llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
3

One more related request: Indent the FileCheck directive by an extra two spaces to make it obvious it's a continuation of the previous line:

; RUN: llc ... | \
; RUN:   FileCheck ...

Same below.

DiggerLin updated this revision to Diff 262379.Wed, May 6, 7:55 AM
DiggerLin marked 2 inline comments as done.
DiggerLin added inline comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1756

thanks for the information.

It's a bit awkward to use the emitInstruction function to record the special functions, as it is actually not doing anything related to "emit instruction".
But I'm not sure if there is any better way to do it if we want to hide the logic in PPCAsmPrinter instead of exposing it in target independent MCContext.
I will see what other people think.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5332

Could we have a lambda function to return a MCSymbolXCOFF with the correct name for it? Then replace the similar logic in getAIXFuncEntryPointSymbolSDNode with this lambda function call as well.
We should limit the places to do the "." + func_name dance.

๐Ÿ‘ Thanks for trying this out Digger, I think this approach is the right one to take. I've had an initial look and left a few comments inline. I'll need to spend some time understanding ExternalSymbolSDNodes a bit better before being able to review the patch thoroughly enough to approve.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
152

IIUC this isn't directly related to intrinsics, but to ExternalSymbolSD nodes, with the lowering of intrinsics being just one of the ways we can end up with an ExternalSymbolSDNodes. Intrinsic lowering might be the only way we expect to get one of these nodes on AIX right now but that can change in the future. The naming and comments need to reflect that this is explicitly for handling emitting the linkage of ExternalSymbolSDNodes.

1731

Choosing to implement this override and handle here instead of adding to the base classes emitInstruction makes this really nice and clean. Good job.

1738

What is the state of tail-calls on AIX? I know we can't tail call in many of the situations where we can on Linux, however if they aren't completely disabled we might need to add the tail-call opcodes here for similar transformation.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5320

I'm not too familiar with when we get ExternalSymbolSDNodes or what the semantic should be if we have an ExternalSymbolSDNode and a IR declaration for the same function, but it seems odd to treat the situation differently based on XCOFF vs ELF. @Xiangling_L Was aix-user-defined-memcpy.ll motivated by real code? If so I'd like to investigate it a bit more.

5331

Why not leave the name as is, and prepend the '.' when we create/lookup the symbol in the AsmPrinter?

jasonliu added inline comments.Wed, May 6, 9:05 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5320

It seems for real C code. If there is a user-provided memset/memcpy, clang would transform the call to that user-provided memset/memcpy instead of generating llvm.memset/memcpy intrinsic.

DiggerLin marked 3 inline comments as done.Wed, May 6, 1:01 PM
DiggerLin added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5331

if leave name as is, the branch will be
bl memset
not
bl .memset

if leave name as is , we want to generate bl .memset . we need to modify the operand(ExternalSymbol) of branch to generate correct bl .memset

DiggerLin marked 5 inline comments as done.Thu, May 7, 9:05 AM
DiggerLin updated this revision to Diff 262677.Thu, May 7, 9:09 AM

address comment

sfertile added inline comments.Fri, May 8, 9:04 AM
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
152

Sorry I wasn't more explicit in my previous comment. I don't think we should be using the name 'intrinsic' here at all.

/// Symbols lowered from ExternalSymbolSDNodes.
SmallPtrSet<MCSymbol *, 8> ExternalSymbols;
1739

We haven't implemented TLS on AIX yet, and I'm not sure which if any of these call pseudos will be used. How about adding them as a separate case with a report_fatal_error which can be revisited as part of TLS implementation on AIX.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5301

I'm don't think this is what Jason meant from his comment.

5331

Yes, which is why I said you would move where you prepend the '.' from here, to where you create the MCSymbol from the MO_ExternalSymbol in the ASM printer.

I think our handling of when there is deceleration in the IR matching the ExtrernalSymbol name is wrong, and it should either be done for all subtargets, or none (but I'm not 100% sure of that yet and investigating). If they are to be common-ed, then by moving modifying the symbols name too the ASMprinter, it lets us have identical code for all subtargets for transforming ExternalSymbolSDNode callees.

I'd say hold off on making this change until we better understand the IR Decl/ExternalSym situation though.

DiggerLin marked 8 inline comments as done.Mon, May 11, 8:10 AM
DiggerLin added inline comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
152

thanks

1739

I will delete the TLS related opcode.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5301

thanks.

5331

Discussed with Sean offline, I will keep the implement.

DiggerLin updated this revision to Diff 263179.Mon, May 11, 8:13 AM
DiggerLin marked 4 inline comments as done.

address comment

sfertile added inline comments.Fri, May 15, 9:22 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5331

Expanding a bit: I though we could create the MCSymbol in PPCAIXAsmPrinter::emitInstruction and then create a new MCSymbol MachineOperand and change the operand to that. Digger explained how I was mistaken and we can not change the operand on the MachineInstr, so we have to stick with this approach.

sfertile added inline comments.Thu, May 21, 10:25 AM
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
153

Not all intrinsics get handled this way, and not all of these symbols necessarily come from lowering an intrinsic: so we shouldn't be using 'Intrinsic' in the naming. I suggested 'ExternalSymbols' above but that sprobably too generic.

1738

You have added PPC::TAILB so I am assuming that means we do support tail-calls on AIX. Why do we not need to add any of the other tail call pseudos. At the very least i would expect we need PPC::TAILB8 for 64-bit support if we expect TAILB for 32-bit. What about TAILBA[8] or TAILBCTR[8]?

Can you also please update the testcase aix-user-defined-memcpy.ll to include:
CHECK-NOT: .extern .memcpy

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5320

Just as a record. As we discussed offline, a real C code is like:

int memcpy ( void * destination, int num ) { return 3; }
typedef struct cpy_array {
  int array[20];
} cpy;
int main() {
  cpy A = {1,2,3,4,5,6,7,8,9,10, 11, 12, 13, 14, 15,16};
  cpy B = A;
  return B.array[0];
}

And Clang FE will generate the following IR for AIX:

; Function Attrs: noinline nounwind optnone
define i32 @memcpy(i8* %destination, i32 %num) #0 {
entry:
  %destination.addr = alloca i8*, align 4
  %num.addr = alloca i32, align 4
  store i8* %destination, i8** %destination.addr, align 4
  store i32 %num, i32* %num.addr, align 4
  ret i32 3
}
; Function Attrs: noinline nounwind optnone
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %A = alloca %struct.cpy_array, align 4
  %B = alloca %struct.cpy_array, align 4
  store i32 0, i32* %retval, align 4
  %0 = bitcast %struct.cpy_array* %A to i8*
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %0, i8* align 4 bitcast (%struct.cpy_array* @__const.main.A to i8*), i32 80, i1 false)
  %1 = bitcast %struct.cpy_array* %B to i8*
  %2 = bitcast %struct.cpy_array* %A to i8*
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %1, i8* align 4 %2, i32 80, i1 false)
  %array = getelementptr inbounds %struct.cpy_array, %struct.cpy_array* %B, i32 0, i32 0
  %arrayidx = getelementptr inbounds [20 x i32], [20 x i32]* %array, i32 0, i32 0
  %3 = load i32, i32* %arrayidx, align 4
  ret i32 %3
}
llvm/test/CodeGen/PowerPC/aix-cc-byval-mem.ll
91

Though unrelated to this patch, I am wondering if we can replace & to something more meaningful like extsym etc. like what we have for MCSymbol as mcsymbol?

sfertile added inline comments.Mon, May 25, 10:48 AM
llvm/test/CodeGen/PowerPC/aix-cc-byval-mem.ll
91

Good suggestion, send a quick email to the llvm-dev list and see if there interest for this, or any objections.

DiggerLin marked 4 inline comments as done.Mon, May 25, 2:04 PM
DiggerLin added inline comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
153

thanks . change to "ExtSymSDNodeSymbols" as you discussed on the slack

1738

as discussed with Sean offline, since wo do not support tail call for the extern symbol now . I added a report_fatal_error("Tail call for extern symbol not yet implemented.");

DiggerLin updated this revision to Diff 266075.Mon, May 25, 2:06 PM
DiggerLin marked 2 inline comments as done.

address comment

jhenderson added inline comments.Wed, May 27, 12:55 AM
llvm/test/CodeGen/PowerPC/aix-user-defined-memcpy.ll
11โ€“12

I know this is what was done elsewhere in the test, but I don't think we should blindly copy bad practice here, and instead you should just reflow the line like I suggested in the other test. If you do think the llc line is too long and needs to broken up, I'd recommend the following:

; RUN: llc -verify-machineinstrs -mtriple powerpc-ibm-aix-xcoff \
; RUN:     -mcpu=pwr4 -mattr=-altivec < %s | \
; RUN:   FileCheck %s
DiggerLin updated this revision to Diff 266563.Wed, May 27, 8:54 AM

No more style/test comments from me now. I don't know enough about other things to further review this.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1739

Can we have an update on why the TLS call pseudos were simply removed instead of being tracked by report_fatal_error as @sfertile suggested? I think that leaving some breadcrumbs in the code would be useful.

1769

Please use the immediate base class name to refer to the base class implementation.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5276

Should the name be adjusted to match that of the function introduced by https://reviews.llvm.org/D80402? @jasonliu?

llvm/test/CodeGen/PowerPC/aix-cc-byval-mem.ll
91

@sfertile, do we have further news on this?

llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
9

We've been indenting options to line up with options on the previous line. That usually avoids indenting by only two spaces (which we may want to reserve for indentation after things like shell operators).

13

Same comment.

jasonliu added inline comments.Fri, May 29, 12:18 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5276

Yeah, I think matching them would be useful for grepping.

llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
43

Start with COMMON-LABEL with this line and omit the lines above. The existence of the function descriptor is not the purpose of this test. After this, there would be no difference between 32-bit and 64-bit and we should be able to replace the prefix with plain CHECK.

46

I think the object code generation path test needs to verify that the relocation refers to the correct symbol table entry.

52

I think we should similarly omit the lines in this block before this last one.

75

I think we can omit the symbol table entries aside from the one for .memset.

DiggerLin updated this revision to Diff 267645.Mon, Jun 1, 9:12 AM
DiggerLin marked 9 inline comments as done.

address comment

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5276

I do not think we can use MCSymbol *getFunctionEntryPointSymbol(const Function *F,const TargetMachine &TM) function , for the .memset or memcpy, we do not have Function object for it.

jasonliu added inline comments.Mon, Jun 1, 9:43 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
5276

I don't think you can actually call that function, as the function is in TLOF and you do not have access to that from here.
Hubert and I are talking about matching the name.

jhenderson added inline comments.Tue, Jun 2, 12:08 AM
llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

You can fold these two lines into one. No need for them to be separate:

; RUN: llvm-readobj --symbols --relocations --expand-relocs %t.o | \
; RUN:   FileCheck --check-prefix=SYMRELOC %s
DiggerLin updated this revision to Diff 267932.Tue, Jun 2, 10:38 AM
DiggerLin marked 2 inline comments as done.

address comment

llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

Thanks for your suggestion, but I think separating in two is based is more reasonable ,One test for test Symbol. other test for test relocation. it is more readable.

jhenderson added inline comments.Wed, Jun 3, 12:09 AM
llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

I'm not sure how it's more readable. It's very clear from my suggested command-line that we want to test the symbols and relocations (otherwise the options wouldn't be present), and would reduce the amount of work required to figure out which checks below correspond to which FileCheck line.

It would also be more in keeping with most tests I've worked with, and would reduce the number of executable invocations from 4 to 2, improving test run time.

DiggerLin marked 2 inline comments as done.Wed, Jun 3, 2:12 PM
DiggerLin added inline comments.
llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

I tried it with llvm-readobj --symbols --relocations --expand-relocs
it display the relocation information first and than symbol information.

in the test case we want to verify symbol first and than relocation.
I have to keep current test.

DiggerLin marked 2 inline comments as done.Wed, Jun 3, 2:18 PM
DiggerLin added inline comments.
llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

@hubert.reinterpretcast , what is your option on combine the

llvm-readobj --symbols

and

llvm-readobj --relocations --expand-relocs

into

llvm-readobj --symbols --relocations --expand-relocs

and than verify relocation information first as

; SYMRELOC: Relocation {{[{][[:space:]] *}}Virtual Address: 0x18
; SYMRELOC-NEXT: Symbol: .memset (0)
; SYMRELOC-NEXT: IsSigned: Yes
; SYMRELOC-NEXT: FixupBitValue: 0
; SYMRELOC-NEXT: Length: 26
; SYMRELOC-NEXT: Type: R_RBR (0x1A)
; SYMRELOC-NEXT: }

; SYMRELOC: Symbol {
; SYMRELOC-NEXT: Index: 0
; SYMRELOC-NEXT: Name: .memset
; SYMRELOC-NEXT: Value (RelocatableAddress): 0x0
; SYMRELOC-NEXT: Section: N_UNDEF
; SYMRELOC-NEXT: Type: 0x0
; SYMRELOC-NEXT: StorageCl
......

llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

I suggest using llvm-objdump -d --symbol-description for the relocation check. As it is, we can't tell whether the relocation we found is in the right place. This happens to make it so that we end up with two separate tool invocations.

33

I believe I requested CHECK-LABEL: here.

35

Please indent the instructions in relation to the label.

61

Moot point, but this should be aligned to match the closing brace.

jasonliu added inline comments.Wed, Jun 3, 8:28 PM
llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

I think you meant: llvm-objdump -d -r --symbol-description

jhenderson added inline comments.Thu, Jun 4, 1:55 AM
llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

It's moot, given the llvm-objdump suggestion, but I'm not sure it really matters which order symbols or relocations are printed in.

DiggerLin marked 6 inline comments as done.Thu, Jun 4, 7:23 AM
DiggerLin added inline comments.
llvm/test/CodeGen/PowerPC/aix-llvm-intrinsic.ll
10โ€“12

because there is symbol index in the relocation information. we prefer to verify the symbol first and then relocation. otherwise, when we verified relocation, we have to go down the test file to search the symbol information.

DiggerLin updated this revision to Diff 268466.Thu, Jun 4, 7:29 AM
DiggerLin marked an inline comment as done.

change test case based on comment