This is an archive of the discontinued LLVM Phabricator instance.

[AIX] Implement caller arguments passed in stack memory
ClosedPublic

Authored by cebowleratibm on Jan 22 2020, 9:04 AM.

Details

Summary

This patch implements the caller side of placing function call arguments in stack memory. This removes the current limitation where LLVM on AIX will report fatal error when arguments can't be contained in registers.

There is a particular oddity that a float argument that passes in a register and also in stack memory requires that the caller initialize both. From what AIX "ABI" documentation I have it's not clear that this needs to be done, however, it is necessary for compatibility with the AIX XL compiler so I think it's best to implement it the same way.

Note a later patch will follow to address the callee side.

Diff Detail

Event Timeline

cebowleratibm created this revision.Jan 22 2020, 9:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 22 2020, 9:04 AM
ZarkoCA added inline comments.Jan 23 2020, 6:59 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6867–6868

mark this as const as well?

6881–6886

Can we add a test for this? Specifically checking if the full memory for the arg is initialized.

7092–7100

I would prefer to have the MemLoc code after the RegLoc. To me it's more intuitive that way since we use up registers first and then stack. It's purely a personal preference though and have no issue with keeping it as is.

7148–7152

Can we have a testcase for this?

cebowleratibm marked 7 inline comments as done.Jan 27 2020, 10:17 AM
cebowleratibm added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6881–6886

call_test_stackarg_float2 covers this scenario. There are STFS and STFD for the trailing float and double args.

7092–7100

I think this is a personal preference but I'll reorder the logic.

7148–7152

That's fair. test_stackarg_float2 covers has a double arg split across that last available GPR. I can add a case for the scenario where the double arg will fit in the last 2 available GPRs.

cebowleratibm marked 3 inline comments as done.Jan 28 2020, 7:33 AM

Added a test for a double that passes in fpr and 2 gpr.

Addressed non functional issues identified in the review.

ZarkoCA added inline comments.Jan 29 2020, 6:46 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6881–6886

Yep, I missed it the first time, thanks for pointing it out.

7068

Does the value given by CCInfo.getNextStackOffset() account for the linkage save area? I've found that it doesn't and only gives the size of the PSA when working on the callee side.

cebowleratibm marked 3 inline comments as done.Jan 29 2020, 7:24 AM
cebowleratibm added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7068

LowerCall_AIX does:

CCInfo.AllocateStack(LinkageSize, PtrByteSize);
CCInfo.AnalyzeCallOperands(Outs, CC_AIX);

It needs to be done this way so that any stack arguments will get the correct stack offset. When AnalyzeCallOperands is done, the stack offset accounts for the linkage save area.

LowerFormalArguments_AIX should also AllocateStack the linkage save area as well. I think this can be done with the next patch for the callee work.

ZarkoCA added inline comments.Jan 29 2020, 8:19 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
7068

Got it, missed the part CCInfo.AllocateStack(LinkageSize, PtrByteSize); of the code previously.

cebowleratibm marked 2 inline comments as done.Jan 29 2020, 8:38 AM

fyi@cebowleratibm To me, this patch looks good now. I'm holding on approving to give other reviewers a chance to have a look.

Code changes look good. I 'll start on reviewing the tests now.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6981

Moving the increment of the index into the loop makes this a lot more complicated even though its a small change to the code. If we have to peek the next memloc then its unavoidable, but did we consider marking the memloc we want to skip as custom and having the normal memloc handling skip it?

7147–7158

Really minor nit: there is a bunch of extra spaces on this blank line.

7151

Really minor nit: Same thing here, a bunch of spaces or a tab after the '{'.

sfertile added inline comments.Feb 4 2020, 10:50 AM
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1098

This is unused in the test.

1117

Just to make sure I understand correctly:

This store is just for moving the double value into the gprs. We don't have to initialize the register image in the parameter save area (even though the call is variadic) because it maps into the gprs available for argument passing. If the gprs were exhausted but we still have available fprs then we have to initialize both the fpr and the register image in the PSA (this case being tested by the following test).

1265

Where do we initialize $r10? I don't think we are checking for the asm that should be generated from

; 64BIT-DAG:   STFD renamable $f1, 0, %stack.0 :: (store 8 into %stack.0)
; 64BIT-DAG:   renamable $x10 = LD 0, %stack.0 :: (load 8 from %stack.0)
cebowleratibm marked 10 inline comments as done.Feb 5 2020, 7:07 AM

Addressed Sean's comments. Notable change: when a float arg passes in FPR as well as PSA memory we will mark the memLoc custom. LowerFormalArguments_AIX can ignore custom memLoc. If a float arg passes in PSA memory and there was no FPR available, then the memLoc will be normal (non-custom.) I've updated LowerFormalArguments_AIX such that it will pass over custom memloc but continue to report_fatal_error on non-custom memloc. Zarko will have a patch coming to remove the report_fatal_error and implement memLoc for LowerFormalArguments_AIX.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6981

I can use the customMem to make it cleaner to skip over these. I will use custom mem when the argument is also passed in FPR. If the argument is only passed in memory, it will remain a regular memLoc.

llvm/test/CodeGen/PowerPC/aix-cc-abi.ll
1098

carry over from call_test_stackarg_float. Good catch.

1117

Your understanding is correct. call_test_stackarg_float3, passes float args in FPR as well as PSA memory. call_test_stackarg_float2 tests that the required GPRs are initialized. The stack store is only an autolocal artifact required to initialize the GPRs. On Pwr8 up the operation can be done without going through memory but we can't test it yet because it requires vector instructions that we've blocked on AIX.

1265

I missed it. We're generating and it belongs in the test.

cebowleratibm marked 4 inline comments as done.

Used custom mem to tag memLocs for float args that also pass in register so the LowerFormalArguments_AIX can skip them easily.

Addressed minor cleanup and testcase updates from review.

sfertile accepted this revision.Feb 5 2020, 9:45 AM

LGTM.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6981

👍 Thank you. I really like this approach.

This revision is now accepted and ready to land.Feb 5 2020, 9:45 AM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/PowerPC/aix-cc-abi.ll