Page MenuHomePhabricator

[AIX] Add support for lowering int, float and double formal arguments.
ClosedPublic

Authored by ZarkoCA on Oct 29 2019, 11:47 AM.

Details

Summary

This patch adds LowerFormalArguments_AIX, support is added for lowering int, float, and double formal arguments onto general purpose and floating point registers only.

The aix calling convention testcase have been redone to test for caller and callee functionality.

Diff Detail

Event Timeline

ZarkoCA created this revision.Oct 29 2019, 11:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 29 2019, 11:47 AM
sfertile added inline comments.Oct 30 2019, 10:52 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6745

My understanding is that when targeting 32-bit codegen any i64 will be split into 2 i32s. If thats correct then we should keep the i64 case separate, with an assert along the lines of assert(IsPPC64 && "i64 should have been split for 32-bit codegen.");.

6836

I believe this is breaking some of the existing AIX lit testing.

6865

Shouldn't we be doing the same thing extendArgForPPC64 is doing, but with the extra generality that the zext/zext type should be i32 when targeting 32-bit codegen?

ZarkoCA added inline comments.Oct 31 2019, 6:14 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6745

I can change it to:

case MVT::i1:
case MVT::i32:
  (!isPPC64)
  return &PPC::GPRCRegClass;
  LLVM_FALLTHROUGH;
case MVT::i64:
  assert(IsPPC64 && "i64 should have been split for 32-bit codegen.");
  return &PPC::G8RCRegClass;

I think a defensive assert like you're suggesting is a good idea. it does complicate the switch a bit but that may be necessary?

6836

Thanks, I missed those in my testing. Adding -mattr=-altivec gets them to pass. I can add that to the testcases and ask the author to review.

6865

All other PPC targets look like they just truncate i1s. Adding a condition to check for any extension flags for all types, including i1s, led to some nodes missing. I will investigate further, however.

ZarkoCA updated this revision to Diff 227918.Nov 5 2019, 10:37 AM
ZarkoCA added a reviewer: Xiangling_L.

1 .Fixed broken testcases for AIX because of explicit error with altivec, and added reviewer for these.

  1. Added additional guard to assert if i64 are not split into i32 before register allocation.
  2. Rebased patch on master.
ZarkoCA marked 2 inline comments as done.Nov 5 2019, 10:38 AM

Patch is getting pretty close, I am just going over the tests more thoroughly now. One observation for the gpr test: We have a lot of cases passing i8, but none passing i16.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6745

Missing an 'if'.
!isPPC64 --> !IsPPC64

6865

All other PPC targets look like they just truncate i1s.

The 32-bit targets only truncate, the 64-bit targets add the Assert[SZ]Ext nodes appropriate for the ArgFlags.

Adding a condition to check for any extension flags for all types, including i1s, led to some nodes missing.

Consider the following test case for 32-bit though:

@b = common local_unnamed_addr global i8 0, align 1

define void @callee(i1 zeroext %_b) local_unnamed_addr {
entry:
  %frombool = zext i1 %_b to i8
  store i8 %frombool, i8* @b, align 1
  ret void
}

If we add the assert nodes, then we would expect a DAG that kind of looks like:

t1: i8 ZeroExt(i1 Truncate(i32 AssertZext(i32 CopyFromReg %_b)))
store<(store 1 into @b)> t1, GlobalAddress:i32<i8* @b>

Without the assert nodes:

t1: i8 ZeroExt(i1 Truncate(i32 CopyFromReg %_b))
store<(store 1 into @b)> t1, GlobalAddress:i32<i8* @b>

In the first DAG we have preserved the information from the IR: namely that the value was original created by zero extending an i1 into an i32. Becuase of the extra info we can optimize away the truncate and zero extend. In the second DAG we have thrown the extra info away, and have to insert an extra clear instruction. If the Argument doesn't have the zeroext or signext attribute, then we end up with the second DAG anyway, and we would end up with a clear instruction as expected.

6879

I think we can get the stack size from the CCState object, since you allocated the linkage area and parameter saver area already.

A couple high level comments on the tests.

  • I like how you extended the 'LowerCall' tests to test both the caller and callee side, as opposed to adding new tests just for the formal argument lowering.
  • There are a lot of tests covering quite a few cases. Its good to have thorough coverage. Having separate test cases for things like 1 float vs 3 floats, and 1 i8, vs 4 sext i8s, vs 4 zext i8s might be a little too verbose though. Having a couple calls that mix various extensions of the i8/i16 types i32 and i64 types should give adequate test coverage. Similarly for the fpr tests, a call which mixes doubles and floats should be adequate. I think the most descriptive test are those that are mixing floating and integer/pointer values. For example calling a function with a signature like: (i64 %a, double %d, i1 zeroext %b, float %f, i16 signext %c) shows how we split the i64 for 32-bit codegen, skip gprs for doubls/floats and map the smaller then register sized types into a gpr all in 1 test.
  • Most of the test on the callee side we don't really care about the instructions in the callee, the liveins is all we are really interested in. I find the allocas and dead stores a little odd. There is no passes to remove them between llc start and where we dump mir so its OK, but I would prefer just having some arbitrary arthritic with the arguments so they are all used to calculate a return value instead, and don't need any checks other then the liveins except in a couple select tests.
  • We probably want a test like the i1 signext/zext/no flags test I posted in one of the comments, and maybe a similar style test for the codegen we expect for say a type that is smaller then register size but not an i1.
  • With the reduction in test verbosity (less tests and less CHECK directives) we should be able to combine aix_gpr_param.ll and aix_fpr_param.ll into a single file. We will limit that to passing aruments in registers, and will have seperate files for when we spill arguments to memory.

I think a few more tests that need us to disable the altivec attr have landed since your last update. If you rebase you will pick them up.

ZarkoCA updated this revision to Diff 229104.Nov 13 2019, 8:07 AM
  1. Addressed second round of comments.
    • Added truncateScalarIntegerArgs function, now we add sext, zext nodes if they are needed and truncate whenever an int size is smaller than the register size.
    • use CCinfo to allocate stack.
    • consolidated calling convention testcases and removed some non-essential tests
ZarkoCA marked 3 inline comments as done.Nov 13 2019, 8:09 AM
ZarkoCA added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6865

I added a function that will use the flags passed in and work in a more general way than extendArgForPPC64, this way we do the same thing for 32 and 64Bit.

A couple high level comments on the tests.

  • I like how you extended the 'LowerCall' tests to test both the caller and callee side, as opposed to adding new tests just for the formal argument lowering.
  • There are a lot of tests covering quite a few cases. Its good to have thorough coverage. Having separate test cases for things like 1 float vs 3 floats, and 1 i8, vs 4 sext i8s, vs 4 zext i8s might be a little too verbose though. Having a couple calls that mix various extensions of the i8/i16 types i32 and i64 types should give adequate test coverage. Similarly for the fpr tests, a call which mixes doubles and floats should be adequate. I think the most descriptive test are those that are mixing floating and integer/pointer values. For example calling a function with a signature like: (i64 %a, double %d, i1 zeroext %b, float %f, i16 signext %c) shows how we split the i64 for 32-bit codegen, skip gprs for doubls/floats and map the smaller then register sized types into a gpr all in 1 test.

I removed a few of the somewhat redundant ones, added a few more.

  • Most of the test on the callee side we don't really care about the instructions in the callee, the liveins is all we are really interested in. I find the allocas and dead stores a little odd. There is no passes to remove them between llc start and where we dump mir so its OK, but I would prefer just having some arbitrary arthritic with the arguments so they are all used to calculate a return value instead, and don't need any checks other then the liveins except in a couple select tests.

I only check for the liveins now tests, and don't check the lines that follow. I've had some trouble getting tests that have some basic arithmetic to work. The .ll files are correct but using Filecheck and what it reads and doesn't read has been giving me a lot of trouble. I can fix them later since I plan to do more work on the callee side but this patch is holding up some other calling convention work and I'd like to get it in before spending more time figuring Filecheck out.

  • We probably want a test like the i1 signext/zext/no flags test I posted in one of the comments, and maybe a similar style test for the codegen we expect for say a type that is smaller then register size but not an i1.
  • With the reduction in test verbosity (less tests and less CHECK directives) we should be able to combine aix_gpr_param.ll and aix_fpr_param.ll into a single file. We will limit that to passing aruments in registers, and will have seperate files for when we spill arguments to memory.

    I think a few more tests that need us to disable the altivec attr have landed since your last update. If you rebase you will pick them up.
ZarkoCA updated this revision to Diff 229162.Nov 13 2019, 12:48 PM

Updates testcases to use all of the parameters in the function definition.

sfertile added inline comments.Nov 14 2019, 11:34 AM
llvm/include/llvm/ADT/DirectedGraph.h
51 ↗(On Diff #229162)

Unrelated change.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6832

Does it need to be a member of PPCTargetLowering? It looks like we are passing in everything we use. If we can reduce its scope to a static in PPCISelLowering.cpp I would prefer that.

6874

Reporting an error for an unsupported construct is fine, but this ends up being very intrusive since we infer altivec support for pretty much any cpu we would target with AIX. We now have to add -mattr=-altivec on every llc invocation targteing AIX which is not very convenient. If we compare it to the other fatal erros we have here:
QPX, softFloat --> Will never be supported for AIX and we should error if those options are enabled.
isVarArg --> will trigger only if lowering a variadic function.
hasAltivec --> is inferred by the backend for any cpu we target, will be true even if we don't have any vector constructs.

The 'hasAltivec' restriction is simply too restrictive. Instead we should only issue an error there is a use of a vector as an argument. CC_AIX does that already I belive so we can omit any further error handling here.

6906

Why the extra CopyFromReg?

ZarkoCA updated this revision to Diff 229379.Nov 14 2019, 12:27 PM

Made truncateScalarIntegers not a member of PPCISelLowering class.
Removed fatal error for altivec, CC_AIX already errors with vector args.
Removed -mattr=-maltivec on tests where I added it, as we no longer error on those.

ZarkoCA marked 4 inline comments as done.Nov 14 2019, 12:32 PM
ZarkoCA added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6832

Agreed, I was only following the convention of extendArgforPPC64 but there is no reason for it to be a member.

6835

I removed the fatal error and changed back the tests that were failing.

6906

My mistake, removed now.

jasonliu added inline comments.Nov 15 2019, 7:07 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6914

nit: Extra space before "Tail".

6917

nit: Since we have "Set the size that is at least reserved in caller of this function" in the beginning of this paragraph, "reserved for caller" at the end could be removed?

cebowleratibm added inline comments.Nov 18 2019, 7:52 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6814

I'd rather see this switch written:

assert((IsPPC64 || SVT != MVT::i64) && "i64 should have been split for 32-bit codegen.");

switch (SVT) {
...
case MVT::i1:
case MVT::i32:
case MVT::i64:    
  return IsPPC64 ? &PPC::G8RCRegClass : &PPC::GPRCRegClass;
...  
}

It's easier for me to follow that all ints in PPC64 are G8RC and all ints in PPC32 are GPR and the complexity of the i64 in PPC32 assertion stays out of the switch.

sfertile added inline comments.Nov 18 2019, 12:37 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6911

Since you have allocated the linkage area and the parameter save area CCInfo.getNextStackOffset() will be at least LinkageSize + MinParameterSaveArea, so you can omit taking the max.

ZarkoCA marked 9 inline comments as done.Nov 19 2019, 5:31 AM
ZarkoCA added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6814

Thanks, agreed this makes it much simpler and also moves the assert to the earliest possible positin.

6911

Thanks, fixed.

6914

Removed, thanks.

6917

Thanks for catching that.

ZarkoCA updated this revision to Diff 230049.Nov 19 2019, 5:34 AM
ZarkoCA marked 4 inline comments as done.

Moved assert to earliest position in truncateScalarIntger().
Fixed typos.
Removed std::max when calculating MinReservedArea.
Used opt -mem2reg to remove allocas resulting in body of callee functions looking cleaner with fewer unnecessary lines.

Patch is very close. I've added a few inline comments on the test.

llvm/test/CodeGen/PowerPC/aix_cc_abi.ll
48

very minor nit: indentation should match the following lines.

123

Minor nit: This test and @test_ints_64bit overlap enough we just need 1. I would suggest getting rid of this one, and renaming '@test_ints_64bit` to just @test_ints.

261

Move this test to beside the zero extended i1 case, and add checks for the body that show we end up with a clear instruction before the strore.

Minor nit: dead store. We can store to a global instead.

@global_i1 = global i8 0, align 1

define  void @test_i1(i1 %b)  {
entry:
  %frombool = zext i1 %b to i8
  store i8 %frombool, i8* @global_i1, align 1, !tbaa !2
  ret void
}
560

r3 and r4 used in the 'LWZtoc` instructions should be variables. They are renamable scratch registers in these uses. The same variables should be used in place of hardcoded gprs in the LFS and LFD instructions.

The LI is right in useing the hardcoded r4 and r7 registers since we are setting up ABI specified registers for the call with those instructions.

And the same holds for the equivalent 64-bit test.

580

This is a very good test: Documents how a float argument only shadows 1 gpr for 32-bit codegen despite being passed as a double in fpr1.

ZarkoCA updated this revision to Diff 230136.Nov 19 2019, 1:14 PM
ZarkoCA marked 5 inline comments as done.

Fixed testcases.

ZarkoCA marked 3 inline comments as done.Nov 19 2019, 1:17 PM
ZarkoCA added inline comments.
llvm/test/CodeGen/PowerPC/aix_cc_abi.ll
123

Thanks, removed and renamed.

261

Moved and updated the function to store a global.

560

Fixed, that's a good point, I'm using variables for these now.

sfertile accepted this revision.Nov 26 2019, 7:36 AM

Only a couple minor comments. Otherwise LGTM.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6853

Minor nit: This should be defined just before it use (FuncInfo->setMinReservedArea(MinReservedArea);)

llvm/test/CodeGen/PowerPC/aix_cc_abi.ll
131

Looks good. We should also have the same function but with a zeroext arg attribute and show that we don't end up with a clear instruction in this case. The new function doesn't need to be called.

This revision is now accepted and ready to land.Nov 26 2019, 7:36 AM
ZarkoCA updated this revision to Diff 231095.Nov 26 2019, 10:07 AM

Moved variable declarations and definition to just before they are first used.
Added zext i1 testcase and check for absence of clear instruction.

ZarkoCA marked 2 inline comments as done.Nov 26 2019, 10:11 AM

Only a couple minor comments. Otherwise LGTM.

Thank you!

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
6853

Moved this and the MF definition to just before they are used. MF is moved to line 6850:

MachineFunction &MF = DAG.getMachineFunction();
 CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());
llvm/test/CodeGen/PowerPC/aix_cc_abi.ll
131

Thanks added it below as @test_i1zext()

ZarkoCA updated this revision to Diff 231127.Nov 26 2019, 12:58 PM

Rebase of latest masters (Nov 26, 2019) and fix testcase register typo.

ZarkoCA updated this revision to Diff 231130.Nov 26 2019, 1:23 PM

Fixed testcase typo.

This revision was automatically updated to reflect the committed changes.