Page MenuHomePhabricator

[PowerPC][AIX] Spill and restore the non-volatile condition register bits.
ClosedPublic

Authored by sfertile on Feb 10 2020, 11:10 AM.

Details

Summary

Extends the existing support for spilling and restoring the condition register to the linkage area for 32-bit targets, and enable for AIX.

Diff Detail

Event Timeline

sfertile created this revision.Feb 10 2020, 11:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2020, 11:10 AM
sfertile updated this revision to Diff 243886.Feb 11 2020, 8:26 AM
  • Add missed 32-bit AIX mir test.
  • Update comment on creating fixed stack object to linkage area to include AIX.
  • Fix offset fixed stack object for crsaves refrences at for aix32.
sfertile marked an inline comment as done.Feb 11 2020, 8:28 AM
sfertile added inline comments.
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
2339

I didn't realize while working on this patch but I think I can separate this part of the change into a preceding NFC patch. Going to try that out now and commit it if it can be done NFC.

I would like to see some test coverage for the AIX CRSave assembly, whether we adapt the tests you've already updated or add a separate test. It seems 64-bit PPC Linux has a better way of handling a single CR save. If we choose not to use the same approach on AIX I'd like to understand why.

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
744

const

926–927

Suggest "assert(IsPPC64)" here.

928

Why do we use MFOCRF8 for 64-bit Linux with 1 CR save but not on AIX?

952

I can't help but notice this block of logic is redundant to the block at 946. Can this be commoned under an "if (MustSaveCR)" block?

llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
932

"thhe"

llvm/test/CodeGen/PowerPC/aix32-crsave.mir
2

I'd rather see an aix-crsave that handles both 32/64 or one crsave test that handles all subtargets.

22

Can you use CHECK-LABEL to ensure the expected block binds to the intended section?

llvm/test/CodeGen/PowerPC/ppc64-alloca-crspill.ll
2

I think it makes sense to change this to "alloca-crspill.ll" and add the 32-bit expected output for AIX.

sfertile updated this revision to Diff 244216.Feb 12 2020, 10:18 AM
sfertile marked 6 inline comments as done.
  • Add const on local
  • add an assert to express the codegen inside an if is meant for 64-bit only.
  • add an aix-crspill test which checks the assembly output for both 64-bit and 32-bit AIX.
  • Use CHECK-LABEL in aix32-crspill.mir test to restict the scope of other CHECK directives.
  • Add checking of the fixed stack object in ppc64-crsave.mir test to match the aix32 test.

Forgot to exetend the alloca test to include 32-bit AIX.

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
928

Because I'm not sure if its safe to do so on AIX. The V2 ELF ABI specifically allows saving only the clobbered crfields, the V1 ELF ABI does not. I can't find any documentation for AIX that specifies whether its safe to save only the clobbered fields, or if CR2,CR3 and CR4 must all be saved. If we are allowed to save a single field but save the entire register anyway we are pessimistically correct, if we have to save all 3 non-volatile cr fields and we only save the clobbered ones then our codegen is incorrect, so unless we can verify using 'mfocrf' is allowed on AIX we have to match the ELF V1 implementation.

FWIW I tested out clobbering a CR field with both XL and gcc on AIX and both use 'mfcr`, so I strongly doubt we can save only the clobbered non-volatile fields.

952

I don't think so because of the ordering with respect to the 'mflr' instruction inserted at line 961 above. I could pull the code out into a static helper or lambda if you like though.

llvm/test/CodeGen/PowerPC/aix32-crsave.mir
2

I added this test to mirror the ppc64 test. I can't handle both 32-bit and 64-bit targets with the same input mir. I think its important to have a test with mir as input which runs only the prologepilog pass to highlight exactly what is expected from the pass. I've added a separate IR to assembly test that exercises both 32-bit and 64-bit codegen on AIX.

22

Good suggestion. updated.

sfertile updated this revision to Diff 244440.Feb 13 2020, 8:21 AM
  • Rebased to pick up NFC commit for the ELF 32 cr save changes.
  • Extended the alloca + cr save test to include 32-bit AIX.
Xiangling_L added inline comments.Feb 13 2020, 8:53 AM
llvm/test/CodeGen/PowerPC/ppc64-crsave.mir
10

Usually, the default cpu level we set in the testcase is pwr4, can I ask why we want to test pwr7 here instead?

18

In case we might forget to update the testcase when we support AIX, can we leave an assertion in the code instead?

sfertile added inline comments.Feb 13 2020, 9:23 AM
llvm/test/CodeGen/PowerPC/ppc64-crsave.mir
10

Its just to match the ELF V1 runstep. There should be no particular difference between pwr4 and pwr7 for this test.

18

In what code?

Xiangling_L added inline comments.Feb 13 2020, 1:34 PM
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
1573–1574

If we don't expect this block is PPC64 bit only, should we remove the comment as well?

1631

same comment as above

llvm/test/CodeGen/PowerPC/ppc64-crsave.mir
10

I see. I slightly prefer checking pwr4 here though. Because if we are not consistent with our default cpu level, people review this in the future may have same doubt as me Is pwr7 triggering anything special in this testcase?

18

Sorry, I thought there was some boilerplate of save and restores of the callee saved gpr for AIX based on properly implement, bt it seems I am wrong.

34

Does this id difference matter? Can we use regex to replace it and combine these two lines?

70

Same comment as above

sfertile updated this revision to Diff 244556.Feb 13 2020, 6:04 PM
sfertile marked 6 inline comments as done.
  • Removed stale comments.
  • Renamed the check prefixes in the 64-bit crsave test related to mfocr and` mfcr` difference becuase it erroneously made it seem the difference was due to power8 vs power7 ISA.
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
1573–1574

Good catch.

llvm/test/CodeGen/PowerPC/ppc64-crsave.mir
10

Good point. I picked poor names for the related checks too becuase PWR7/PWR8 makes it seem like only Power8 has the mfocrf instruction which isn't the case. Its just that the V2 abi allows saving a single field while the other ABIS do not (AFAIK). I've change the aix test to run for power4, updated the check prefixes and added a comment to help clear it up.

18

I see. So the problem is we don't flag the registers as having a fixed spill offset, which causes us to fall into the target independent spill/restore code which does something completely sensible but not ABI compliant. I don't want to disable spilling csrs completely because it will block a lot of other meaningful work: for example nothing in spec or lnt would compile anymore. I already have a follow on patch that fixes them that I will post once I commit this patch, so it won't be forgotten :)

34

Its not important now, but when I fix the callee saved gprs/fprs for AIX I expect them to end up with the same id: We are creating a stack object for the gpr csr when we should be creating a fixed stack object for it. I intend to update the test to consolidate all the ELF/AIX checks into normal CHECKS as part of that work.

I believe the current patch is sound and correct. I had one test suggestion and one suggestion to use a lambda to reuse some of the cr spill logic.

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
952

I think the lambda is a good idea.

1770–1771

With removal of isSVR4ABI() on this code path, presumably this code behaves differently for 32-bit SVR4. Does this comment need to be updated or is this a bug?

llvm/test/CodeGen/PowerPC/aix-crspill.ll
48

I like this test because it validates we apply the correct mask on the restore, though I think a test that kills all 3 cr bits is also meaningful for assembly..

sfertile updated this revision to Diff 245401.Feb 19 2020, 7:58 AM
sfertile marked 2 inline comments as done.
  • Added a lambda for building the instruction which moves the condition register into a scratch register to avoid dupication.
  • Added an IR --> assembly test that verifies we restore all 3 non-volatile cr-fields.
  • Removed a couple stale comments, and reworded another comment.
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
1770–1771

Good catch, updated the comment.

ZarkoCA added inline comments.Feb 19 2020, 10:17 AM
llvm/test/CodeGen/PowerPC/aix32-crsave.mir
66

formatting

llvm/test/CodeGen/PowerPC/ppc64-crsave.mir
88

formatting

sfertile updated this revision to Diff 245685.Feb 20 2020, 10:12 AM

Fix formatting of CHECK lines in several of the lit tests.

sfertile marked 2 inline comments as done.Feb 20 2020, 10:13 AM
Xiangling_L accepted this revision.Feb 21 2020, 8:26 AM

LGTM with a minor comment

llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
929

minor issue: inconsistency between how to get instruction TII.get(PPC::MFOCRF8) and MoveFromCondRegInst?

This revision is now accepted and ready to land.Feb 21 2020, 8:26 AM
cebowleratibm added inline comments.Feb 24 2020, 5:00 AM
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
929

I think the current code is the preferred form. I had asked for the IsPPC64 assertion above to clarify to the reader that it's ok to hard code MFOCRF8.

sfertile marked an inline comment as done.Feb 24 2020, 7:29 AM
sfertile added inline comments.
llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
929

Like Chris says: MoveFromCondRegInst --> opcode depends on if we are targeting 32-bit or 64-bit codegen, while the hard-coded opcode is to call out explicitly that this code is 64-bit only (in conjunction with the assert).

This revision was automatically updated to reflect the committed changes.