Page MenuHomePhabricator

[PowerPC] Add a peephole post RA to transform the inst that fed by add
ClosedPublic

Authored by steven.zhang on Jul 6 2018, 1:11 AM.

Details

Summary

If the arch is P8, we will select XFLOAD to load the floating point, and then, expand it to vsx and non-vsx X-form instruction post RA. This patch is trying to convert the X-form to D-form if it meets the requirement that one operand of the x-form inst is the special Zero register, and another operand fed by add inst. i.e.
y = add imm, reg
LFDX. 0, y
-->
LFD imm(reg)

Diff Detail

Event Timeline

steven.zhang created this revision.Jul 6 2018, 1:11 AM
nemanjai requested changes to this revision.Jul 11 2018, 11:54 AM

This needs to be beefed up to account for the two unsafe situations I outlined in the comment.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
2406

I just realized that this is not safe. There needs to be a check that the register being forwarded to the user isn't clobbered between DefMI and MI.

Example:

addis r5, r2, .LCPI0_0@toc@ha
addi r4, r5, .LCPI0_0@toc@l
lfdx f3, 0, r4

is safe to transform to

addis r5, r2, .LCPI0_0@toc@ha
lfd f3, .LCPI0_0@toc@l(r5)

However,

addis r6, r2, .LCPI0_1@toc@ha
addis r5, r2, .LCPI0_0@toc@ha
addi r4, r5, .LCPI0_0@toc@l
addi r5, r6, .LCPI0_1@toc@l
lfdx f3, 0, r4

is not safe to transform to

addis r6, r2, .LCPI0_1@toc@ha
addis r5, r2, .LCPI0_0@toc@ha
addi r5, r6, .LCPI0_1@toc@l
lfd f3, .LCPI0_0@toc@l(r5)

because r5 that's being forwarded is already clobbered.
Furthermore, we can't even transform

addis r4, r2, .LCPI0_0@toc@ha
addi r4, r4, .LCPI0_0@toc@l
lfdx f3, 0, r4

into

addis r4, r2, .LCPI0_0@toc@ha
addi r4, r4, .LCPI0_0@toc@l
lfd f3, .LCPI0_0@toc@l(r4)

(i.e. we can only transform this case if we're sure that addi r4, r4, .LCPI0_0@toc@l will be deleted).

This revision now requires changes to proceed.Jul 11 2018, 11:54 AM
steven.zhang added a comment.EditedJul 16 2018, 12:27 AM

Yes, you are right.
y = ADDI x, imm.
w = LFDX 0, y

We have the code to do the check for "y", but miss to do the check for "x". And this also caused some failure inside llvm test suite. (ie. Benchmarks/Stanford/FloatMM)
I will try to clean up all the failure first and then, continue the code review. Sorry for this, I should have addressed this issue early before.

Clean up all the failures from llvm test-suite. We miss to check the RegMO operand for the ADDI.

I have updated the patch and we can continue the review now.

I think that this is semantically correct now. There are just a few other stylistic comments.
Also, since this is a rather high-risk patch, can you describe on the patch the functional testing you have done with the patch (list of applications, bootstrap, etc.)?

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3014

s/describe/described

3096

s/reversely/in reverse

3103

I think the reader has to do a fair bit of thinking to convince themselves that this loop does the right thing. I certainly had to.
Would it not be easier to just:

  • Exit immediately if a def is found that is not the DefMI
  • Break out of the loop if we get to DefMI since we don't care what happens prior to it

The idea would be that we simply cannot transform if we found a register def prior to DefMI in the reverse walk. Also, I don't really understand why we care about any other uses of the register being forwarded? If we have shown that the only visible def of the register is DefMI, we don't care about any uses - we just need to make sure that if DefMI->readsRegister(Reg,...), we can only transform if KilledDefMI.
So the simplified sequence would be:

for (; It != E; ++It) {
  if (It->modifiesRegister(...) && (&*It) != DefMI)
    return false;
  // Made it to DefMI without encountering a clobber.
  if ((&*It) == DefMI)
    break;
}

// If DefMI also uses the register to be forwarded, we can only forward it if
// DefMI is being erased.
if (DefMI->readsRegister(Reg, ...))
  return KillDefMI;
return true;

Or did I miss something and that code doesn't sufficiently specify the constraints?

steven.zhang marked 4 inline comments as done.Aug 17 2018, 2:04 AM
steven.zhang added inline comments.
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3103

You are right. If the DefMI could be erased, there won't any use between the DefMI and MI. We don't need to check the use here.

steven.zhang marked an inline comment as done.Aug 17 2018, 2:07 AM

I have done the following tests with this patch:

  1. LLVM check-all uniitests
  2. LLVM test-suites
  3. bootstrap build the clang with the clang including my fix, and then, run the llvm check-all and test-suites, and it passed.

The patch passed all these tests.

nemanjai accepted this revision.Aug 17 2018, 5:46 AM

I have done the following tests with this patch:

  1. LLVM check-all uniitests
  2. LLVM test-suites
  3. bootstrap build the clang with the clang including my fix, and then, run the llvm check-all and test-suites, and it passed.

    The patch passed all these tests.

Thank you for the update to the patch as well as the update regarding testing that was done.
LGTM.

This revision is now accepted and ready to land.Aug 17 2018, 5:46 AM
This revision was automatically updated to reflect the committed changes.