Page MenuHomePhabricator

[PowerPC] Fold redundant load immediates of zero and delete if possible

Authored by kamaub on Oct 18 2019, 7:29 AM.



This patch folds redundant load immediates into a zero for instructions
which recognise this as the value zero and not the register. If the load
immediate is no longer in use it is then deleted.

This is already done in earlier passes but the ppc-mi-peephole allows for
a more general implementation.

Event Timeline

kamaub created this revision.Oct 18 2019, 7:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 18 2019, 7:29 AM
NeHuang added inline comments.Oct 18 2019, 10:50 AM

nit: typo - "delete the caller"


Is this function defined for load immediate 0 case only? Better to add it into function comments.


Any reason not adding a big endian test here? : )

amyk added inline comments.Oct 19 2019, 10:16 PM

End with period.


Is MRI needed in this function/is it being used?


Can we add a comment here describing the following?


nit: Add a space here maybe for better readability.


if it is not needed


Correct me if I am wrong, but isn't this being checked inside FoldImmediateWithoutDelete?

if (!DefMI.getOperand(1).isImm())
   return false;
 if (DefMI.getOperand(1).getImm() != 0)
   return false;

Do we need to check this in both places?

nemanjai requested changes to this revision.Oct 21 2019, 7:14 PM

Please refactor the common logic out and address the comments.


We shouldn't do this. We now have two nearly identical implementations of this function. One deletes the def the other one does not. That is the only difference as far as I can tell.

Please refactor the common logic into a single function to form something like this:

bool PPCInstrInfo::onlyFoldImm(...) { 
  // All the folding logic.
bool PPCInstrInfo::FoldImmediate(...) {
  bool Changed = onlyFoldImm(...);
  if (Changed && MRI->use_nodbg_empty(Reg))
    // Delete the def
  return Changed;

This run-on sentence is hard to follow. Break it up a bit please. Perhaps:

// If we are materializing a zero, look for any use operands for which zero
// means immediate zero. All such operands can be replaced with PPC::ZERO.

We do want to keep it here as well because there may be many users. It is pointless to go through all of them if the materialized value is non-zero. But please do combine these into a single condition:

if (!MI.getOperand(1).isImm() || MI.getOperand(1).getImm() != 0)

Let's not check this at each iteration. We can just check for no remaining uses after the loop.
Then the body of the loop becomes a single line:

Simplified |=
  TII->FoldImmediateWithoutDelete(UseMI, MI, MIDestReg, MRI);

Here, we can check for no remaining uses of MIDestReg and mark MI for deletion if so.

This revision now requires changes to proceed.Oct 21 2019, 7:14 PM
lkail added a reviewer: Restricted Project.Oct 21 2019, 8:20 PM
kamaub updated this revision to Diff 226899.Oct 29 2019, 7:37 AM
  • Refractoring this patch to removed duplicated function

Mostly minor comments and a couple of questions.


I know that this is how it was done in the past but how about just checking:


Or maybe there was a good reason to have done it this way in the past and I can't see it. :)


Why do you need to check && Folded here? The output register of this LI instruction has no non-debug uses so you can probably just schedule it to be deleted anyway. If you do that you also don't need to keep Folded around and just do what Nemanja mentioned:

Simplified |=
  TII->FoldImmediateWithoutDelete(UseMI, MI, MIDestReg, MRI);

Add -mcpu to these tests. If you don't I think that the default for LE is pwr8 and the default for BE is pwr4 (?). Anyway, I think you will get a more reliable test if you specify the cpu.


Why do we lose a store here? Is it because we remove an LI instruction and then that reduces register pressure?

nemanjai requested changes to this revision.Nov 6 2019, 7:50 AM
nemanjai added inline comments.

In fact, as implemented, Folded will be set to whatever TII->onlyFoldImmediate() returns for the last use which may be false even if you have folded it in some previous uses. This needs to be fixed.

This revision now requires changes to proceed.Nov 6 2019, 7:50 AM
kamaub marked 14 inline comments as done.Nov 11 2019, 11:19 AM

Thank you, comments were addressing in most recent commit.

kamaub updated this revision to Diff 229149.Nov 13 2019, 11:49 AM
kamaub marked 7 inline comments as done.
  • Changing some condition and fixing test cases.

This change edits some conditions to be more robust in their logic.
Of note is the change to clean up load immediates even if they weren't
folded, a side-effect of this is that one test case needs to fixed up.

All comments have been address, thank you.


That right, I shouldn't used Folded |= instead, I've gone with your (earlier suggestion however, thank you.


Yes, an Li is removed which does reduce register pressure but that isn't why this test cases changes. r28 is spilled because zero is loaded into it after the prologue but, with this patch, it is folded into and addi and removed so the stack is reduced from 28 -> 24

stefanp accepted this revision as: stefanp.Nov 15 2019, 1:32 PM


nemanjai accepted this revision.Nov 19 2019, 4:40 AM

LGTM except for formatting nits. Thanks for addressing the comments.


Similar issue with alignment.


Minor nit: alignment of the last parameter is off, please fix on the commit.

This revision is now accepted and ready to land.Nov 19 2019, 4:40 AM
amyk added inline comments.Nov 23 2019, 3:21 PM

Minor nit: indentation here and on line 6.

This revision was automatically updated to reflect the committed changes.
kamaub marked 3 inline comments as done.May 12 2020, 11:18 AM