Page MenuHomePhabricator

[PowerPC] Extend folding RLWINM + RLWINM to post-RA.
ClosedPublic

Authored by Esme on Oct 21 2020, 12:51 AM.

Details

Summary

This patch depends on D89846.
We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINM will be generated after RA, for example rGc4690b007743. If the RLWINM generated after RA followed by another RLWINM, we expect to perform the optimization after RA, too.

Diff Detail

Event Timeline

Esme created this revision.Oct 21 2020, 12:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 21 2020, 12:51 AM
Esme requested review of this revision.Oct 21 2020, 12:51 AM

It would be great if we could split this patch into two parts:
1: extend rlwinm + rlwinm folding to post-ra and add some post-ra testcases, MIR cases are better.
2: add new simplication for rlwinm + andi_rec and add some post-ra and pre-ra testcases.

Esme updated this revision to Diff 299857.Oct 21 2020, 9:42 PM
Esme retitled this revision from [PowerPC] Combine RLWINM and RLWINM/ANDI_rec before or after RA. to [PowerPC] Extend folding RLWINM + RLWINM to post-RA..
Esme edited the summary of this revision. (Show Details)
Esme updated this revision to Diff 299864.Oct 21 2020, 10:27 PM

Some nits.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3234–3250

Comments need to be updated.

3238

How about use the MachineInstr *&ToErase to avoid the null check and default arguments ?

3253

It is preferred to initialize the stack variable.

3254

Remove such kind of comments as the code is self explained.

3259

Common line 3209 and 3217 then, add a check in 3220

3269

A better code style to avoid the duplicate check of the SrcOpCode.

switch (SrcOpCode) {
case RLWINM8
case RLWINM8_rec:
    Is64Bit = true;
    break;
case ...
}
3412

null check?

llvm/lib/Target/PowerPC/PPCPreEmitPeephole.cpp
40

Change the name as it is extended to all rotate combining.

481

No {} if there is only one statement.

Esme updated this revision to Diff 299884.Oct 22 2020, 12:15 AM

Updated as Steven's comments.

Esme marked 8 inline comments as done.Oct 22 2020, 12:25 AM
Esme updated this revision to Diff 300181.Oct 23 2020, 12:30 AM
shchenz added inline comments.Oct 23 2020, 7:49 AM
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3234–3250

The strange character before "We"

3239

Can we initial Is64Bit to true or false and then we only need to handle 32-bit or 64-bit opcodes in following switch?

3263

Any special reason that we don't do the folding after RA when there are multiple uses for SrcMI? Before RA, we do such kind of transformation as it is still benefit. We eliminate the dependancy between SrcMI and MI.

3268

What case will have rlwinm input is rlwinm8 or rlwinm8 input is rlwinm? Just curious.

3411

Same as above, after RA, even SrcMI has multiple uses, we still should do the transform?

Esme updated this revision to Diff 300566.Oct 25 2020, 6:48 PM
Esme added inline comments.Oct 25 2020, 7:04 PM
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3268

There is a case in llvm/test/CodeGen/PowerPC/popcnt-zext.ll

declare i16 @llvm.ctpop.i16(i16) nounwind readnone
define i64 @popa_i16_i64(i16 %x) {
  %pop = call i16 @llvm.ctpop.i16(i16 %x)
  %z = zext i16 %pop to i64 ; SimplifyDemandedBits may turn zext (or sext) into aext
  %a = and i64 %z, 16
  ret i64 %a
}

llc -verify-machineinstrs -mtriple=powerpc64-- -mattr=+slow-popcntd < cnt16.ll -debug
We will see such transformation:
From

368B	  %22:gprc = RLWINM %21:gprc, 8, 24, 31
384B	  undef %23.sub_32:g8rc = COPY %22:gprc
400B	  %25:g8rc = RLWINM8 %23:g8rc, 0, 27, 27
416B	  $x3 = COPY %25:g8rc

To

undef %23.sub_32:g8rc = RLWINM %21:gprc, 8, 24, 31
%25:g8rc = RLWINM8 %23:g8rc, 0, 27, 27
$x3 = COPY %25:g8rc

After RA

renamable $r3 = RLWINM killed renamable $r3, 8, 24, 31, implicit-def $x3
renamable $x3 = RLWINM8 killed renamable $x3, 0, 27, 27

We will treat them as DefMI and MI after RA, but they shouldn't be folded. Please correct me if I'm wrong.

shchenz added inline comments.Oct 25 2020, 8:13 PM
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3260

This is redundant or can be changed to an assert?

3268

Thanks, I see, this is just a post-ra issue. Before RA, there are always register copy instructions between RLWINM and RLWINM8.

After Post-RA,
1: for RLWINM8(RLWINM), it should be ok to fold it to single RLWINM8,
2: for RLWINM(RLWINM8), it should be ok to fold it to a single RLWINM.

Esme added inline comments.Oct 25 2020, 8:45 PM
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3268

Thanks! I will mark this as a follow-up work after this patch submitted.

Esme updated this revision to Diff 300573.Oct 25 2020, 9:37 PM
steven.zhang accepted this revision.Oct 29 2020, 5:51 PM

LGTM overall with one minor comment.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3410

Can we set the NoUse for pre-ra also,so that just check the NoUse here?

This revision is now accepted and ready to land.Oct 29 2020, 5:51 PM
shchenz accepted this revision.Nov 1 2020, 2:48 AM

LGTM too. Please remember to rebase this patch after you commit NFC patch D89846

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3268

sounds good to me.

Esme added a comment.Nov 2 2020, 11:25 PM

Thanks for your reviews, and I apologize for not responding in time. I will commit this patch after all tests done.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3410

For the scenario of pre-ra, we have to check MRI->use_nodbg_empty(FoldingReg) here instead of setting NoUse before folding as post-ra did, since MRI->use_nodbg_empty(FoldingReg) is false before folding MI and SrcMI.

This revision was landed with ongoing or failed builds.Nov 2 2020, 11:44 PM
This revision was automatically updated to reflect the committed changes.
Esme updated this revision to Diff 304410.Tue, Nov 10, 11:39 PM

In post-RA, if SrcMI also defines the register to be forwarded, we can only do the folding if SrcMI is going to be erased.
For example, SrcMI and MI can't not be folded in the scenario:

$r3 = RLWINM_rec $r3, 27, 5, 10, implicit-def $cr0
dead renamable $r3 = RLWINM_rec killed renamable $r3, 8, 5, 10, implicit-def $cr0
BLR8 implicit $lr8, implicit $rm, implicit killed $cr0

I've run the bootstrap-test as well as SEPC.

Esme reopened this revision.Tue, Nov 10, 11:40 PM
This revision is now accepted and ready to land.Tue, Nov 10, 11:40 PM
Esme requested review of this revision.Thu, Nov 12, 9:41 PM

I saw some testcases are removed compared with your committed version. Is there any reason?

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3310

Can these lines be moved after line 3221? So we will bail out early.

3411

Can we move CanErase assignment for SSA to the place where we assign it for post SSA?

Esme marked an inline comment as not done.Tue, Nov 17, 1:44 AM

I saw some testcases are removed compared with your committed version. Is there any reason?

Every specific folding patterns with different masks have been tested in llvm/test/CodeGen/PowerPC/fold-rlwinm.mir
This patch didn't change the patterns, so I supposed it would be clearer to remove some redundant test cases.
Do you think they should be added back?

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3310

It's safe to call SrcMI->getOperand(1).getReg() after filtering the opcode of SrcMI.

3411

The condition MRI->use_nodbg_empty(FoldingReg) has to be checked after the folding of MI and SrcMI.

I saw some testcases are removed compared with your committed version. Is there any reason?

Every specific folding patterns with different masks have been tested in llvm/test/CodeGen/PowerPC/fold-rlwinm.mir
This patch didn't change the patterns, so I supposed it would be clearer to remove some redundant test cases.
Do you think they should be added back?

Understanding your concerns here. I prefer to keep that testcases as well. If new added post ra peephole in pre-emit-peephole pass changes the case, the pre-ra cases will not be aware of? This is up to you.^^

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3310

Is it safe if we add SrcMI->getOperand(1).isReg()?. We can bail out early if the operand 1 of SrcMI is not a register.

3411

Is it ok if we change use_nodbg_empty to hasOneNonDBGUse()?

Esme updated this revision to Diff 306584.Thu, Nov 19, 7:31 PM

Updated as Zheng's suggestions.

shchenz accepted this revision.Thu, Nov 19, 11:12 PM

LGTM. Thanks for the new fix.

This revision is now accepted and ready to land.Thu, Nov 19, 11:12 PM
This revision was landed with ongoing or failed builds.Sat, Nov 21, 11:38 PM
This revision was automatically updated to reflect the committed changes.