Page MenuHomePhabricator

[PowerPC] Add folding patterns for rlwinm + andi_rec.
Needs ReviewPublic

Authored by Esme on Oct 25 2020, 9:48 PM.

Details

Reviewers
steven.zhang
shchenz
jsji
nemanjai
qiucf
lkail
Group Reviewers
Restricted Project
Summary

This patch depends on D89855. We have patterns to fold RLWINM + RLWINM.
Pairs of RLWINM and ANDI_rec can also be folded in some cases.
Following is a scenario in C code:

int tmp = vec_test_swdiv(x,y);
if (((__builtin_rotateleft32(tmp, 62)) & (1)) != 0){
  ...
}

clang -c t.c -O3 generates the sequence:

xvtdivdp cr0,vs34,vs35
mfocrf  r3,128
rlwinm  r3,r3,4,28,31 --->generated in POST-RA
andi.   r3,r3,4

This patch will fold it to:

xvtdivdp cr0,vs34,vs35
mfocrf   r3,128
rlwinm.  r3,r3,4,29,29

Diff Detail

Event Timeline

Esme created this revision.Oct 25 2020, 9:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 25 2020, 9:48 PM
Esme requested review of this revision.Oct 25 2020, 9:48 PM
Esme updated this revision to Diff 307494.Nov 24 2020, 6:46 PM
Esme edited the summary of this revision. (Show Details)
Esme updated this revision to Diff 307952.Nov 26 2020, 7:55 PM

Added the situation of Mask == 0.

Esme planned changes to this revision.Nov 26 2020, 10:23 PM
Esme updated this revision to Diff 308378.Nov 30 2020, 8:22 AM

FIXME: A deg was found in llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll

Esme planned changes to this revision.Nov 30 2020, 8:23 AM
Esme updated this revision to Diff 308533.Nov 30 2020, 9:31 PM

This is more clear than the previous version. Some implementation comments.

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3426

Nice catch! Then should we reuse legacy logic? If ANDI_rec mask is not runofones, we can not fold RLWINM + ANDI_rec to a single RLWINM? If ANDI_rec mask is runofones, can we treat it as special rlwinm_rec with sh = 0, and MB, ME can be got from runofones check?

Esme updated this revision to Diff 308919.Dec 2 2020, 2:22 AM

Reused legacy logic.

nemanjai added inline comments.Dec 2 2020, 3:39 AM
llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
78 ↗(On Diff #308533)

Well this is less than ideal. Perhaps we should only do this if the GPR that the andi. defines is dead?

nemanjai added inline comments.Dec 2 2020, 3:43 AM
llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
78 ↗(On Diff #308533)

Sorry, this was meant to say that we should check that the GPR that andi. uses is killed.

Esme updated this revision to Diff 309169.Dec 3 2020, 12:00 AM

Exclude the cases if SrcMI can't be erased and the Reg used by MI is not kill.

Esme added inline comments.Dec 3 2020, 11:47 PM
llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
78 ↗(On Diff #308533)

Sorry, this was meant to say that we should check that the GPR that andi. uses is killed.

@nemanjai Sorry, although I have updated the revision, I am actually a bit confused about this fix. As the comment added in last version (Diff 308919), I assumed the deg was caused by RA.

; FIXME: A redundant COPY was generated during RA.
; i.e.   rlwinm r29, r30, 0, 24, 22
;        mr r30, r29

Checking the GPR that andi. uses is killed can skip folding the case and avoid the deg, however is it the cause of the deg? I would be thankful if you have more comments about this issue.

Esme updated this revision to Diff 313011.Dec 20 2020, 7:27 PM

Rebase on D89855.

nemanjai added inline comments.Dec 21 2020, 2:07 PM
llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
78 ↗(On Diff #308533)

Well, prior to this change, you had:

0x0: rlwinm r30, r30, 0, 24, 22
0x4: andi. r3, r30, 2
...
# some subsequent use of r30

So when you change it to an rlwinm. that produces what andi. produces, you no longer have the value that was previously produced by the existing rlwinm (on 0x0 above). Therefore, that instruction has to be kept so you get rlwinm r29, r30, 0, 24, 22.
But then you might wonder why doesn't it just end up being rlwinm r30, r30, 0, 24, 22? I suspect that is because the input register is still not marked as killed.