This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Custom lowering for funnel shifts
ClosedPublic

Authored by foad on Jul 16 2020, 6:36 AM.

Details

Reviewers
hfinkel
RKSimon
nemanjai
Group Reviewers
Restricted Project
Commits
rG28e322ea9393: [PowerPC] Custom lowering for funnel shifts
Summary

The custom lowering saves an instruction over the generic expansion, by
taking advantage of the fact that PowerPC shift instructions are well
defined in the shift-by-bitwidth case.

Diff Detail

Event Timeline

foad created this revision.Jul 16 2020, 6:36 AM
foad marked 3 inline comments as done.Jul 16 2020, 6:39 AM
foad added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
616–624

This is just guesswork. I'm really not sure which types we should do this for, under what conditions.

llvm/test/CodeGen/PowerPC/funnel-shift.ll
31–43

Can I pre-commit this new test case, and fshr_i64?

llvm/test/CodeGen/PowerPC/pr44183.ll
16–22

This regressions seems to be caused by not constant folding based on the fact that r3 is known to be 4. Can anyone suggest how to fix it? Do I have to spot known constant shift amounts in PPCTargetLowering::LowerFunnelShift?

lkail added a reviewer: Restricted Project.Jul 16 2020, 7:12 AM
spatel added a subscriber: spatel.Jul 16 2020, 7:16 AM
spatel added inline comments.
llvm/test/CodeGen/PowerPC/funnel-shift.ll
31–43

Yes, please pre-commit those, so we see the diffs.

foad updated this revision to Diff 278470.Jul 16 2020, 7:22 AM

Rebase after precommitting new test cases.

spatel added inline comments.Jul 16 2020, 7:36 AM
llvm/test/CodeGen/PowerPC/pr44183.ll
16–22

This is an unusual test because it contains "i216" types, but that's apparently the way the code was written in:
http://bugs.llvm.org/PR44183

Maybe because of that, the constants are marked opaque from DAG creation time:
Creating constant: t6: i216 = OpaqueConstant<4>
...so default constant folding is bypassed. Someone with more PPC knowledge probably needs to decide what -- if anything -- needs to be done here.

Have you triaged whats going on in the mulfixsat test regressions?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8998

Maybe also show the pseudocode as used by PPC for comparison?

foad marked 3 inline comments as done.Jul 20 2020, 3:40 AM
foad added inline comments.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8998

This is pseudocode for the PPC expansion. Did you mean, show the more complicated expansion that TargetLowering would have used?

llvm/test/CodeGen/PowerPC/umulfixsat.ll
12

I think the extra instruction here is just due to bad luck in the register allocation or scheduling.

29

It's hard to see because of the register allocation and scheduling differences but there is a slight regression here, from:

rotlwi  3, 3, 31
rlwimi 3, 5, 31, 0, 0

to:

srwi 5, 5, 1
slwi 4, 3, 31
or 4, 4, 5

I suppose I have regressed funnel shifts by a constant amount. I'll see if I can fix it.

RKSimon added inline comments.Jul 20 2020, 4:51 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
8998

Doh! Sorry about that, I misread it this morning - no need to add anything here.

foad updated this revision to Diff 279190.Jul 20 2020, 5:23 AM

Enable combining of PPC-specific shift opcodes.

foad added a subscriber: inouehrs.Jul 20 2020, 5:28 AM

Enable combining of PPC-specific shift opcodes.

My rationale is that in PPCTargetLowering::LowerFunnelShift it's correct to use PPCISD::SHL and PPCISD::SRL (instead of the ISD:: versions) precisely because they are defined in the shift-by-bitwidth case. But I think it's safe for BitPermutationSelector to treat them the same as the ISD:: versions. @hfinkel @inouehrs does this seem reasonable?

This seems reasonable to me @hfinkel @nemanjai any more comments?

nemanjai accepted this revision.Jul 27 2020, 7:16 AM

LGTM.

llvm/test/CodeGen/PowerPC/pr44183.ll
16–22

I agree with Sanjay, this is due to the weird size. However, it is not a concern. I'll post a very small tweak to our Reg+Reg -> Reg+Imm transformation that recovers this.

This revision is now accepted and ready to land.Jul 27 2020, 7:16 AM

The fix to recover the regression: https://reviews.llvm.org/D84659

This revision was automatically updated to reflect the committed changes.