The custom lowering saves an instruction over the generic expansion, by
taking advantage of the fact that PowerPC shift instructions are well
defined in the shift-by-bitwidth case.
Details
- Reviewers
hfinkel RKSimon nemanjai - Group Reviewers
Restricted Project - Commits
- rG28e322ea9393: [PowerPC] Custom lowering for funnel shifts
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
620–628 | This is just guesswork. I'm really not sure which types we should do this for, under what conditions. | |
llvm/test/CodeGen/PowerPC/funnel-shift.ll | ||
31–43 | Can I pre-commit this new test case, and fshr_i64? | |
llvm/test/CodeGen/PowerPC/pr44183.ll | ||
16–22 | This regressions seems to be caused by not constant folding based on the fact that r3 is known to be 4. Can anyone suggest how to fix it? Do I have to spot known constant shift amounts in PPCTargetLowering::LowerFunnelShift? |
llvm/test/CodeGen/PowerPC/funnel-shift.ll | ||
---|---|---|
31–43 | Yes, please pre-commit those, so we see the diffs. |
llvm/test/CodeGen/PowerPC/pr44183.ll | ||
---|---|---|
16–22 | This is an unusual test because it contains "i216" types, but that's apparently the way the code was written in: Maybe because of that, the constants are marked opaque from DAG creation time: |
Have you triaged whats going on in the mulfixsat test regressions?
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
8652 | Maybe also show the pseudocode as used by PPC for comparison? |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
8652 | This is pseudocode for the PPC expansion. Did you mean, show the more complicated expansion that TargetLowering would have used? | |
llvm/test/CodeGen/PowerPC/umulfixsat.ll | ||
12 ↗ | (On Diff #278470) | I think the extra instruction here is just due to bad luck in the register allocation or scheduling. |
29 ↗ | (On Diff #278470) | It's hard to see because of the register allocation and scheduling differences but there is a slight regression here, from: rotlwi 3, 3, 31 rlwimi 3, 5, 31, 0, 0 to: srwi 5, 5, 1 slwi 4, 3, 31 or 4, 4, 5 I suppose I have regressed funnel shifts by a constant amount. I'll see if I can fix it. |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
8652 | Doh! Sorry about that, I misread it this morning - no need to add anything here. |
My rationale is that in PPCTargetLowering::LowerFunnelShift it's correct to use PPCISD::SHL and PPCISD::SRL (instead of the ISD:: versions) precisely because they are defined in the shift-by-bitwidth case. But I think it's safe for BitPermutationSelector to treat them the same as the ISD:: versions. @hfinkel @inouehrs does this seem reasonable?
LGTM.
llvm/test/CodeGen/PowerPC/pr44183.ll | ||
---|---|---|
16–22 | I agree with Sanjay, this is due to the weird size. However, it is not a concern. I'll post a very small tweak to our Reg+Reg -> Reg+Imm transformation that recovers this. |
This is just guesswork. I'm really not sure which types we should do this for, under what conditions.