Fold select_cc x, 0, const0, const1, lt into sub (z, and ((sra x, sizeof(x)-1), z - y)), see https://alive2.llvm.org/ce/z/jqcf8t. On PPC, it's cheap to performance such optimization with rlwinm.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/include/llvm/CodeGen/TargetLowering.h | ||
---|---|---|
2790 ↗ | (On Diff #311484) | We need to determine whether other targets would benefit from this - otherwise I'd suggest the initial implementation was purely inside PPCISelLowering |
@lkail Reverse ping - as I said earlier I'd probably suggest we make this ppc only initially
@RKSimon Sorry for late response. I have moved the transformation into PPCISelLowering.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
16731 | Use same variable names for these comments? c0/c1 or y/z? |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
16736 | I don't think it's always profitable. I've updated the comment to make it clearer. The transformation involves in arithmetic shift and SIGN_EXTEND looks not zero overhead under some circumstances. |
I'm not familiar with current PPC micro-arch details, so someone else should also look at this patch...
But this seems more restrictive than necessary. Don't we want to handle "x > -1" in addition to "x < 0"?
Note: I added regression tests with this comment to "llvm/test/CodeGen/PowerPC/select_const.ll" long ago:
; select Cond, C1, C2 --> add (mul (zext Cond), C1-C2), C2 --> add (and (sext Cond), C1-C2), C2
Also for the signbit compare pattern, we either have:
https://alive2.llvm.org/ce/z/jqcf8t (from the description)
or:
https://alive2.llvm.org/ce/z/YZyLpm
So we can always turn a 4-instruction isel sequence of arbitrary constants:
li 4, -500 cmpdi 3, 0 li 3, -42 isellt 3, 3, 4
into something like this:
sradi 3, 3, 63 andi. 3, 3, 458 addi 3, 3, -500
So it always saves an instruction to convert to ALU ops?
That should be "arbitrary small (signed 16-bit) constants". I agree that it doesn't look profitable if we need "lis" to materialize big constants. :)
Use same variable names for these comments? c0/c1 or y/z?