This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Expand `foldSelectICmpAndOr` -> `foldSelectICmpAndBinOp` to work for more binops
ClosedPublic

Authored by goldstein.w.n on Apr 14 2023, 11:12 PM.

Details

Diff Detail

Event Timeline

goldstein.w.n created this revision.Apr 14 2023, 11:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 14 2023, 11:12 PM
goldstein.w.n requested review of this revision.Apr 14 2023, 11:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 14 2023, 11:12 PM
nikic requested changes to this revision.Apr 23 2023, 1:15 AM

Why is this legal for all binops? Doesn't this transform require that the neutral element of the binop is zero? So it would work for or, xor or add, but not for mul or and for example?

I think this entire transform should probably be handled as a two step process: First, Cond ? X : BinOp(X, C) should become BinOp(X, Cond ? NeutralC : C) and then this fold should work on Cond ? 0 : C as the root. We actually already do the former canonicalization, but not in the case where C is constant. This will cleanly separate out the actual binop handling. Only disadvantage is that we won't be able to handle the case where the binop has multiple uses anymore, but that's a general issue with composable folds that we can ignore unless there is reason to believe that multi-use is motivating for the current handling.

This revision now requires changes to proceed.Apr 23 2023, 1:15 AM

Why is this legal for all binops? Doesn't this transform require that the neutral element of the binop is zero? So it would work for or, xor or add, but not for mul or and for example?

Yeah you are right. Not sure what I was thinking...

I think this entire transform should probably be handled as a two step process: First, Cond ? X : BinOp(X, C) should become BinOp(X, Cond ? NeutralC : C) and then this fold should work on Cond ? 0 : C as the root.
We actually already do the former canonicalization, but not in the case where C is constant.

Where do we do the canonicalization?

This will cleanly separate out the actual binop handling. Only disadvantage is that we won't be able to handle the case where the binop has multiple uses anymore, but that's a general issue with composable folds that we can ignore unless there is reason to believe that multi-use is motivating for the current handling.

Why won't we be able to handle multi-use anymore? If we do it like this, seems this function can be deleted once we get the canonicalization of Cond ? X : BinOp(X, C) -> BinOp(X, Cond ? NeutralC : C)?

I think this entire transform should probably be handled as a two step process: First, Cond ? X : BinOp(X, C) should become BinOp(X, Cond ? NeutralC : C) and then this fold should work on Cond ? 0 : C as the root.
We actually already do the former canonicalization, but not in the case where C is constant.

Where do we do the canonicalization?

In foldSelectIntoOp(), with the current constant restriction at https://github.com/llvm/llvm-project/blob/8d163e5045073a5ac570225cc8e14cc9f6d72f09/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L507.

I think this entire transform should probably be handled as a two step process: First, Cond ? X : BinOp(X, C) should become BinOp(X, Cond ? NeutralC : C) and then this fold should work on Cond ? 0 : C as the root.
We actually already do the former canonicalization, but not in the case where C is constant.

Where do we do the canonicalization?

In foldSelectIntoOp(), with the current constant restriction at https://github.com/llvm/llvm-project/blob/8d163e5045073a5ac570225cc8e14cc9f6d72f09/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L507.

Thanks. So how does the following sound:

  1. Patch foldSelectICmpAnd ->FoldSelectICmpPow2OrZero so we can handle:
define i8 @should_be_doable(i8 %x) {
  %xm1 = sub i8 %x, 1
  %xa = and i8 %xm1, %x
  %cmp = icmp eq i8 %xa, 16
  %s = select i1 %cmp, i8 64, i8 0
  ret i8 %s
}
  1. Update foldSelectIntoOp to do constants iff isPow2(abs(TC - FC)) and Cond is either (ICmp eq Pow2OrZero, C_Pow2) or a SignTest

At that point this function should be mostly redundant. There may be some edgecases with NeutralC is non-zero (i.e mul) which
we will still miss (currently as well) but I'm going to try and remove this function.

nikic added a comment.Apr 23 2023, 1:22 PM
  1. Update foldSelectIntoOp to do constants iff isPow2(abs(TC - FC)) and Cond is either (ICmp eq Pow2OrZero, C_Pow2) or a SignTest

Ideally we would just remove the limitation entirely -- at the IR level, we would certainly prefer having a select on constants. However, this would likely need a backend undo transform, because that saves materializing the zero constant and likely allows folding the other constant into an immediate operand.

  1. Update foldSelectIntoOp to do constants iff isPow2(abs(TC - FC)) and Cond is either (ICmp eq Pow2OrZero, C_Pow2) or a SignTest

Ideally we would just remove the limitation entirely -- at the IR level, we would certainly prefer having a select on constants. However, this would likely need a backend undo transform, because that saves materializing the zero constant and likely allows folding the other constant into an immediate operand.

Yeah that makes sense.

I'll get started on a series to do that.

  1. Update foldSelectIntoOp to do constants iff isPow2(abs(TC - FC)) and Cond is either (ICmp eq Pow2OrZero, C_Pow2) or a SignTest

Ideally we would just remove the limitation entirely -- at the IR level, we would certainly prefer having a select on constants. However, this would likely need a backend undo transform, because that saves materializing the zero constant and likely allows folding the other constant into an immediate operand.

@nikic so the backend has code to undo this already. Its just not enabled for scalars. I enabled it on a branch for x86 here:
https://github.com/goldsteinn/llvm-project/pull/new/enable-select-backend
but a lot of regressions. Maybe simpler to just handle the known cases here?

nikic added a comment.Apr 24 2023, 2:08 AM
  1. Update foldSelectIntoOp to do constants iff isPow2(abs(TC - FC)) and Cond is either (ICmp eq Pow2OrZero, C_Pow2) or a SignTest

Ideally we would just remove the limitation entirely -- at the IR level, we would certainly prefer having a select on constants. However, this would likely need a backend undo transform, because that saves materializing the zero constant and likely allows folding the other constant into an immediate operand.

@nikic so the backend has code to undo this already. Its just not enabled for scalars. I enabled it on a branch for x86 here:
https://github.com/goldsteinn/llvm-project/pull/new/enable-select-backend
but a lot of regressions. Maybe simpler to just handle the known cases here?

I think your patch handles a few more cases than we are interested in here. Per the code in getSelectFoldableOperands() we don't do this transform for div/rem, so I don't think we need the additional handling for those. More importantly, it looks like your patch will also do the undo transform for the case where the select has the identity as one element and a non-constant as the other. I think the diffs will look better if you limited the scalar case to two constant operands. The constant + non-constant case can probably also be beneficial, but the heuristics for that are less obvious.

  1. Update foldSelectIntoOp to do constants iff isPow2(abs(TC - FC)) and Cond is either (ICmp eq Pow2OrZero, C_Pow2) or a SignTest

Ideally we would just remove the limitation entirely -- at the IR level, we would certainly prefer having a select on constants. However, this would likely need a backend undo transform, because that saves materializing the zero constant and likely allows folding the other constant into an immediate operand.

@nikic so the backend has code to undo this already. Its just not enabled for scalars. I enabled it on a branch for x86 here:
https://github.com/goldsteinn/llvm-project/pull/new/enable-select-backend
but a lot of regressions. Maybe simpler to just handle the known cases here?

I think your patch handles a few more cases than we are interested in here. Per the code in getSelectFoldableOperands() we don't do this transform for div/rem, so I don't think we need the additional handling for those. More importantly, it looks like your patch will also do the undo transform for the case where the select has the identity as one element and a non-constant as the other. I think the diffs will look better if you limited the scalar case to two constant operands. The constant + non-constant case can probably also be beneficial, but the heuristics for that are less obvious.

Posted WIP, then saw this. Updating so we can specify if both arms constant only.

Fix to only use some binops

goldstein.w.n edited the summary of this revision. (Show Details)Jun 27 2023, 2:48 PM

Why is this legal for all binops? Doesn't this transform require that the neutral element of the binop is zero? So it would work for or, xor or add, but not for mul or and for example?

I think this entire transform should probably be handled as a two step process: First, Cond ? X : BinOp(X, C) should become BinOp(X, Cond ? NeutralC : C) and then this fold should work on Cond ? 0 : C as the root. We actually already do the former canonicalization, but not in the case where C is constant. This will cleanly separate out the actual binop handling. Only disadvantage is that we won't be able to handle the case where the binop has multiple uses anymore, but that's a general issue with composable folds that we can ignore unless there is reason to believe that multi-use is motivating for the current handling.

Fixed, have proofs for all the binops I added.

nikic added inline comments.Jun 29 2023, 7:38 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
722

Possibly make this a check that getBinOpIdentity() with AllowRHSConstant=true is zero instead?

Or at least comment on what the criterion for valid binops is...

goldstein.w.n marked an inline comment as done.Jul 9 2023, 7:09 PM
goldstein.w.n edited the summary of this revision. (Show Details)

Use getIdentityBinOp instead of just having a table

goldstein.w.n retitled this revision from [InstCombine] Expand `foldSelectICmpAndOr` -> `foldSelectICmpAndBinOp` to work for any binop to [InstCombine] Expand `foldSelectICmpAndOr` -> `foldSelectICmpAndBinOp` to work for more binops.Jul 9 2023, 7:13 PM
goldstein.w.n edited the summary of this revision. (Show Details)
nikic accepted this revision.Jul 10 2023, 3:34 AM

LGTM

This revision is now accepted and ready to land.Jul 10 2023, 3:34 AM
This revision was landed with ongoing or failed builds.Aug 16 2023, 8:43 PM
This revision was automatically updated to reflect the committed changes.

Since this change, our 2 stage 32 bit Armv7 builder has been failing to build the second stage: https://lab.llvm.org/buildbot/#/builders/182/builds/7193/steps/9/logs/stdio

FAILED: lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o 
/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_FILE_OFFSET_BITS=64 -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_LIBCPP_ENABLE_HARDENED_MODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage2/lib/Target/X86 -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/llvm/llvm/lib/Target/X86 -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/llvm/llvm/include -mcpu=cortex-a15 -mfpu=vfpv3 -marm -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -std=c++17 -MD -MT lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o -MF lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o.d -o lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o -c /home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_FILE_OFFSET_BITS=64 -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_LIBCPP_ENABLE_HARDENED_MODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage2/lib/Target/X86 -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/llvm/llvm/lib/Target/X86 -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/llvm/llvm/include -mcpu=cortex-a15 -mfpu=vfpv3 -marm -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -fvisibility=hidden -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -std=c++17 -MD -MT lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o -MF lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o.d -o lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86ISelLowering.cpp.o -c /home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/llvm/llvm/lib/Target/X86/X86ISelLowering.cpp
1.	<eof> parser at end of file
2.	Optimizer
 #0 0x03ae2214 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang+++0x3246214)
 #1 0x03adfb88 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang+++0x3243b88)
 #2 0x03ae1248 llvm::sys::CleanupOnSignal(unsigned int) (/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang+++0x3245248)
 #3 0x03a4998c CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #4 0xf790e530 __default_sa_restorer /build/glibc-9MGTF6/glibc-2.31/signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:67:0
 #5 0x033a2ad4 llvm::Constant::isNullValue() const (/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang+++0x2b06ad4)
 #6 0x0376b91c foldSelectICmpAndBinOp(llvm::ICmpInst const*, llvm::Value*, llvm::Value*, llvm::IRBuilder<llvm::TargetFolder, llvm::IRBuilderCallbackInserter>&) InstCombineSelect.cpp:0:0
 #7 0x0376a104 llvm::InstCombinerImpl::foldSelectInstWithICmp(llvm::SelectInst&, llvm::ICmpInst*) (/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang+++0x2ece104)
 #8 0x03770248 llvm::InstCombinerImpl::visitSelectInst(llvm::SelectInst&) (/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang+++0x2ed4248)
 #9 0x036a5f74 llvm::InstCombinerImpl::run() (/home/tcwg-buildbot/worker/clang-armv7-vfpv3-2stage/stage1.install/bin/clang+++0x2e09f74)
<...>

You have many patches in that build and I'm picking out this one just due to foldSelectICmpAndBinOp.

It's not failing on the other 32 bit Arm builders because they are a few hours behind.

I'm not sure if I can cleanly revert this patch but before I look at that I'll get a reproducer for you.

I've confirmed just this commit is enough to hit the issue, and here's the reproducer:

bjope added a subscriber: bjope.Aug 17 2023, 11:03 AM

I've confirmed just this commit is enough to hit the issue, and here's the reproducer:

Thank you for creating the repro and reverting this.
Issue was it was missing a nullptr check when checking the identity constant.

@nikic I've retested this. Going to repush in a few days unless anyone objects.

goldstein.w.n reopened this revision.Aug 24 2023, 11:06 AM
This revision is now accepted and ready to land.Aug 24 2023, 11:06 AM

Add nullptr check when checking identity constant

Can you please also add a test for that case (unless you already committed it)?

Add tests for reproducing prior bug

nikic accepted this revision.Aug 24 2023, 12:14 PM

Thanks, LGTM.

This revision was landed with ongoing or failed builds.Aug 24 2023, 5:43 PM
This revision was automatically updated to reflect the committed changes.

Seems this still miscompiles llvm-rc/ResourceFileWriter.cpp for targeting x86-64. Investigating.

FYI, similar miscompilation in llvm-rc; https://lab.llvm.org/buildbot/#/builders/124/builds/8260

Investigating, if don't see obvious fix shortly will revert.

Seems this still miscompiles llvm-rc/ResourceFileWriter.cpp for targeting x86-64. Investigating.

Are you certain its this? I just checked and the IR for llvm-rc/ResourceFileWriter.cpp is unchanged with/without this commit.

Re-reverted to be safe. Will look into this more tomorrow

I am using ubuntu-20.04 (amd64) and stage2-clang uses libstdc++ for bootstrapping.

Ubuntu (aarch64) didn't complain. (I don't know why)

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
757

Would it be right if BinOp is not commutative?
I saw a malformed lshr in the miscompilation.

%spec.select = lshr i64 %sub78, %foo was transformed to %spec.select = lshr i64 %bar, %sub78.

llvm/test/Transforms/InstCombine/select-with-bitwise-ops.ll
1616

XOR (and %xor) is odd here, even if this checks that instructions are not transformed. (ditto in the next test)

goldstein.w.n added inline comments.Aug 25 2023, 10:26 AM
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
757

Yup, this is it. Didn't see that original code had inverted the operands.

goldstein.w.n reopened this revision.Aug 25 2023, 10:40 AM
This revision is now accepted and ready to land.Aug 25 2023, 10:40 AM

Seems overkill for adding tests for each binop, but ran them offline and all transform verify with alive2:

; $> /home/noah/programs/opensource/llvm-dev/src/alive2/build/alive-tv (-smt-to=200000000)

----------------------------------------
define i8 @select_icmp_eq_and_1_0_add_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %badd = add i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %badd
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_add_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = add i8 %1, %y
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_add_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %badd = add i8 %y, 2
  %select = select i1 %cmp, i8 %badd, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_add_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = add i8 %1, %y
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_or_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bor = or i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bor
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_or_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = or i8 %1, %y
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_or_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %bor = or i8 %y, 2
  %select = select i1 %cmp, i8 %bor, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_or_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = or i8 %1, %y
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_xor_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bxor = xor i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bxor
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_xor_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = xor i8 %1, %y
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_xor_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %bxor = xor i8 %y, 2
  %select = select i1 %cmp, i8 %bxor, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_xor_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = xor i8 %1, %y
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_mul_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bmul = mul i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bmul
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_mul_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %select = shl i8 %y, %and
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_mul_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %bmul = mul i8 %y, 2
  %select = select i1 %cmp, i8 %bmul, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_mul_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %select = shl i8 %y, %and
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_and_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %band = and i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %band
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_and_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %band = and i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %band
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_and_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %band = and i8 %y, 2
  %select = select i1 %cmp, i8 %band, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_and_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %cmp.not = icmp eq i8 %and, 0
  %band = and i8 %y, 2
  %select = select i1 %cmp.not, i8 %y, i8 %band
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_sub_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bsub = sub i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bsub
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_sub_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bsub = add i8 %y, 254
  %select = select i1 %cmp, i8 %y, i8 %bsub
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_sub_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %bsub = sub i8 %y, 2
  %select = select i1 %cmp, i8 %bsub, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_sub_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %cmp.not = icmp eq i8 %and, 0
  %bsub = add i8 %y, 254
  %select = select i1 %cmp.not, i8 %y, i8 %bsub
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_shl_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bshl = shl i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bshl
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_shl_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = shl i8 %y, %1
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_shl_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %bshl = shl i8 %y, 2
  %select = select i1 %cmp, i8 %bshl, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_shl_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = shl i8 %y, %1
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_lshr_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %blshr = lshr i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %blshr
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_lshr_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = lshr i8 %y, %1
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_lshr_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %blshr = lshr i8 %y, 2
  %select = select i1 %cmp, i8 %blshr, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_lshr_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = lshr i8 %y, %1
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_ashr_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bashr = ashr i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bashr
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_ashr_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = ashr i8 %y, %1
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_ashr_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %bashr = ashr i8 %y, 2
  %select = select i1 %cmp, i8 %bashr, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_ashr_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = shl i8 %x, 1
  %1 = and i8 %and, 2
  %select = ashr i8 %y, %1
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_sdiv_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bsdiv = sdiv i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bsdiv
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_sdiv_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %bsdiv = sdiv i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %bsdiv
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_sdiv_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %bsdiv = sdiv i8 %y, 2
  %select = select i1 %cmp, i8 %bsdiv, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_sdiv_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %cmp.not = icmp eq i8 %and, 0
  %bsdiv = sdiv i8 %y, 2
  %select = select i1 %cmp.not, i8 %y, i8 %bsdiv
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_udiv_fv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp eq i8 %and, 0
  %budiv = udiv i8 %y, 2
  %select = select i1 %cmp, i8 %y, i8 %budiv
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_udiv_fv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %select = lshr i8 %y, %and
  ret i8 %select
}
Transformation seems to be correct!


----------------------------------------
define i8 @select_icmp_eq_and_1_0_udiv_tv(i8 %x, i8 %y) {
%0:
  %and = and i8 %x, 1
  %cmp = icmp ne i8 %and, 0
  %budiv = udiv i8 %y, 2
  %select = select i1 %cmp, i8 %budiv, i8 %y
  ret i8 %select
}
=>
define i8 @select_icmp_eq_and_1_0_udiv_tv(i8 %x, i8 %y) nofree willreturn memory(none) {
%0:
  %and = and i8 %x, 1
  %select = lshr i8 %y, %and
  ret i8 %select
}
Transformation seems to be correct!

Summary:
  22 correct transformations
  0 incorrect transformations
  0 failed-to-prove transformations
  0 Alive2 errors

Don't commute V and Y when creating final binop

Thanks for the fix.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
655–660

Could you update expressions in the comment please?

Update comment

goldstein.w.n marked an inline comment as done.Aug 25 2023, 3:24 PM

Okay for push this for attempt #3?

nikic accepted this revision.Aug 29 2023, 11:22 AM

LGTM

This revision was landed with ongoing or failed builds.Sep 1 2023, 3:16 PM
This revision was automatically updated to reflect the committed changes.