This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/Transforms/InstCombine/
-
Transforms/
-
InstCombine/
-
InstCombineSelect.cpp
-
test/Transforms/InstCombine/
-
Transforms/
-
InstCombine/
1/3
select-of-bittest.ll

Differential D45862

[InstCombine] Remove decanonicalizing transforms of selects
AbandonedPublic

Authored by mkazantsev on Apr 19 2018, 10:15 PM.

Download Raw Diff

Details

Reviewers

craig.topper
lebedev.ri
spatel
reames
majnemer
efriedma
hfinkel
sanjoy
bkramer
javed.absar

Summary

Currently InstCombine tends to recognize patterns like

%cmp = icmp slt iA %x, 0
%select = select i1 %cmp, iB %A, iB %B

as bit-test against highest bit, and convert it into one of multiple shift-based patterns
which may look differently depending on types iA and iB and values of %A and %B.
It only happens if %A and %B are constants. We observe at least three possible sutiatons:

Case 1:

%c2 = icmp slt %a, 0
%select1 = select %c2, -1, %Greater

converts into

(%a s>> 31) | %Greater

Case 2:

%c2 = icmp slt i32 %a, 0
%select1 = select i1 %c2, i32 %Less, i32 %Greater

converts into

(%a s>> 31) - (%Less - %Greater) + %Greater

or, if it is the equivalent, to

(%a s>> 31) - (%Less - %Greater) | %Greater

Case 3:

%c2 = icmp slt i64 %a, 0
%select1 = select i1 %c2, i32 %Less, i32 %Greater

converts to

(trunc (%a s>> 63) to i32) - (%Less - %Greater) + %Greater

There are possibly more variatons of this. The general idea is that select by comparison with 0 or -1
gets transformed into multiple different patterns, and very small changes in conditions and constants
may change this pattern. There is no clear benefits of that transforms, but the obvious downside of them
is decanonicalization. For example, we have a matcher of three-ways comparison that expects to see the
pattern of the following form:

%c1 = icmp eq %a, 0
%c2 = icmp slt %a, 0
%select1 = select %c2, %Less, %Greater
%select2 = select %c1, %Equal, %select2

This pattern is widely seen across the code: it is a canonical representation of three-ways comparators
that return -1, 0 or 1 if %a is less, equal or greater than zero. In the result of transforms described in
cases 1, 2, 3, the pair of %c2 and %select1 gets transformed into something different (with shifts, casts
and different bit operations). So the three-way comparison pattern cannot be recognized anymore.

This patch removes the transforms that make us replace select by comparison against zero with all this bit
magic. We want to prefer selects over random bit magic as a more canonical form of doing that: select
has more transparent semantics and other parts of InstCombine make use it rather than try to recognize
that some sequence of non-trivial bit instructions actually represents a select by comparison against zero.

It is codegen's work to turn selects into bit operations in the end of pipeline to calculate it efficiently if
shifts are faster than cmoves, but doing so too early has no benifits and has obvious pessimization of other
pattern matchers within instcombine.

In this patch we only stop harmful transforms from happening. The code that turns all this bit magic back
to selects will go in a separate patch.

Diff Detail

Event Timeline

mkazantsev created this revision.Apr 19 2018, 10:15 PM

mkazantsev edited the summary of this revision. (Show Details)

mkazantsev added a child revision: D45863: [InstCombine] Support BitTests in ThreeWayComparison. Mismatched types.Apr 19 2018, 10:22 PM

lebedev.ri added inline comments.Apr 19 2018, 11:19 PM

test/Transforms/InstCombine/select-of-bittest.ll
662	Please use `utils/update_test_checks.py`.

Formatted tests.

lebedev.ri added inline comments.Apr 20 2018, 2:58 AM

test/Transforms/InstCombine/select-of-bittest.ll

660

For each of those, could you please add vector tests.
Something like:

define i32 @compare_to_zero_mismatched_types_idiomatic_trunc(i64 %x) {
  %cmp = icmp slt i64 %x, 0
  %select1 = select i1 %cmp, i32 -1, i32 1
  ret i32 %select1
}

define <2 x i32> @compare_to_zero_mismatched_types_idiomatic_trunc_splatvec(<2 x i64> %x) {
  %cmp = icmp slt <2 x i64> %x, <i64 0, i64 0>
  %select1 = select <2 x i1> %cmp, <i32 -1, i32 -1>, <i32 1, i32 1>
  ret <2 x i32> %select1
}

define <3 x i32> @compare_to_zero_mismatched_types_idiomatic_trunc_splatvec_undef(<3 x i64> %x) {
  %cmp = icmp slt <3 x i64> %x, <i64 0, i64 undef, i64 0>
  %select1 = select <3 x i1> %cmp, <i32 -1, i64 undef, i32 -1>, <i32 1, i64 undef, i32 1>
  ret <3 x i32> %select1
}

; May be more than one
define <3 x i32> @compare_to_zero_mismatched_types_idiomatic_trunc_nonsplatvec(<3 x i64> %x) {
  %cmp = icmp slt <3 x i64> %x, <i64 0, i64 undef, i64 0>
  %select1 = select <3 x i1> %cmp, <i32 -1, i64 0, i32 1>, <i32 1, i64 0, i32 -1>
  ret <3 x i32> %select1
}

I don't think we should be doing any of these select-of-constant transforms in instcombine.

It's worse for code analysis, more IR instructions, and may be detrimental to perf. Think about the cases where a conditional move executes at the same speed as a simple add (Ryzen?) or we have profile data for the compare, so branch prediction is perfect.

There's lots of code that does this kind of thing in the DAG, and that's where I think it belongs (using target hooks as needed). There was some discussion about this on llvm-dev here:
https://groups.google.com/forum/#!topic/llvm-dev/pid_thv2X-A

So I think we should be removing some of these transforms from instcombine rather than adding to them.

Other than that inline comment, i think it would be nice to commit
the baseline tests (as of trunk), so the effect of this proposal could be observed.
Even if the code changes won't land, this would at least document the current behavior.

test/Transforms/InstCombine/select-of-bittest.ll

660

Oh, also, this is a really bad place to add these tests.
They should either go to a new file, or at least before

; ============================================================================ ;
; Negative tests. Should not be folded.
; ============================================================================ ;

In D45862#1073841, @spatel wrote:

I don't think we should be doing any of these select-of-constant transforms in instcombine.

It's worse for code analysis, more IR instructions, and may be detrimental to perf. Think about the cases where a conditional move executes at the same speed as a simple add (Ryzen?) or we have profile data for the compare, so branch prediction is perfect.

There's lots of code that does this kind of thing in the DAG, and that's where I think it belongs (using target hooks as needed). There was some discussion about this on llvm-dev here:
https://groups.google.com/forum/#!topic/llvm-dev/pid_thv2X-A

So I think we should be removing some of these transforms from instcombine rather than adding to them.

There is a second side to this though.

Even if all such "performance-degrading" transforms are removed from instcombine
(yes, i think the pass is rather huge and monolithic, this is a problem), it won't solve the problem.
The same 'bad' patterns surely could be produced via some other way.

I don't see how this is any different than lowering intrinsics.
Of course, they are no longer intrinsics, but IR, so optimizations may
break their canonical form, and backend will need to be adjusted
to recognize the IR patters (and potentially recognize the patterns
that did not originate form intrinsics, which is great).

So i'd say this really is backend's (DAGCombine?) problem.
This is just a question of whether or not there is interest to have a high quality codegen,
regardless of the input. It should recognize the 'bad' patterns, and if profitable,
transform to improve 'performance'. In this case, back to select.

And yes, absolutely, this is more complicated than just "stop dealing with it in instcombine" :)

In D45862#1074397, @lebedev.ri wrote:

In D45862#1073841, @spatel wrote:

I don't think we should be doing any of these select-of-constant transforms in instcombine.

It's worse for code analysis, more IR instructions, and may be detrimental to perf. Think about the cases where a conditional move executes at the same speed as a simple add (Ryzen?) or we have profile data for the compare, so branch prediction is perfect.

There's lots of code that does this kind of thing in the DAG, and that's where I think it belongs (using target hooks as needed). There was some discussion about this on llvm-dev here:
https://groups.google.com/forum/#!topic/llvm-dev/pid_thv2X-A

So I think we should be removing some of these transforms from instcombine rather than adding to them.

There is a second side to this though.

Even if all such "performance-degrading" transforms are removed from instcombine
(yes, i think the pass is rather huge and monolithic, this is a problem), it won't solve the problem.
The same 'bad' patterns surely could be produced via some other way.

I don't see how this is any different than lowering intrinsics.
Of course, they are no longer intrinsics, but IR, so optimizations may
break their canonical form, and backend will need to be adjusted
to recognize the IR patters (and potentially recognize the patterns
that did not originate form intrinsics, which is great).

So i'd say this really is backend's (DAGCombine?) problem.
This is just a question of whether or not there is interest to have a high quality codegen,
regardless of the input. It should recognize the 'bad' patterns, and if profitable,
transform to improve 'performance'. In this case, back to select.

And yes, absolutely, this is more complicated than just "stop dealing with it in instcombine" :)

I agree. Let me clarify the suggestion: don't just remove these control-flow to data-flow transforms from instcombine. Canonicalize to the cmp+select form. And then let the DAG convert it back if it's profitable. I did some fraction (what I thought were the most common parts) of this job already following the cited llvm-dev discussion. Here's the direct link to that reply:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.html

Hi @spatel @lebedev.ri,

Actually I am absolutely agreed that these transforms produce really bad code which looks like a canonicalization breach. Let me give you a bit more context about my motivation for this change. I basically need that the following pattern was recognized correctly as three-way comparison:

%cmp1 = icmp eq %x, 0
%cmp2 = icmp slt %x, 0
%s1 = select i1 %cmp2, -1, 1
%s2 = select i1 %cmp1, 0, %s1

It is a typical pattern that we see in all comparators. We have a function matchThreeWayIntCompare that used to detect such a pattern. However at some point InstCombine started recognizing the pattern of %cmp2, %s1 as a bit test of the highest bit, removed it and inserted the shift-based bit test instead. Even worse, it does insert *different* patterns for case when both %x and %s1 are i32 or when they are of different types.

The problem is wide-spread across InstCombine: we seem to have *many* transforms of this kind spread across the code. And which of them will be applied is always a surprise: some of them are implemented for general case and some are limited. As result, we might end up with a dozen of different bit-test patterns that all represent a three-ways comparison.

This patch was an attempt to canonicalize all these transforms, so that at least for the same pattern they were done in the same manner regardless of types. Here I was taking an assumption like "okay, we are doing all this bit-test stuff anyways, let's at least do it uniformly". Because now what we have is a counter-canonicalization of this pattern.

I agree that in ideal world we should just remove all this job connected to adding bit-tests instead of this select, or even do it in opposite way (so that bit-tests are converted to select). I see two reasons why it is impossible.

We have no idea how it impacts performance in the particular cases for which they were added.
We have little understanding of how other transforms and optimizations will react on that, and will we break something or not. I have a crawling suspicion that all this stuff was done for vectorization, and that the vectorizer might expect this bit stuff rather than selects. I need someone familiar with vectorizer's code to confirm or refute that.

I will be totally happy if we decide to not transform selects into bittests. But I am hesitant to make a change this big in the part of code I am unfamiliar with, as well as I am unfamiliar with the reasons why it was done at all and whether it helps the vectorization. So I need the confirmation from vectorizer people that selects there are OK.

For now, my point is the following: yes, we are *already* doing all this bit magic. Yes, it looks like miscanonicalization, and we can discuss getting rid of all such transforms at all. But since we are already doing it and haven't yet decided to clear it off, let's at least do it well. Once we decide that all this stuff is counter-canonical, we can happily remove all this. What do you think of that?

Thanks,
Max

P.S. I've also added Benjamin Kramer who was the original author of this transform to take a look.

You can also have more context of why it was done from the commit messages of: D45854 D45855 D45856 D45863.

In D45862#1074819, @mkazantsev wrote:

For now, my point is the following: yes, we are *already* doing all this bit magic. Yes, it looks like miscanonicalization, and we can discuss getting rid of all such transforms at all. But since we are already doing it and haven't yet decided to clear it off, let's at least do it well. Once we decide that all this stuff is counter-canonical, we can happily remove all this. What do you think of that?

I understand this as a short-term goal and local improvement, but I think if we don't reverse the direction now, we'll never correct it. For example, it took over 5 years to correct:
rL159230
with
rL319964

See also the mentioned commits where we may extend the DAG transforms and possibly solve the patterns that you are looking at:
rL296977
rL311731

You mention the vectorizers as a possible source for wanting to do the bit magic, but I think that most targets would actually do better vectorizing a select at this point because vsel/bsl/blend type of instructions are usually part of a vector ISA. This came up in:
https://bugs.llvm.org/show_bug.cgi?id=6773#c9 (and I'm not locating the review thread, but Roman has proposed tests for the vector cases)

Can you provide an IR -> asm example for one/some of the problem cases that led you to this (or preferably, file bug reports for them)? If we can fix the cases that we know would regress with a reversal of this IR canonicalization, then we should push ahead with that effort and fix instcombine. We can't guarantee that there won't be regressions, but we should be able to fix problems as they are reported.

Also see D24480 and the list of proposals and commits mentioned there.

In D45862#1075265, @spatel wrote:

Can you provide an IR -> asm example for one/some of the problem cases that led you to this (or preferably, file bug reports for them)? If we can fix the cases that we know would regress with a reversal of this IR canonicalization, then we should push ahead with that effort and fix instcombine. We can't guarantee that there won't be regressions, but we should be able to fix problems as they are reported.

Sure, please take a look at https://bugs.llvm.org/show_bug.cgi?id=37147
You can also use tests from D45854 D45855 D45856 D45863 for reference: this is the IR we expect InstCombine to produce.

Just for note: I am not aware of any asm advantages or disadvantages of what we have now in practice. I am only speculating about it in assumption that this bit-test magic was done with some reason. I don't have any real evidence that it is somehow good or bad in asm, but what I know is that does pessimize three-way comparison recognition.

mkazantsev mentioned this in D45854: [InstCombine] Support BitTests in ThreeWayComparison. Trivial case.Apr 23 2018, 10:27 PM

mkazantsev mentioned this in D45855: [InstCombine] Support BitTests in ThreeWayComparison. General case, part 1.

mkazantsev mentioned this in D45856: [InstCombine] Support BitTests in ThreeWayComparison. General case, part 2.

mkazantsev mentioned this in D45863: [InstCombine] Support BitTests in ThreeWayComparison. Mismatched types.

spatel mentioned this in rL330731: [InstCombine] regenerate checks; NFC.Apr 24 2018, 9:11 AM

In D45862#1076327, @mkazantsev wrote:

Just for note: I am not aware of any asm advantages or disadvantages of what we have now in practice. I am only speculating about it in assumption that this bit-test magic was done with some reason. I don't have any real evidence that it is somehow good or bad in asm, but what I know is that does pessimize three-way comparison recognition.

I forgot about this, but I think I asked about the same problem as PR37147 here:
http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html

...and the conclusion was that we want selects (or at least nobody strongly opposed that). And so I went to the DAG and made the necessary changes to allow producing good asm for all targets for those patterns (ie, there's a hook to convert those back to bit math). So there should be little risk in removing/inverting the instcombines at this point.

I'm looking closer now at how to replace some of the instcombines now without regressing anything. I expect this is going to take multiple steps to unravel. For example, we even have a 'm_Signum' matcher that's looking for shifts.

spatel mentioned this in rL330768: [InstCombine] move tests for select with bit-test of condition; NFC.Apr 24 2018, 2:09 PM

Let's try an alternative approach. In this patch we remove harmful transforms that replace selects with random pieces of bit magic.

Here we only stop harmful transforms. The patch that will canonicalize all this bit magic to selects will go separately.

Herald added a reviewer: javed.absar. · View Herald TranscriptApr 24 2018, 9:04 PM

Herald added a subscriber: kristof.beyls. · View Herald Transcript

In D45862#1077652, @mkazantsev wrote:

Let's try an alternative approach. In this patch we remove harmful transforms that replace selects with random pieces of bit magic.

Here we only stop harmful transforms. The patch that will canonicalize all this bit magic to selects will go separately.

I agree with this direction (and I have a patch that would do something similar), but this is going too far. We should remove the transforms that are harmful, but without creating known regressions. This requires writing new, more specific folds as we remove the over-reaching logic.

As you can see from test15g and test15h, there are good folds buried in here. Those correspond to something like this:

Name: set_bit_if_masked_val_is_clear
Pre: isPowerOf2(C1) && ((C2 ^ C3) == C1) && (C2 u> C3)
 %t1 = and i8 %x, C1
 %t2 = icmp eq i8 %t1, 0
 %r = select i1 %t2, i8 C2, i8 C3
=>
 %r = xor i8 %t1, C2

Name: set_bit_if_masked_val_is_set
Pre: isPowerOf2(C1) && ((C2 ^ C3) == C1) && (C2 u> C3)
 %t1 = and i8 %x, C1
 %t2 = icmp ne i8 %t1, 0
 %r = select i1 %t2, i8 C2, i8 C3
=>
 %r = or i8 %t1, C3

Name: clear_bit_if_masked_val_is_clear
Pre: isPowerOf2(C1) && ((C2 ^ C3) == C1) && (C3 u> C2)
 %t1 = and i8 %x, C1
 %t2 = icmp eq i8 %t1, 0
 %r = select i1 %t2, i8 C2, i8 C3
=>
 %r = or i8 %t1, C2
 
Name: clear_bit_if_masked_val_is_set
Pre: isPowerOf2(C1) && ((C2 ^ C3) == C1) && (C3 u> C2)
 %t1 = and i8 %x, C1
 %t2 = icmp ne i8 %t1, 0
 %r = select i1 %t2, i8 C2, i8 C3
=>
 %r = xor i8 %t1, C3

https://rise4fun.com/Alive/XG7

To verify that we're doing this properly, we need to add more tests for patterns like this. The current set isn't providing very good coverage.

spatel mentioned this in rL330837: [InstCombine] add tests for select to logic folds; NFC.Apr 25 2018, 9:03 AM

spatel mentioned this in rL330841: [InstCombine] clean up foldSelectICmpAnd(); NFC.Apr 25 2018, 9:39 AM

spatel mentioned this in D46086: [InstCombine] refine select-of-constants to bitwise ops.Apr 25 2018, 2:34 PM

I am confused. Didn't we agree on that replacement of selects with bit-wise logic, if it's profitable, should be a part of backend's work, and here we want selects because they have a more clear semantics that can be used by other transforms, rather than teaching every single transform to recognize the bit magic as select? You are giving some examples where codegen definitely could do what you are showing.

In D45862#1080423, @mkazantsev wrote:

I am confused. Didn't we agree on that replacement of selects with bit-wise logic, if it's profitable, should be a part of backend's work, and here we want selects because they have a more clear semantics that can be used by other transforms, rather than teaching every single transform to recognize the bit magic as select? You are giving some examples where codegen definitely could do what you are showing.

I think it's clearly better for IR to eliminate the select if we can reduce the instruction count too. It can lead to more reduction here in instcombine as seen in test15g. So that's why I'm proposing to add optimizations for these patterns: https://rise4fun.com/Alive/XG7 in D46086 (note that test15g is not regressed there).
Even if you disagree with having those transforms, I don't think we should remove it as the first step. We're already going to be at risk for revert due to perf regressions with D46086 even though those are all IR improvements based on our reasoning so far. I'm purposely trying to minimize the diffs in that step, so we have a better chance of pushing this through.

In D45862#1081291, @spatel wrote:

I think it's clearly better for IR to eliminate the select if we can reduce the instruction count too.

Not if local improvement on one step will break another step which will break us even more improvement. You save one instruction by doing bit magic and kill the further transform that would eliminate 4 instructions. My point is that the canonicalization is generally a more useful thing in a long run than saving of one or two instructions now and pessimizing another more powerful transform later. Current strategy of saving one instruction by adding this bit magic produces horrible code on the tests I've added.

As for test15g, I will give some more thought to what can be done to it.

In D45862#1081291, @spatel wrote:

So that's why I'm proposing to add optimizations for these patterns: https://rise4fun.com/Alive/XG7 in D46086 (note that test15g is not regressed there).

Why in InstCombine? These transforms are very simple, DAG selector should be able to do what you want, but later. The only possible profit I see from having this done in InstCombine is that it may affect the behavior of inlining/unrolling that calculate cost of the code, but it's only matter of tuning the cost functions if the need shows up.

In D45862#1081907, @mkazantsev wrote:

Current strategy of saving one instruction by adding this bit magic produces horrible code on the tests I've added.

Please commit these tests, so we can see the current output and compare any differences between this patch and the other patch.

In D45862#1081919, @spatel wrote:

In D45862#1081907, @mkazantsev wrote:

Current strategy of saving one instruction by adding this bit magic produces horrible code on the tests I've added.

Please commit these tests, so we can see the current output and compare any differences between this patch and the other patch.

Here it is. Please compare the cases when we compare against zero with cases when we compare against any other value.
https://reviews.llvm.org/rL331100

Updated the patch to show the impact better. Test 15g possibly needs closer evaluation, however it seems to be the only case where the code has become worse. Might be a separate issue.

In D45862#1081927, @mkazantsev wrote:

In D45862#1081919, @spatel wrote:

In D45862#1081907, @mkazantsev wrote:

Current strategy of saving one instruction by adding this bit magic produces horrible code on the tests I've added.

Please commit these tests, so we can see the current output and compare any differences between this patch and the other patch.

Here it is. Please compare the cases when we compare against zero with cases when we compare against any other value.
https://reviews.llvm.org/rL331100

Thanks! I updated D46086. Unless I've missed it, the diffs for the 3-way-compare tests are identical?

Thanks @spatel , pretty much so!

Abandoning in favour of D46086

spatel added inline comments.Nov 29 2018, 11:09 AM

test/Transforms/InstCombine/unrecognized_three-way-comparison.ll
11–13 ↗	(On Diff #144435)	Not sure exactly where the logic hole is that caused this, but I moved a fold to InstSimplify in: rL347896 ...and now we get this case and the similar 'compare_against_arbitrary_value_type_mismatch' test.

Revision Contents

Path

Size

lib/

Transforms/

InstCombine/

InstCombineSelect.cpp

26 lines

test/

Transforms/

InstCombine/

select-of-bittest.ll

41 lines

Diff 143226

lib/Transforms/InstCombine/InstCombineSelect.cpp

Show First 20 Lines • Show All 791 Lines • ▼ Show 20 Lines	Instruction *InstCombiner::foldSelectInstWithICmp(SelectInst &SI,
bool Changed = adjustMinMax(SI, *ICI);		bool Changed = adjustMinMax(SI, *ICI);

ICmpInst::Predicate Pred = ICI->getPredicate();		ICmpInst::Predicate Pred = ICI->getPredicate();
Value *CmpLHS = ICI->getOperand(0);		Value *CmpLHS = ICI->getOperand(0);
Value *CmpRHS = ICI->getOperand(1);		Value *CmpRHS = ICI->getOperand(1);

// Transform (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1		// Transform (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1
// and (X <s 0) ? C2 : C1 --> ((X >>s 31) & (C2 - C1)) + C1		// and (X <s 0) ? C2 : C1 --> ((X >>s 31) & (C2 - C1)) + C1
// FIXME: Type and constness constraints could be lifted, but we have to		// FIXME: Case when CmpLHS is more narrow than TrueVal is also valid, but
// watch code size carefully. We should consider xor instead of		// it seems that with some other transforms it may end up with worse
// sub/add when we decide to do that.		// IR (containing both trunc and sext where we could have none). It
		// looks like a problem in some other transforms (see test71 in
		// select-with-bitwise-ops.ll). Lift this restriction once they are
		// fixed.
// TODO: Merge this with foldSelectICmpAnd somehow.		// TODO: Merge this with foldSelectICmpAnd somehow.
if (CmpLHS->getType()->isIntOrIntVectorTy() &&		if (CmpLHS->getType()->isIntOrIntVectorTy() &&
CmpLHS->getType() == TrueVal->getType()) {		TrueVal->getType()->isIntOrIntVectorTy() &&
		CmpLHS->getType()->getScalarSizeInBits() >=
		TrueVal->getType()->getScalarSizeInBits()) {
const APInt C1, C2;		const APInt C1, C2;
if (match(TrueVal, m_APInt(C1)) && match(FalseVal, m_APInt(C2))) {		if (match(TrueVal, m_APInt(C1)) && match(FalseVal, m_APInt(C2))) {
ICmpInst::Predicate Pred = ICI->getPredicate();		ICmpInst::Predicate Pred = ICI->getPredicate();
Value *X;		Value *X;
APInt Mask;		APInt Mask;
if (decomposeBitTestICmp(CmpLHS, CmpRHS, Pred, X, Mask, false)) {		if (decomposeBitTestICmp(CmpLHS, CmpRHS, Pred, X, Mask, false)) {
if (Mask.isSignMask()) {		if (Mask.isSignMask()) {
assert(X == CmpLHS && "Expected to use the compare input directly");		assert(X == CmpLHS && "Expected to use the compare input directly");
assert(ICmpInst::isEquality(Pred) && "Expected equality predicate");		assert(ICmpInst::isEquality(Pred) && "Expected equality predicate");

if (Pred == ICmpInst::ICMP_NE)		if (Pred == ICmpInst::ICMP_NE)
std::swap(C1, C2);		std::swap(C1, C2);

// This shift results in either -1 or 0.		// This shift results in either -1 or 0.
Value *AShr = Builder.CreateAShr(X, Mask.getBitWidth() - 1);		Value *AShr = Builder.CreateAShr(X, Mask.getBitWidth() - 1);
		// Convert to the destination type if needed.
		Value *AShrTyped = AShr;
		if (CmpLHS->getType() != TrueVal->getType())
		AShrTyped = Builder.CreateSExtOrTrunc(AShr, TrueVal->getType());

// Check if we can express the operation with a single or.		// Check if we can express the operation with a single or.
if (C2->isAllOnesValue())		if (C2->isAllOnesValue())
return replaceInstUsesWith(SI, Builder.CreateOr(AShr, *C1));		return replaceInstUsesWith(SI, Builder.CreateOr(AShrTyped, *C1));

Value And = Builder.CreateAnd(AShr, C2 - *C1);		Value And = Builder.CreateAnd(AShrTyped, C2 - *C1);
return replaceInstUsesWith(SI, Builder.CreateAdd(And,		return replaceInstUsesWith(
ConstantInt::get(And->getType(), *C1)));		SI,
		Builder.CreateAdd(And, ConstantInt::get(And->getType(), *C1)));
}		}
}		}
}		}
}		}

{		{
const APInt TrueValC, FalseValC;		const APInt TrueValC, FalseValC;
if (match(TrueVal, m_APInt(TrueValC)) &&		if (match(TrueVal, m_APInt(TrueValC)) &&
▲ Show 20 Lines • Show All 1,070 Lines • Show Last 20 Lines

test/Transforms/InstCombine/select-of-bittest.ll

	Show First 20 Lines • Show All 650 Lines • ▼ Show 20 Lines
	;			;
	%tmp = and i32 %arg, 1			%tmp = and i32 %arg, 1
	%tmp1 = icmp eq i32 %tmp, 1			%tmp1 = icmp eq i32 %tmp, 1
	%tmp2 = lshr i32 %arg, 2			%tmp2 = lshr i32 %arg, 2
	%tmp3 = and i32 %tmp2, 1			%tmp3 = and i32 %tmp2, 1
	%tmp4 = select i1 %tmp1, i32 %tmp3, i32 1			%tmp4 = select i1 %tmp1, i32 %tmp3, i32 1
	ret i32 %tmp4			ret i32 %tmp4
	}			}

				define i32 @compare_to_zero_mismatched_types_idiomatic_trunc(i64 %x) {
				lebedev.riUnsubmitted Not Done Reply Inline Actions For each of those, could you please add vector tests. Something like: define i32 @compare_to_zero_mismatched_types_idiomatic_trunc(i64 %x) { %cmp = icmp slt i64 %x, 0 %select1 = select i1 %cmp, i32 -1, i32 1 ret i32 %select1 } define <2 x i32> @compare_to_zero_mismatched_types_idiomatic_trunc_splatvec(<2 x i64> %x) { %cmp = icmp slt <2 x i64> %x, <i64 0, i64 0> %select1 = select <2 x i1> %cmp, <i32 -1, i32 -1>, <i32 1, i32 1> ret <2 x i32> %select1 } define <3 x i32> @compare_to_zero_mismatched_types_idiomatic_trunc_splatvec_undef(<3 x i64> %x) { %cmp = icmp slt <3 x i64> %x, <i64 0, i64 undef, i64 0> %select1 = select <3 x i1> %cmp, <i32 -1, i64 undef, i32 -1>, <i32 1, i64 undef, i32 1> ret <3 x i32> %select1 } ; May be more than one define <3 x i32> @compare_to_zero_mismatched_types_idiomatic_trunc_nonsplatvec(<3 x i64> %x) { %cmp = icmp slt <3 x i64> %x, <i64 0, i64 undef, i64 0> %select1 = select <3 x i1> %cmp, <i32 -1, i64 0, i32 1>, <i32 1, i64 0, i32 -1> ret <3 x i32> %select1 } lebedev.ri: For each of those, could you please add vector tests. Something like: ``` define i32…
				lebedev.riUnsubmitted Not Done Reply Inline Actions Oh, also, this is a really bad place to add these tests. They should either go to a new file, or at least before ; ============================================================================ ; ; Negative tests. Should not be folded. ; ============================================================================ ; lebedev.ri: Oh, also, this is a really bad place to add these tests. They should either go to a new…

				; CHECK-LABEL: compare_to_zero_mismatched_types_idiomatic_trunc(
				lebedev.riUnsubmitted Done Reply Inline Actions Please use `utils/update_test_checks.py`. lebedev.ri: Please use `utils/update_test_checks.py`.
				; CHECK-NEXT: [[ASHR:%.*]] = ashr i64 %x, 63
				; CHECK-NEXT: [[TRUNC:%.*]] = trunc i64 [[ASHR]] to i32
				; CHECK-NEXT: [[OR:%.*]] = or i32 [[TRUNC]], 1
				; CHECK-NEXT: ret i32 [[OR]]

				%cmp = icmp slt i64 %x, 0
				%select1 = select i1 %cmp, i32 -1, i32 1
				ret i32 %select1
				}

				define i32 @compare_to_zero_mismatched_types_non_idiomatic_trunc_01(i64 %x) {

				; CHECK-LABEL: compare_to_zero_mismatched_types_non_idiomatic_trunc_01(
				; CHECK-NEXT: [[ASHR:%.*]] = ashr i64 %x, 63
				; CHECK-NEXT: [[TRUNC:%.*]] = trunc i64 [[ASHR]] to i32
				; CHECK-NEXT: [[AND:%.*]] = and i32 [[TRUNC]], -24
				; CHECK-NEXT: [[OR:%.*]] = or i32 [[AND]], 19
				; CHECK-NEXT: ret i32 [[OR]]

				%cmp = icmp slt i64 %x, 0
				%select1 = select i1 %cmp, i32 -5, i32 19
				ret i32 %select1
				}

				define i32 @compare_to_zero_mismatched_types_non_idiomatic_trunc_02(i64 %x) {

				; CHECK-LABEL: compare_to_zero_mismatched_types_non_idiomatic_trunc_02(
				; CHECK-NEXT: [[ASHR:%.*]] = ashr i64 %x, 63
				; CHECK-NEXT: [[TRUNC:%.*]] = trunc i64 [[ASHR]] to i32
				; CHECK-NEXT: [[AND:%.*]] = and i32 [[TRUNC]], -25
				; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[AND]], 20
				; CHECK-NEXT: ret i32 [[ADD]]

				%cmp = icmp slt i64 %x, 0
				%select1 = select i1 %cmp, i32 -5, i32 20
				ret i32 %select1
				}

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Remove decanonicalizing transforms of selectsAbandonedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 143226

lib/Transforms/InstCombine/InstCombineSelect.cpp

test/Transforms/InstCombine/select-of-bittest.ll

[InstCombine] Remove decanonicalizing transforms of selects
AbandonedPublic