This is an archive of the discontinued LLVM Phabricator instance.

Differential D68408

[InstCombine] Negator - sink sinkable negations
ClosedPublic

Authored by lebedev.ri on Oct 3 2019, 10:27 AM.

Details

Reviewers

spatel
nikic
efriedma
xbolva00
vitalybuka
dvyukov

Commits

rG352fef3f11f5: [InstCombine] Negator - sink sinkable negations

Summary

As we have discussed previously (e.g. in D63992 / D64090 / PR42457), sub instruction
can almost be considered non-canonical. While we do convert sub %x, C -> add %x, -C,
we sparsely do that for non-constants. But we should.

Here, i propose to interpret sub %x, %y as add (sub 0, %y), %x IFF the negation can be sunk into %y.

This has some potential to cause endless combine loops (either around PHI's, or if there are some opposite transforms).
For the former, there's the -instcombine-negator-max-depth option to mitigate it, should this expose any such issues.
For the latter, if there are still any such opposing folds, we'd need to remove the colliding fold.
In any case, reproducers are welcome!
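
As an editorial illustration of the summary (a minimal sketch, not the patch's code), the toy below models values as a small expression tree over wrapping 32-bit integers and tries to build the negation of a tree without emitting a standalone negation. `Expr`, `tryNegate`, and the `maxDepth` parameter are hypothetical names; `maxDepth` only stands in for the role that -instcombine-negator-max-depth plays.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy expression node (hypothetical; unrelated to llvm::Value).
struct Expr {
  enum Kind { Const, Var, Sub, Mul };
  Kind kind = Const;
  uint32_t value = 0;  // Const: the constant; Var: its runtime value
  std::shared_ptr<Expr> lhs, rhs;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeLeaf(Expr::Kind kind, uint32_t value) {
  auto e = std::make_shared<Expr>();
  e->kind = kind;
  e->value = value;
  return e;
}

ExprPtr makeBin(Expr::Kind kind, ExprPtr lhs, ExprPtr rhs) {
  auto e = std::make_shared<Expr>();
  e->kind = kind;
  e->lhs = std::move(lhs);
  e->rhs = std::move(rhs);
  return e;
}

// Evaluates the tree with wrapping (modulo 2^32) arithmetic.
uint32_t eval(const ExprPtr &e) {
  switch (e->kind) {
    case Expr::Const:
    case Expr::Var:
      return e->value;
    case Expr::Sub:
      return eval(e->lhs) - eval(e->rhs);
    case Expr::Mul:
      return eval(e->lhs) * eval(e->rhs);
  }
  return 0;
}

// Returns a tree equal to -e that needs no standalone negation,
// or nullptr if the negation cannot be sunk within maxDepth levels.
ExprPtr tryNegate(const ExprPtr &e, unsigned depth, unsigned maxDepth) {
  if (depth > maxDepth) return nullptr;  // give up instead of recursing forever
  switch (e->kind) {
    case Expr::Const:
      return makeLeaf(Expr::Const, 0u - e->value);  // -C folds to a constant
    case Expr::Var:
      return nullptr;  // an opaque value would need a new negation
    case Expr::Sub:
      return makeBin(Expr::Sub, e->rhs, e->lhs);  // -(a - b) == b - a
    case Expr::Mul: {
      // -(a * b) == a * (-b) == (-a) * b, if either operand is negatible.
      if (ExprPtr n = tryNegate(e->rhs, depth + 1, maxDepth))
        return makeBin(Expr::Mul, e->lhs, n);
      if (ExprPtr n = tryNegate(e->lhs, depth + 1, maxDepth))
        return makeBin(Expr::Mul, n, e->rhs);
      return nullptr;
    }
  }
  return nullptr;
}
```

With `y = (a - b) * 5`, `tryNegate` yields `(a - b) * -5`, so `x - y` can be evaluated as `x + negated` with the same number of operations. A plain variable is rejected, which mirrors the "IFF the negation can be sunk" condition.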

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lebedev.ri created this revision. Oct 3 2019, 10:27 AM

Herald added a subscriber: hiraditya. Oct 3 2019, 10:27 AM

lebedev.ri edited the summary of this revision. Oct 3 2019, 10:28 AM

lebedev.ri edited the summary of this revision. Oct 3 2019, 10:33 AM

Actually, sub can be freely negated.

This *is* unusual. Is this too ugly to live?

I'd prefer to actually transform the operation tree at the point we decide it's profitable, to make it clear we can't end up in an infinite loop or something like that. As it is, you're depending on some other transforms happening in a particular order, and it's not clear that will happen consistently. (Yes, it's a little more code, but I think that's okay.)

In D68408#1695357, @efriedma wrote:

This *is* unusual. Is this too ugly to live?

I'd prefer to actually transform the operation tree at the point we decide it's profitable, to make it clear we can't end up in an infinite loop or something like that. As it is, you're depending on some other transforms happening in a particular order, and it's not clear that will happen consistently. (Yes, it's a little more code, but I think that's okay.)

Actually, that's precisely why i believed this may be the best approach - we both avoid lots of code duplication,
and if there are missing folds this *will* shake them loose - if they don't happen it'll "be caught" as a deadlock.

I'm not really sure how a more direct approach would look..

In D68408#1695399, @lebedev.ri wrote:

In D68408#1695357, @efriedma wrote:

This *is* unusual. Is this too ugly to live?

I'd prefer to actually transform the operation tree at the point we decide it's profitable, to make it clear we can't end up in an infinite loop or something like that. As it is, you're depending on some other transforms happening in a particular order, and it's not clear that will happen consistently. (Yes, it's a little more code, but I think that's okay.)

I'm not really sure how a more direct approach would look..

Okay so this is nowhere near polished-enough, but how about this?

Herald added a subscriber: mgorny. Oct 5 2019, 3:02 PM

Give more thought as to whether the new instructions should be inserted or not, and actually succeed in building test-suite.

@efriedma any high-level feedback on this?

Hmm, that's a bit more complicated than I hoped it would be... but I don't see any obvious way to simplify it.

It looks like this speculatively creates new instructions, then erases them on failure?

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
2	InstCombineAddSub.cpp?

In D68408#1708655, @efriedma wrote:

Hmm, that's a bit more complicated than I hoped it would be... but I don't see any obvious way to simplify it.

The one big missing thing is to use an actual worklist instead of recursion, but i'm not sure how to do that yet.

It looks like this speculatively creates new instructions, then erases them on failure?

Yes. More specifically, while speculatively creating them, it inserts them into an internal worklist
and into their proper places in basic blocks. If we succeed in negating the entire tree,
that worklist is then propagated to instcombine itself.
If that doesn't happen, then they are 'trivially' dead and deleted.

Thank you for taking a look!
Also handle trunc, fix comments.

If that doesn't happen, then they are 'trivially' dead and deleted.

Deleted where, exactly? If you're expecting the instcombine main loop to delete them, you'll force instcombine into an infinite loop, I think.

In D68408#1710214, @efriedma wrote:

If that doesn't happen, then they are 'trivially' dead and deleted.

Deleted where, exactly? If you're expecting the instcombine main loop to delete them, you'll force instcombine into an infinite loop, I think.

Hmm, they are not added to the instcombine's worklist unless we succeed in negating
the entire tree, and this passes test-suite with no infinite looping.

You are saying that we should instead DCE instructions in *our* worklist if we fail, correct?

Erase newly-created/inserted instruction if negation failed.
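
The speculate-then-erase scheme discussed above can be illustrated with a toy (a sketch under stated assumptions, not the patch's implementation). `ToyFunction` and `SpeculativeBuilder` are hypothetical stand-ins: instructions are plain strings, new ones are tracked in a private list, and they only reach the outer worklist on success.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a function body plus the combiner's worklist.
struct ToyFunction {
  std::vector<std::string> instructions;
  std::vector<std::string> worklist;
};

class SpeculativeBuilder {
 public:
  explicit SpeculativeBuilder(ToyFunction &f) : fn(f) {}

  // Inserts the instruction right away, but remembers it privately.
  void create(const std::string &name) {
    fn.instructions.push_back(name);
    created.push_back(name);
  }

  // Success: hand the new instructions over to the outer worklist.
  void commit() {
    fn.worklist.insert(fn.worklist.end(), created.begin(), created.end());
    created.clear();
  }

  // Failure: erase everything created so far, newest first,
  // so the outer loop never sees the dead instructions.
  void rollback() {
    for (auto it = created.rbegin(); it != created.rend(); ++it) {
      auto pos = std::find(fn.instructions.begin(), fn.instructions.end(), *it);
      if (pos != fn.instructions.end()) fn.instructions.erase(pos);
    }
    created.clear();
  }

 private:
  ToyFunction &fn;
  std::vector<std::string> created;
};
```

Erasing in the builder itself, rather than leaving dead instructions for the main loop, is the design point raised in the exchange above: nothing speculative is ever visible to the outer worklist unless the whole negation succeeded.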

Bump

bump

Ping

@efriedma - do you want to continue reviewing?

I've just pointed out a few nits for now.

llvm/lib/Transforms/InstCombine/InstCombineInternal.h
1007	typo: attempt
llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
169	it's -> its
183	it's -> its
193	negatible one -> negatible if one
193	it's -> its
244	typo: temporarily
llvm/test/Transforms/InstCombine/sub-of-negatible.ll
158–160	I didn't follow the diffs here - was one of these tests redundant? The code comment didn't match before, but it still doesn't?
271–272	Add these tests with baseline results as pre-commit?
351	it's -> its

In D68408#1767626, @spatel wrote:

@efriedma - do you want to continue reviewing?

In D68408#1767626, @spatel wrote:

I've just pointed out a few nits for now.

Nits addressed.

There are some more patterns that aren't handled here.
The idea is to ideally move them all here.

llvm/test/Transforms/InstCombine/sub-of-negatible.ll
158–160	The test was too complicated, was checking more than the minimal pattern - subtraction can be freely negated by swapping its operands.

Do you have stats on how often this fires? Impact on compile-time?

Remove more specific pattern-matching from InstCombiner::visitSub() simultaneously with adding this more general functionality, so we don't have redundancy (and limit compile-time impact)?

lebedev.ri mentioned this in rGb89ba5f9399a: [NFC][InstCombine] Autogenerate check lines in a few tests. Dec 4 2019, 2:19 PM

NOT READY FOR REVIEW

In D68408#1769213, @spatel wrote:

Remove more specific pattern-matching from InstCombiner::visitSub() simultaneously with adding this more general functionality, so we don't have redundancy (and limit compile-time impact)?

I was hoping to do that in steps, but i guess that's one way to boost those stats :))
I think i have moved everything relevant from InstCombiner::visitSub() now.
Observations:

There's an artificial one-use restriction that needs to go away (when Depth=0 and we are looking at sub 0, %x)
We lose/do not propagate no-wrap flags
InstCombiner::visitAdd() change is needed because of how good we get at sinking negations - else there are two opposite folds. It should be done beforehand, separately.
Same with InstCombiner::foldAddWithConstant() change, not entirely related to the diff.
There are some other regressions.

lebedev.ri mentioned this in D71064: [InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to `sub A, zext B -> add A, sext B`). Dec 5 2019, 6:21 AM

lebedev.ri mentioned this in rG796fa662f128: [InstCombine] Invert `add A, sext(B) --> sub A, zext(B)` canonicalization (to…. Dec 5 2019, 10:24 AM

Rebased, slightly better, but not there yet still.

xbolva00 added a subscriber: xbolva00. Dec 5 2019, 10:55 AM

Still not for review

Some more unreachable code removed from InstCombiner::visitSub()

Maybe you can change title to [WIP] or [NOT READY FOR REVIEW] ?

xbolva00 added inline comments. Dec 6 2019, 4:41 AM

llvm/test/Transforms/InstCombine/sub.ll
1183	Remove FIXME?

spatel mentioned this in D72007: [InstCombine] try to pull 'not' of select into compare operands. Jan 4 2020, 6:17 AM

nikic mentioned this in D72978: [InstCombine] Combine neg of shl of sub (PR44529). Jan 18 2020, 8:05 AM

nikic mentioned this in rG0b83c5a78fae: [InstCombine] Combine neg of shl of sub (PR44529). Jan 22 2020, 2:04 PM

spatel mentioned this in D77230: [InstCombine] enhance freelyNegateValue() by handling xor. Apr 1 2020, 10:23 AM

spatel mentioned this in rG3d9004879118: [InstCombine] enhance freelyNegateValue() by handling xor. Apr 1 2020, 12:23 PM

spatel mentioned this in rG1008435f3d47: Revert "[InstCombine] do not exclude min/max from icmp with casted operand fold". Apr 2 2020, 7:01 AM

Rebased.

Finally understood the interfering transforms - this fundamentally interferes with D48754.
Yes, i see the PhaseOrdering regressions if we revert that.
Not yet sure how to deal with this, i see 3 options:

don't do such canonicalization (what i've done here)
don't implement Negator
add Abs IR instruction (i think not)
Don't try to sink negation when it's used by abs/nabs pattern

This doesn't hang check-llvm/test-suite.

Harbormaster failed remote builds in B52802: Diff 256768! Apr 11 2020, 7:27 AM

Handle PHI's, too.
Strangely, i distinctly recall seeing some preexisting test
causing an endless cycle, but i'm no longer observing that problem.

Harbormaster failed remote builds in B52817: Diff 256801! Apr 11 2020, 4:00 PM

xbolva00 added inline comments. Apr 11 2020, 4:11 PM

llvm/test/Transforms/PhaseOrdering/min-max-abs-cse.ll
39 ↗	(On Diff #256801)	Regression

And solve potential endless combine cycle by not trying to sink negations
if that instruction has a user that looks like abs/nabs.

I think this is it for now, so review is welcomed..

llvm/test/Transforms/PhaseOrdering/min-max-abs-cse.ll
39 ↗	(On Diff #256801)	https://reviews.llvm.org/D68408#1976039

In D68408#1976039, @lebedev.ri wrote:

add Abs IR instruction (i think not)

Is the answer the same for an intrinsic? The more we see these kinds of problems, the more I wish we had created intrinsics for abs/smin/smax/umin/umax long ago. We keep trying to work around this limitation in IR, but it doesn't seem worth it. We have abs() functions in source and an abs node in codegen, so we're creating intermediate ops that don't exist on either side of IR.

Harbormaster failed remote builds in B52854: Diff 256855! Apr 12 2020, 8:32 AM

In D68408#1976790, @spatel wrote:

In D68408#1976039, @lebedev.ri wrote:

add Abs IR instruction (i think not)

Is the answer the same for an intrinsic?

Ack, i meant instruction/intrinsic interchangeably here.

The more we see these kinds of problems, the more I wish we had created intrinsics for abs/smin/smax/umin/umax long ago. We keep trying to work around this limitation in IR, but it doesn't seem worth it. We have abs() functions in source and an abs node in codegen, so we're creating intermediate ops that don't exist on either side of IR.

Note that we similarly don't have saturating multiplication, overflowing/saturating left-shift.

In D68408#1976807, @lebedev.ri wrote:

In D68408#1976790, @spatel wrote:

In D68408#1976039, @lebedev.ri wrote:

add Abs IR instruction (i think not)

Is the answer the same for an intrinsic?

Ack, i meant instruction/intrinsic interchangeably here.

We have grown more accepting of intrinsics (overflow, saturating, funnel, etc.) recently, so I'm not sure if the old arguments against an abs intrinsic still hold. Is there anything in particular about abs() that makes it different?

The more we see these kinds of problems, the more I wish we had created intrinsics for abs/smin/smax/umin/umax long ago. We keep trying to work around this limitation in IR, but it doesn't seem worth it. We have abs() functions in source and an abs node in codegen, so we're creating intermediate ops that don't exist on either side of IR.

Note that we similarly don't have saturating multiplication, overflowing/saturating left-shift.

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

In D68408#1977656, @spatel wrote:

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

+1 for new abs/min/max intrinsics

It'd be great if we could separate the new intrinsics discussion from this patch..

In D68408#1977656, @spatel wrote:

In D68408#1976807, @lebedev.ri wrote:

In D68408#1976790, @spatel wrote:

In D68408#1976039, @lebedev.ri wrote:

add Abs IR instruction (i think not)

Is the answer the same for an intrinsic?

Ack, i meant instruction/intrinsic interchangeably here.

We have grown more accepting of intrinsics (overflow, saturating, funnel, etc.) recently, so I'm not sure if the old arguments against an abs intrinsic still hold. Is there anything in particular about abs() that makes it different?

No, i just don't want to condition this patch on introduction of whole new set of intrinsics :]

spatel mentioned this in D74484: [AggressiveInstCombine] Add support for ICmp instr that feeds a select intsr's condition operand. Apr 14 2020, 6:55 AM

In D68408#1978925, @nikic wrote:

In D68408#1977656, @spatel wrote:

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

Sounds like general agreement for the people on this review (and sorry for going off-topic for this specific patch). I'll post something to llvm-dev after trying to dig up the earlier discussions for those intrinsics.

And looks like we have this month's infinite loop here :) -
https://bugs.llvm.org/show_bug.cgi?id=45539

In D68408#1981910, @spatel wrote:

In D68408#1978925, @nikic wrote:

In D68408#1977656, @spatel wrote:

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

Sounds like general agreement for the people on this review (and sorry for going off-topic for this specific patch). I'll post something to llvm-dev after trying to dig up the earlier discussions for those intrinsics.

Note that i've posted https://github.com/AliveToolkit/alive2/pull/353 with my vision of their modelling.
Notably, i don't think we should have nabs, and i believe abs should be similar to the cttz/ctlz
in the sense that it should take a second param - i1 NSW.
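
One possible reading of that suggestion, as a hedged sketch only: the model below is neither the Alive2 modelling from the linked pull request nor any actual intrinsic definition. `AbsResult`, `absModel`, and the `intMinIsPoison` flag are hypothetical names; the flag plays the role of the proposed second `i1` parameter and decides what happens for the one input whose absolute value is not representable.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical result type: a value plus a marker for "poison".
struct AbsResult {
  int32_t value;
  bool poison;
};

// Toy model of abs(x, flag) over 32-bit signed integers.
AbsResult absModel(int32_t x, bool intMinIsPoison) {
  if (x == std::numeric_limits<int32_t>::min()) {
    // |INT32_MIN| does not fit: either report poison (flag set),
    // or wrap around and return INT32_MIN unchanged (flag clear).
    return {x, intMinIsPoison};
  }
  return {x < 0 ? -x : x, false};
}
```

For every input other than `INT32_MIN` the flag makes no difference, which is the same shape as the zero-input flag on cttz/ctlz.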

In D68408#1978925, @nikic wrote:

In D68408#1977656, @spatel wrote:

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

I think i'd like to try to handle that, since out of the people in this discussion
only i (well, and @xbolva00) haven't dealt with that before, may be good to spread the knowledge.

Ping. What would it take to get this moving? :)

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

Change does not seem to have cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code-size, so this transform is clearly doing (or enabling) something big there. It might be interesting (but not necessary) to take a quick look at what happens there.

In D68408#1991910, @nikic wrote:

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

I was actually trying to get that info, and i'm not sure what step i'm missing other than pushing the [[ https://github.com/LebedevRI/llvm-project/tree/perf/instcombine-negator | perf/* ]] branch?

Change does not seem to have cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code-size, so this transform is clearly doing (or enabling) something big there. It might be interesting (but not necessary) to take a quick look at what happens there.

Ah, interesting.
I actually expected this to have measurable negative (bad) cost,
so it's a pleasant surprise to see beneficial numbers here :)

In D68408#1991930, @lebedev.ri wrote:

In D68408#1991910, @nikic wrote:

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

I was actually trying to get that info, and i'm not sure what step i'm missing other than pushing the [[ https://github.com/LebedevRI/llvm-project/tree/perf/instcombine-negator | perf/* ]] branch?

I need to manually add your fork as a remote first, so branches get picked up (too many forks to listen to all of them). I've done that now.

In D68408#1991910, @nikic wrote:

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

Change does not seem to have cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code-size, so this transform is clearly doing (or enabling) something big there. It might be interesting (but not necessary) to take a quick look at what happens there.

Yes, it would be good to derive a regression test from that benchmark and/or invent some larger tests that show the greater optimization power of the new code. Unless I missed it, all of the current test diffs show that we do no harm, but if we can show that the added code/complexity buys us something immediately, that makes the benefit clear.

In D68408#1992227, @spatel wrote:

In D68408#1991910, @nikic wrote:

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

Change does not seem to have cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code-size, so this transform is clearly doing (or enabling) something big there.
It might be interesting (but not necessary) to take a quick look at what happens there.

Yes, it would be good to derive a regression test from that benchmark

old.ll.xz (853 KB)

new.ll.xz (846 KB)

The impact there is quite noticeable as per llvm-diff, which almost immediately crashes.
It appears, a lot more inlining happens (functions now-missing in new.ll),
and some more function 'specialization' (functions now-appearing in new.ll).
I'm not sure i can distill/filter that to make any reasonable test case..

In D68408#1992227, @spatel wrote:

and/or invent some larger tests that show the greater optimization power of the new code.
Unless I missed it, all of the current test diffs show that we do no harm, but if we can show that the added code/complexity buys us something immediately, that makes the benefit clear.

The claim i'm making in the patch's description is that we almost consider sub instruction
non-canonical, and we should be trying to fold it away as much as possible.
These test cases show the common patterns, that we miss currently, where we can get rid of it.

This approach is what was requested in

In D68408#1695357, @efriedma wrote:

This *is* unusual. Is this too ugly to live?

I'd prefer to actually transform the operation tree at the point we decide it's profitable, to make it clear we can't end up in an infinite loop or something like that. As it is, you're depending on some other transforms happening in a particular order, and it's not clear that will happen consistently. (Yes, it's a little more code, but I think that's okay.)

Given that we have compile-time data now, LGTM.
The implementation goes beyond my normal casual C++ usage (eg, I'd never seen 'zip' before), so if someone else can take a 2nd/final look too, that would be great.

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
134	Remove/update comment.

This revision is now accepted and ready to land. Apr 20 2020, 1:02 PM

nikic added inline comments. Apr 20 2020, 1:51 PM

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181	Noticed while looking through the tramp3d-v4 diff: This should be behind a one-use check, to avoid duplicating expensive division instructions.

xbolva00 added inline comments. Apr 20 2020, 1:52 PM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1901	Just wondering if we could use some better name for this lambda than “cleanup”.

lebedev.ri added inline comments. Apr 20 2020, 2:19 PM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1901	I couldn't come up with a better one, any suggestions? `TryToNarrowDeduceFlags()`? This is where `goto` might make sense, but somehow i don't want to use it..

xbolva00 added inline comments. Apr 20 2020, 2:48 PM

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1901	Your idea is fine I think.

lebedev.ri added inline comments. Apr 20 2020, 2:52 PM

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181	I was hesitant about this one indeed, it isn't a typo that one-use check has gone away here, because we generally consider only instruction count. @spatel thoughts? But the main, bigger question this touches is: "but what if all the uses would get negated by us? In future, can we somehow sanely model the whole negatible tree, not giving up at non-single-use instructions, but defer that to after we've finished building new tree?"

spatel added inline comments. Apr 21 2020, 5:52 AM

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181	I missed that logic difference, and I'm not getting a reviewable diff of the attached files with llvm-diff or other apps. Can you create an IR example/regression test for that?

lebedev.ri added inline comments. Apr 21 2020, 6:06 AM

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181	@spatel To be clear, the "bigger question" is pretty rhetorical, or at least not for this review. The actual question here is whether we should consider `sdiv` special, and even if we can negate it without increasing instruction count, we should only do so if there are no other uses of old `sdiv`.

spatel added inline comments. Apr 21 2020, 6:45 AM

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181	There is precedent for this kind of special treatment. In other words, not all opcodes are equal in terms of analysis (and secondary concern of codegen), and we will even increase instruction count to avoid some like div/rem (although those transforms are probably currently not safe with respect to poison). If it would be NFC-ish to keep the one-use check, then we should do that. Then, remove the limitation as a follow-up if that can be shown useful?

Updated: adjust comments, lambda name, guard sdiv with an artificial one-use-check.

lebedev.ri added inline comments. Apr 21 2020, 10:50 AM

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181	Alright, i'll add one-use check just to get this moving :)
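
The one-use guard agreed on in this inline thread can be sketched with a toy (not the code in InstCombineNegator.cpp). `ToySDivByConst` and `tryNegateSDiv` are hypothetical names. The divisor restrictions, rejecting 0, 1, and `INT32_MIN`, are assumptions added here from signed-division overflow rules and are not spelled out in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical model of `sdiv %x, C` together with its number of users.
struct ToySDivByConst {
  int32_t divisor;   // the constant right-hand operand C
  unsigned numUses;  // how many instructions use the division result
};

// Tries to express -(x / C) as x / (-C). Returns false when the rewrite
// is rejected; on success stores -C in negatedDivisor.
bool tryNegateSDiv(const ToySDivByConst &d, int32_t &negatedDivisor) {
  // Artificial one-use check: with other users the old division stays
  // alive, so the rewrite would duplicate an expensive division.
  if (d.numUses != 1) return false;
  // Assumed safety conditions: -INT32_MIN overflows, C == 1 would turn
  // into division by -1 (which overflows for x == INT32_MIN), and
  // division by 0 is undefined either way.
  if (d.divisor == 0 || d.divisor == 1 ||
      d.divisor == std::numeric_limits<int32_t>::min())
    return false;
  negatedDivisor = -d.divisor;
  return true;
}
```

Because signed division truncates toward zero, `-(x / 3)` and `x / -3` agree for every `x`, so the rewrite costs no extra instruction when the old division has no other users.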

Harbormaster failed remote builds in B54127: Diff 259055! Apr 21 2020, 11:21 AM

Closed by commit rG352fef3f11f5: [InstCombine] Negator - sink sinkable negations (authored by lebedev.ri). Apr 21 2020, 12:27 PM

This revision was automatically updated to reflect the committed changes.

Hi,

This change causes a performance regression in tsan, as detected on our LLVM buildbot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49850/steps/tsan%20analyze/logs/stdio

The script that comes with tsan checks the number of PUSH, etc in some of the key tsan functions,
where each extra PUSH causes tsan to be slower.

With this change, the number of PUSHes went from 3 to 4.

Please take a look, this might be a performance regression for a wider set of targets.

Before your change:

read1 tot 484; size 1830; rsp 1; push 3; pop 15; call 2; load 24; store  9; sh  46; mov 106; lea   2; cmp  76

After your change:

read1 tot 515; size 1980; rsp 1; push 4; pop 4; call 2; load 24; store  9; sh  46; mov 113; lea   2; cmp  90

Script to reproduce (in the llvm-project root dir, with "build" subdir)

#!/bin/bash

compile() {

clang -c -O2  compiler-rt/lib/tsan/rtl/tsan_rtl.cpp -I compiler-rt/lib -Wall -std=c++14 -Wno-unused-parameter -O2 -g -DNDEBUG    -m64 -fno-lto -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -O3 -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -Wno-non-virtual-dtor -fPIE -fno-rtti -msse3 -Wframe-larger-than=530 -Wglobal-constructors

}

git checkout a13dce1d90cba6c55252dee0a2600eab37ffbc44
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1

git checkout 352fef3f11f5ccb2ddc8e16cecb7302a54721e9f
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1
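
To compare the two `read1` lines above counter by counter, a small helper along these lines could be used. This is an illustrative sketch: `parseStats` and `diffStats` are hypothetical names, and the line format is assumed from the two samples quoted in this comment rather than from the analyze_libtsan.sh script itself.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parses a line such as "read1 tot 484; size 1830; push 3"
// into a map from counter name to value.
std::map<std::string, long> parseStats(std::string line) {
  for (char &c : line)
    if (c == ';') c = ' ';  // separators become plain whitespace
  std::istringstream in(line);
  std::string functionName, key;
  long value = 0;
  std::map<std::string, long> stats;
  in >> functionName;  // e.g. "read1"; not stored
  while (in >> key >> value) stats[key] = value;
  return stats;
}

// Returns (after - before) for every counter present in both maps.
std::map<std::string, long> diffStats(const std::map<std::string, long> &before,
                                      const std::map<std::string, long> &after) {
  std::map<std::string, long> delta;
  for (const auto &entry : before) {
    auto it = after.find(entry.first);
    if (it != after.end()) delta[entry.first] = it->second - entry.second;
  }
  return delta;
}
```

Applied to the two lines quoted above, the difference shows one extra `push`, 31 more instructions in total, and 150 more bytes of code.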

In D68408#1998112, @kcc wrote:

Hi,

Hi.

This change causes a performance regression in tsan, as detected on our LLVM buildbot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49850/steps/tsan%20analyze/logs/stdio

Looks like the build was red already: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49808
That explains why i didn't see the new failure: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49809

The script that comes with tsan checks the number of PUSH, etc in some of the key tsan functions,
where each extra PUSH causes tsan to be slower.

With this change, the number of PUSHes went from 3 to 4.

Please take a look, this might be a performance regression for a wider set of targets.

Before your change:
read1 tot 484; size 1830; rsp 1; push 3; pop 15; call 2; load 24; store  9; sh  46; mov 106; lea   2; cmp  76
After your change:
read1 tot 515; size 1980; rsp 1; push 4; pop 4; call 2; load 24; store  9; sh  46; mov 113; lea   2; cmp  90

Interesting. Not very unexpected, there's always possibility of an avalanche effect with IR changes.

~~Unhelpful answer: wow how things have regressed since rL342092 / D51985~~

Script to reproduce (in the llvm-project root dir, with "build" subdir)

#!/bin/bash

compile() {
clang -c -O2  compiler-rt/lib/tsan/rtl/tsan_rtl.cpp -I compiler-rt/lib -Wall -std=c++14 -Wno-unused-parameter -O2 -g -DNDEBUG    -m64 -fno-lto -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -O3 -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -Wno-non-virtual-dtor -fPIE -fno-rtti -msse3 -Wframe-larger-than=530 -Wglobal-constructors
}

git checkout a13dce1d90cba6c55252dee0a2600eab37ffbc44
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1

git checkout 352fef3f11f5ccb2ddc8e16cecb7302a54721e9f
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1

old.ll (1 MB)

old.o (84 KB)

new.ll (1 MB)

new.o (85 KB)

The llvm-diff report is pretty large. IR instruction-wise, this appears to be a win overall (-639 +609 instructions).
Visually, i think i can spot only two IR patterns that we now fail to fold:

in function _ZN6__tsan10InitializeEPNS_11ThreadStateE:
  in block %if.then88.i:
    >   %22 = xor i64 %xor.i.i.i, -17592186044417
    >   %mul.i.i194.neg.i = add i64 %22, 1
    >   %sub.i = add i64 %mul.i.i194.neg.i, %mul.i.i203.i
    <   %mul.i.i194.i = xor i64 %xor.i.i.i, 17592186044416
    <   %sub.i = sub i64 %mul.i.i203.i, %mul.i.i194.i
  in block %if.then88.1.i:
    >   %35 = xor i64 %xor.i.i.1.i, -17592186044417
    >   %mul.i.i194.neg.1.i = or i64 %mul.i.i203.1.i, 1
    >   %sub.1.i = add i64 %mul.i.i194.neg.1.i, %35
    <   %mul.i.i194.1.i = xor i64 %xor.i.i.1.i, 17592186044416
    <   %sub.1.i = sub i64 %mul.i.i203.1.i, %mul.i.i194.1.i

Filed https://bugs.llvm.org/show_bug.cgi?id=45647
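For readers following the first block of the diff above: the old and new sequences compute the same value, because `x ^ ~C` equals `~(x ^ C)` and `~v + 1` equals `-v` in two's complement. A minimal sketch of that equivalence, using `uint64_t` in place of `i64`; the function names are made up for illustration and `m` stands in for `%mul.i.i203.i`:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kMask = 17592186044416ULL; // 2^44, the xor constant in the old IR

// Old IR: %mul = xor %x, C ; %sub = sub %m, %mul
inline uint64_t oldForm(uint64_t x, uint64_t m) { return m - (x ^ kMask); }

// New IR: %t = xor %x, ~C ; %neg = add %t, 1 ; %sub = add %neg, %m
inline uint64_t newForm(uint64_t x, uint64_t m) {
  uint64_t t = x ^ ~kMask; // xor with -17592186044417, i.e. ~(x ^ C)
  uint64_t neg = t + 1;    // ~v + 1 == -v in two's complement
  return neg + m;
}
```

The results match, but the new form spends one more instruction, which is the missed fold being reported.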

But i believe, i'm supposed to look at the @__tsan_read1 function, right?
Then the relevant diff is:

in function __tsan_read1:
  in block %if.then.i1390.i.i.i:
    >   %sub.neg.i1385.i.i.i = sub nsw i64 %and.i19.i1382.i.i.i, %and.i.i1380.i.i.i
    >   %sub.neg.highbits.i1388.i.i.i = lshr i64 %sub.neg.i1385.i.i.i, %and.i.i.i1387.i.i.i
    >   %cmp7.i1389.i.i.i = icmp ne i64 %sub.neg.highbits.i1388.i.i.i, 0
    <   %sub6.i1387.i.i.i = sub nsw i64 0, %sub.i1383.i.i.i
    <   %sub6.highbits.i1388.i.i.i = lshr i64 %sub6.i1387.i.i.i, %and.i.i.i1386.i.i.i
    <   %cmp7.i1389.i.i.i = icmp ne i64 %sub6.highbits.i1388.i.i.i, 0
        %cmp.i1378.i.i.i = icmp ult i64 %xor.i1422.i.i.i, 1125899906842624
    >   %or.cond.i.i.i = or i1 %cmp.i1378.i.i.i, %cmp7.i1389.i.i.i
    >   br i1 %or.cond.i.i.i, label %do.body226.i.i.i, label %if.end86.i.i.i
    <   %or.cond.i.i.i = or i1 %cmp.i1378.i.i.i, %cmp7.i1389.i.i.i
    <   br i1 %or.cond.i.i.i, label %do.body226.i.i.i, label %if.end86.i.i.i

So we've traded `0 - %sub.i1383.i.i.i` for `%and.i19.i1382.i.i.i - %and.i.i1380.i.i.i`.
That's it; [un]fortunately, there is nothing else going on.
But thankfully, that explains the problem well.
Pushed rG5a159ed2a8e5a9a6ced73f78e4c64b01d76d3493.
Thanks.
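The trade discussed above rests on `0 - (a - b)` and `b - a` being the same value under wraparound arithmetic. A minimal sketch of that identity, with `uint64_t` standing in for `i64`; the function names are hypothetical and only mirror the shape of the two IR sequences:

```cpp
#include <cassert>
#include <cstdint>

// Old form: negate the result of an existing subtraction.
inline uint64_t negateOfSub(uint64_t a, uint64_t b) {
  uint64_t sub = a - b;     // %sub = sub i64 %a, %b
  return uint64_t{0} - sub; // %neg = sub i64 0, %sub
}

// New form: the negation is sunk into the subtraction by swapping operands.
inline uint64_t swappedSub(uint64_t a, uint64_t b) {
  return b - a;             // %sub.neg = sub i64 %b, %a
}
```

Note that the old form only needs the already computed `%sub`, while the new form reads both original operands again.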

Revision Contents

Path                                          Size

llvm/lib/Transforms/InstCombine/
  CMakeLists.txt                              1 line
  InstCombineAddSub.cpp                       173 lines
  InstCombineInternal.h                       49 lines
  InstCombineNegator.cpp                      334 lines

llvm/test/Transforms/InstCombine/
  ARM/strcmp.ll                               4 lines
  abs-1.ll                                    12 lines
  and-or-icmps.ll                             18 lines
  high-bit-signmask-with-trunc.ll             36 lines
  high-bit-signmask.ll                        32 lines
  icmp.ll                                     17 lines
  mul.ll                                      9 lines
  sadd-with-overflow.ll                       5 lines
  ssub-with-overflow.ll                       18 lines
  strcmp-1.ll                                 4 lines
  strncmp-1.ll                                2 lines
  sub-of-negatible.ll                         46 lines
  sub.ll                                      94 lines
  unsigned_saturated_sub.ll                   6 lines
  zext-bool-add-sub.ll                        34 lines

Diff 232532

llvm/lib/Transforms/InstCombine/CMakeLists.txt

	set(LLVM_TARGET_DEFINITIONS InstCombineTables.td)			set(LLVM_TARGET_DEFINITIONS InstCombineTables.td)
	tablegen(LLVM InstCombineTables.inc -gen-searchable-tables)			tablegen(LLVM InstCombineTables.inc -gen-searchable-tables)
	add_public_tablegen_target(InstCombineTableGen)			add_public_tablegen_target(InstCombineTableGen)

	add_llvm_component_library(LLVMInstCombine			add_llvm_component_library(LLVMInstCombine
	InstructionCombining.cpp			InstructionCombining.cpp
	InstCombineAddSub.cpp			InstCombineAddSub.cpp
	InstCombineAtomicRMW.cpp			InstCombineAtomicRMW.cpp
	InstCombineAndOrXor.cpp			InstCombineAndOrXor.cpp
	InstCombineCalls.cpp			InstCombineCalls.cpp
	InstCombineCasts.cpp			InstCombineCasts.cpp
	InstCombineCompares.cpp			InstCombineCompares.cpp
	InstCombineLoadStoreAlloca.cpp			InstCombineLoadStoreAlloca.cpp
	InstCombineMulDivRem.cpp			InstCombineMulDivRem.cpp
				InstCombineNegator.cpp
	InstCombinePHI.cpp			InstCombinePHI.cpp
	InstCombineSelect.cpp			InstCombineSelect.cpp
	InstCombineShifts.cpp			InstCombineShifts.cpp
	InstCombineSimplifyDemanded.cpp			InstCombineSimplifyDemanded.cpp
	InstCombineVectorOps.cpp			InstCombineVectorOps.cpp

	ADDITIONAL_HEADER_DIRS			ADDITIONAL_HEADER_DIRS
	${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms			${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms
	${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms/InstCombine			${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms/InstCombine

	DEPENDS			DEPENDS
	intrinsics_gen			intrinsics_gen
	)			)

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp

[... 1,670 lines not shown ...]	Instruction *InstCombiner::visitSub(BinaryOperator &I) {
if (Value *V = SimplifySubInst(I.getOperand(0), I.getOperand(1),		if (Value *V = SimplifySubInst(I.getOperand(0), I.getOperand(1),
I.hasNoSignedWrap(), I.hasNoUnsignedWrap(),		I.hasNoSignedWrap(), I.hasNoUnsignedWrap(),
SQ.getWithInstruction(&I)))		SQ.getWithInstruction(&I)))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (Instruction *X = foldVectorBinop(I))		if (Instruction *X = foldVectorBinop(I))
return X;		return X;

// (A*B)-(A*C) -> A*(B-C) etc
if (Value *V = SimplifyUsingDistributiveLaws(I))
return replaceInstUsesWith(I, V);

// If this is a 'B = x-(-A)', change to B = x+A.
Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);		Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
if (Value *V = dyn_castNegVal(Op1)) {
BinaryOperator *Res = BinaryOperator::CreateAdd(Op0, V);

if (const auto *BO = dyn_cast<BinaryOperator>(Op1)) {		// First, let's try to interpret `sub a, b` as `add a, (sub 0, b)`,
assert(BO->getOpcode() == Instruction::Sub &&		// and let's try to sink `(sub 0, b)` into `b` itself.
"Expected a subtraction operator!");		if (Value *NegOp1 = Negator::Negate(Op1, *this))
if (BO->hasNoSignedWrap() && I.hasNoSignedWrap())		return BinaryOperator::CreateAdd(NegOp1, Op0);
Res->setHasNoSignedWrap(true);		if (match(Op0, m_ZeroInt()))
} else {		return nullptr; // Should have been handled in Negator!
if (cast<Constant>(Op1)->isNotMinSignedValue() && I.hasNoSignedWrap())
Res->setHasNoSignedWrap(true);
}

return Res;		// (A*B)-(A*C) -> A*(B-C) etc
}		if (Value *V = SimplifyUsingDistributiveLaws(I))
		return replaceInstUsesWith(I, V);

if (I.getType()->isIntOrIntVectorTy(1))		if (I.getType()->isIntOrIntVectorTy(1))
return BinaryOperator::CreateXor(Op0, Op1);		return BinaryOperator::CreateXor(Op0, Op1);

// Replace (-1 - A) with (~A).		// Replace (-1 - A) with (~A).
if (match(Op0, m_AllOnes()))		if (match(Op0, m_AllOnes()))
return BinaryOperator::CreateNot(Op1);		return BinaryOperator::CreateNot(Op1);

// (~X) - (~Y) --> Y - X		// (~X) - (~Y) --> Y - X
Value *X, *Y;		Value *X, *Y;
if (match(Op0, m_Not(m_Value(X))) && match(Op1, m_Not(m_Value(Y))))		if (match(Op0, m_Not(m_Value(X))) && match(Op1, m_Not(m_Value(Y))))
return BinaryOperator::CreateSub(Y, X);		return BinaryOperator::CreateSub(Y, X);

// (X + -1) - Y --> ~Y + X		// (X + -1) - Y --> ~Y + X
if (match(Op0, m_OneUse(m_Add(m_Value(X), m_AllOnes()))))		if (match(Op0, m_OneUse(m_Add(m_Value(X), m_AllOnes()))))
return BinaryOperator::CreateAdd(Builder.CreateNot(Op1), X);		return BinaryOperator::CreateAdd(Builder.CreateNot(Op1), X);

// Y - (X + 1) --> ~X + Y
if (match(Op1, m_OneUse(m_Add(m_Value(X), m_One()))))
return BinaryOperator::CreateAdd(Builder.CreateNot(X), Op0);

// Y - ~X --> (X + 1) + Y
if (match(Op1, m_OneUse(m_Not(m_Value(X))))) {
return BinaryOperator::CreateAdd(
Builder.CreateAdd(Op0, ConstantInt::get(I.getType(), 1)), X);
}

if (Constant *C = dyn_cast<Constant>(Op0)) {		if (Constant *C = dyn_cast<Constant>(Op0)) {
bool IsNegate = match(C, m_ZeroInt());
Value *X;		Value *X;
if (match(Op1, m_ZExt(m_Value(X))) && X->getType()->isIntOrIntVectorTy(1)) {
// 0 - (zext bool) --> sext bool
// C - (zext bool) --> bool ? C - 1 : C		// C - (zext bool) --> bool ? C - 1 : C
if (IsNegate)		if (match(Op1, m_ZExt(m_Value(X))) && X->getType()->isIntOrIntVectorTy(1))
return CastInst::CreateSExtOrBitCast(X, I.getType());
return SelectInst::Create(X, SubOne(C), C);		return SelectInst::Create(X, SubOne(C), C);
}
if (match(Op1, m_SExt(m_Value(X))) && X->getType()->isIntOrIntVectorTy(1)) {
// 0 - (sext bool) --> zext bool
// C - (sext bool) --> bool ? C + 1 : C		// C - (sext bool) --> bool ? C + 1 : C
if (IsNegate)		if (match(Op1, m_SExt(m_Value(X))) && X->getType()->isIntOrIntVectorTy(1))
return CastInst::CreateZExtOrBitCast(X, I.getType());
return SelectInst::Create(X, AddOne(C), C);		return SelectInst::Create(X, AddOne(C), C);
}

// C - ~X == X + (1+C)		// C - ~X == X + (1+C)
if (match(Op1, m_Not(m_Value(X))))		if (match(Op1, m_Not(m_Value(X))))
return BinaryOperator::CreateAdd(X, AddOne(C));		return BinaryOperator::CreateAdd(X, AddOne(C));

// Try to fold constant sub into select arguments.		// Try to fold constant sub into select arguments.
if (SelectInst *SI = dyn_cast<SelectInst>(Op1))		if (SelectInst *SI = dyn_cast<SelectInst>(Op1))
if (Instruction *R = FoldOpIntoSelect(I, SI))		if (Instruction *R = FoldOpIntoSelect(I, SI))
[... 11 lines not shown ...]	if (match(Op1, m_Sub(m_Constant(C2), m_Value(X))))
return BinaryOperator::CreateAdd(X, ConstantExpr::getSub(C, C2));		return BinaryOperator::CreateAdd(X, ConstantExpr::getSub(C, C2));

// C-(X+C2) --> (C-C2)-X		// C-(X+C2) --> (C-C2)-X
if (match(Op1, m_Add(m_Value(X), m_Constant(C2))))		if (match(Op1, m_Add(m_Value(X), m_Constant(C2))))
return BinaryOperator::CreateSub(ConstantExpr::getSub(C, C2), X);		return BinaryOperator::CreateSub(ConstantExpr::getSub(C, C2), X);
}		}

const APInt *Op0C;		const APInt *Op0C;
if (match(Op0, m_APInt(Op0C))) {		if (match(Op0, m_APInt(Op0C)) && Op0C->isMask()) {

if (Op0C->isNullValue()) {
Value *Op1Wide;
match(Op1, m_TruncOrSelf(m_Value(Op1Wide)));
bool HadTrunc = Op1Wide != Op1;
bool NoTruncOrTruncIsOneUse = !HadTrunc || Op1->hasOneUse();
unsigned BitWidth = Op1Wide->getType()->getScalarSizeInBits();

Value *X;
const APInt *ShAmt;
// -(X >>u 31) -> (X >>s 31)
if (NoTruncOrTruncIsOneUse &&
match(Op1Wide, m_LShr(m_Value(X), m_APInt(ShAmt))) &&
*ShAmt == BitWidth - 1) {
Value *ShAmtOp = cast<Instruction>(Op1Wide)->getOperand(1);
Instruction *NewShift = BinaryOperator::CreateAShr(X, ShAmtOp);
NewShift->copyIRFlags(Op1Wide);
if (!HadTrunc)
return NewShift;
Builder.Insert(NewShift);
return TruncInst::CreateTruncOrBitCast(NewShift, Op1->getType());
}
// -(X >>s 31) -> (X >>u 31)
if (NoTruncOrTruncIsOneUse &&
match(Op1Wide, m_AShr(m_Value(X), m_APInt(ShAmt))) &&
*ShAmt == BitWidth - 1) {
Value *ShAmtOp = cast<Instruction>(Op1Wide)->getOperand(1);
Instruction *NewShift = BinaryOperator::CreateLShr(X, ShAmtOp);
NewShift->copyIRFlags(Op1Wide);
if (!HadTrunc)
return NewShift;
Builder.Insert(NewShift);
return TruncInst::CreateTruncOrBitCast(NewShift, Op1->getType());
}

if (!HadTrunc && Op1->hasOneUse()) {
Value *LHS, *RHS;
SelectPatternFlavor SPF = matchSelectPattern(Op1, LHS, RHS).Flavor;
if (SPF == SPF_ABS || SPF == SPF_NABS) {
// This is a negate of an ABS/NABS pattern. Just swap the operands
// of the select.
cast<SelectInst>(Op1)->swapValues();
// Don't swap prof metadata, we didn't change the branch behavior.
return replaceInstUsesWith(I, Op1);
}
}
}

// Turn this into a xor if LHS is 2^n-1 and the remaining bits are known		// Turn this into a xor if LHS is 2^n-1 and the remaining bits are known
// zero.		// zero.
if (Op0C->isMask()) {		if (Op0C->isMask()) {
KnownBits RHSKnown = computeKnownBits(Op1, 0, &I);		KnownBits RHSKnown = computeKnownBits(Op1, 0, &I);
if ((*Op0C | RHSKnown.Zero).isAllOnesValue())		if ((*Op0C | RHSKnown.Zero).isAllOnesValue())
return BinaryOperator::CreateXor(Op1, Op0);		return BinaryOperator::CreateXor(Op1, Op0);
}		}
}		}
[... 46 lines not shown ...]	Instruction *InstCombiner::visitSub(BinaryOperator &I) {
{		{
Value *Y;		Value *Y;
// ((X | Y) - X) --> (~X & Y)		// ((X | Y) - X) --> (~X & Y)
if (match(Op0, m_OneUse(m_c_Or(m_Value(Y), m_Specific(Op1)))))		if (match(Op0, m_OneUse(m_c_Or(m_Value(Y), m_Specific(Op1)))))
return BinaryOperator::CreateAnd(		return BinaryOperator::CreateAnd(
Y, Builder.CreateNot(Op1, Op1->getName() + ".not"));		Y, Builder.CreateNot(Op1, Op1->getName() + ".not"));
}		}

if (Op1->hasOneUse()) {		{
Value *X = nullptr, *Y = nullptr, *Z = nullptr;
Constant *C = nullptr;

// (X - (Y - Z)) --> (X + (Z - Y)).
if (match(Op1, m_Sub(m_Value(Y), m_Value(Z))))
return BinaryOperator::CreateAdd(Op0,
Builder.CreateSub(Z, Y, Op1->getName()));

// (X - (X & Y)) --> (X & ~Y)		// (X - (X & Y)) --> (X & ~Y)
if (match(Op1, m_c_And(m_Value(Y), m_Specific(Op0))))		Value *Y;
return BinaryOperator::CreateAnd(Op0,		if (match(Op1, m_OneUse(m_c_And(m_Value(Y), m_Specific(Op0)))))
Builder.CreateNot(Y, Y->getName() + ".not"));		return BinaryOperator::CreateAnd(
		Op0, Builder.CreateNot(Y, Y->getName() + ".not"));
// 0 - (X sdiv C) -> (X sdiv -C) provided the negation doesn't overflow.
if (match(Op0, m_Zero())) {
Constant *Op11C;
if (match(Op1, m_SDiv(m_Value(X), m_Constant(Op11C))) &&
!Op11C->containsUndefElement() && Op11C->isNotMinSignedValue() &&
Op11C->isNotOneValue()) {
Instruction *BO =
BinaryOperator::CreateSDiv(X, ConstantExpr::getNeg(Op11C));
BO->setIsExact(cast<BinaryOperator>(Op1)->isExact());
return BO;
}
}

// 0 - (X << Y) -> (-X << Y) when X is freely negatable.
if (match(Op1, m_Shl(m_Value(X), m_Value(Y))) && match(Op0, m_Zero()))
if (Value *XNeg = dyn_castNegVal(X))
return BinaryOperator::CreateShl(XNeg, Y);

// Subtracting -1/0 is the same as adding 1/0:
// sub [nsw] Op0, sext(bool Y) -> add [nsw] Op0, zext(bool Y)
// 'nuw' is dropped in favor of the canonical form.
if (match(Op1, m_SExt(m_Value(Y))) &&
Y->getType()->getScalarSizeInBits() == 1) {
Value *Zext = Builder.CreateZExt(Y, I.getType());
BinaryOperator *Add = BinaryOperator::CreateAdd(Op0, Zext);
Add->setHasNoSignedWrap(I.hasNoSignedWrap());
return Add;
}
// sub [nsw] X, zext(bool Y) -> add [nsw] X, sext(bool Y)
// 'nuw' is dropped in favor of the canonical form.
if (match(Op1, m_ZExt(m_Value(Y))) && Y->getType()->isIntOrIntVectorTy(1)) {
Value *Sext = Builder.CreateSExt(Y, I.getType());
BinaryOperator *Add = BinaryOperator::CreateAdd(Op0, Sext);
Add->setHasNoSignedWrap(I.hasNoSignedWrap());
return Add;
}

// X - A*-B -> X + A*B
// X - -A*B -> X + A*B
Value *A, *B;
if (match(Op1, m_c_Mul(m_Value(A), m_Neg(m_Value(B)))))
return BinaryOperator::CreateAdd(Op0, Builder.CreateMul(A, B));

// X - A*C -> X + A*-C
// No need to handle commuted multiply because multiply handling will
// ensure constant will be move to the right hand side.
if (match(Op1, m_Mul(m_Value(A), m_Constant(C))) && !isa<ConstantExpr>(C)) {
Value *NewMul = Builder.CreateMul(A, ConstantExpr::getNeg(C));
return BinaryOperator::CreateAdd(Op0, NewMul);
}
}		}

{		{
// ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A		// ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
// ~A - Min/Max(O, ~A) -> Max/Min(A, ~O) - A		// ~A - Min/Max(O, ~A) -> Max/Min(A, ~O) - A
// Min/Max(~A, O) - ~A -> A - Max/Min(A, ~O)		// Min/Max(~A, O) - ~A -> A - Max/Min(A, ~O)
// Min/Max(O, ~A) - ~A -> A - Max/Min(A, ~O)		// Min/Max(O, ~A) - ~A -> A - Max/Min(A, ~O)
// So long as O here is freely invertible, this will be neutral or a win.		// So long as O here is freely invertible, this will be neutral or a win.
[... 72 lines not shown ...]	if (!I.hasNoSignedWrap() && willNotOverflowSignedSub(Op0, Op1, I)) {
I.setHasNoSignedWrap(true);		I.setHasNoSignedWrap(true);
}		}
if (!I.hasNoUnsignedWrap() && willNotOverflowUnsignedSub(Op0, Op1, I)) {		if (!I.hasNoUnsignedWrap() && willNotOverflowUnsignedSub(Op0, Op1, I)) {
Changed = true;		Changed = true;
I.setHasNoUnsignedWrap(true);		I.setHasNoUnsignedWrap(true);
}		}

return Changed ? &I : nullptr;		return Changed ? &I : nullptr;
}		}
		xbolva00: Just wondering if we could use some better name for this lambda than “cleanup”.
		lebedev.ri: I couldn't come up with a better one, any suggestions? `TryToNarrowDeduceFlags()`? This is where `goto` might make sense, but somehow i don't want to use it..
		xbolva00: Your idea is fine I think.

/// This eliminates floating-point negation in either 'fneg(X)' or		/// This eliminates floating-point negation in either 'fneg(X)' or
/// 'fsub(-0.0, X)' form by combining into a constant operand.		/// 'fsub(-0.0, X)' form by combining into a constant operand.
static Instruction *foldFNegIntoConstant(Instruction &I) {		static Instruction *foldFNegIntoConstant(Instruction &I) {
Value *X;		Value *X;
Constant *C;		Constant *C;

// Fold negation into constant operand. This is limited with one-use because		// Fold negation into constant operand. This is limited with one-use because
[... 170 lines not shown ...]
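Several of the special cases dropped from `visitSub` in this diff are plain two's-complement identities. The sketch below spells a few of them out on `uint32_t` as a sanity check; the function names are hypothetical, and the arithmetic shift by 31 is written out by hand so the code does not depend on how the compiler shifts negative values.

```cpp
#include <cassert>
#include <cstdint>

// 0 - (X >>u 31) == (X >>s 31): both give 0 or all-ones, depending on the sign bit.
inline bool signSmearHolds(uint32_t x) {
  uint32_t lshr = x >> 31;                      // 0 or 1
  uint32_t ashr = (x & 0x80000000u) ? ~0u : 0u; // arithmetic shift by 31, spelled out
  return (0u - lshr) == ashr;
}

// Y - (X + 1) == ~X + Y
inline bool incHolds(uint32_t x, uint32_t y) { return y - (x + 1u) == ~x + y; }

// Y - ~X == (X + 1) + Y
inline bool notHolds(uint32_t x, uint32_t y) { return y - ~x == (x + 1u) + y; }

// 0 - zext(i1 B) == sext(i1 B)
inline bool boolExtHolds(bool b) {
  uint32_t zext = b ? 1u : 0u;
  uint32_t sext = b ? ~0u : 0u;
  return (0u - zext) == sext;
}
```

Each helper returns `true` when the left-hand and right-hand forms agree for the given inputs.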

llvm/lib/Transforms/InstCombine/InstCombineInternal.h

[... 10 lines not shown ...]
/// This file provides internal interfaces used to implement the InstCombine.		/// This file provides internal interfaces used to implement the InstCombine.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_LIB_TRANSFORMS_INSTCOMBINE_INSTCOMBINEINTERNAL_H		#ifndef LLVM_LIB_TRANSFORMS_INSTCOMBINE_INSTCOMBINEINTERNAL_H
#define LLVM_LIB_TRANSFORMS_INSTCOMBINE_INSTCOMBINEINTERNAL_H		#define LLVM_LIB_TRANSFORMS_INSTCOMBINE_INSTCOMBINEINTERNAL_H

#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/TargetFolder.h"		#include "llvm/Analysis/TargetFolder.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Argument.h"		#include "llvm/IR/Argument.h"
#include "llvm/IR/BasicBlock.h"		#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
[... 479 lines not shown ...]	private:
bool transformConstExprCastCall(CallBase &Call);		bool transformConstExprCastCall(CallBase &Call);
Instruction *transformCallThroughTrampoline(CallBase &Call,		Instruction *transformCallThroughTrampoline(CallBase &Call,
IntrinsicInst &Tramp);		IntrinsicInst &Tramp);

Value *simplifyMaskedLoad(IntrinsicInst &II);		Value *simplifyMaskedLoad(IntrinsicInst &II);
Instruction *simplifyMaskedStore(IntrinsicInst &II);		Instruction *simplifyMaskedStore(IntrinsicInst &II);
Instruction *simplifyMaskedGather(IntrinsicInst &II);		Instruction *simplifyMaskedGather(IntrinsicInst &II);
Instruction *simplifyMaskedScatter(IntrinsicInst &II);		Instruction *simplifyMaskedScatter(IntrinsicInst &II);

/// Transform (zext icmp) to bitwise / integer operations in order to		/// Transform (zext icmp) to bitwise / integer operations in order to
/// eliminate it.		/// eliminate it.
///		///
/// \param ICI The icmp of the (zext icmp) pair we are interested in.		/// \param ICI The icmp of the (zext icmp) pair we are interested in.
/// \parem CI The zext of the (zext icmp) pair we are interested in.		/// \parem CI The zext of the (zext icmp) pair we are interested in.
/// \param DoTransform Pass false to just test whether the given (zext icmp)		/// \param DoTransform Pass false to just test whether the given (zext icmp)
/// would be transformed. Pass true to actually perform the transformation.		/// would be transformed. Pass true to actually perform the transformation.
///		///
[... 472 lines not shown ...]	private:
Value *EvaluateInDifferentType(Value *V, Type *Ty, bool isSigned);		Value *EvaluateInDifferentType(Value *V, Type *Ty, bool isSigned);

/// Returns a value X such that Val = X * Scale, or null if none.		/// Returns a value X such that Val = X * Scale, or null if none.
///		///
/// If the multiplication is known not to overflow then NoSignedWrap is set.		/// If the multiplication is known not to overflow then NoSignedWrap is set.
Value *Descale(Value *Val, APInt Scale, bool &NoSignedWrap);		Value *Descale(Value *Val, APInt Scale, bool &NoSignedWrap);
};		};

		namespace {

		// As a default, let's assume that we want to be somewhat aggressive,
		// and attempt to traverse up to 32 layers in attempt to sink negation.
		spatel: typo: attempt
		static constexpr unsigned NegatorDefaultMaxDepth = 32;

		// Let's guesstimate that most often we will end up producing fairly small
		// number of new instructions.
		static constexpr unsigned NegatorMaxNewNodesSSO = 16;

		} // namespace

		class Negator final {
		/// Top-to-bottom, def-to-use negated instruction tree we produced.
		SmallVector<Instruction *, NegatorMaxNewNodesSSO> NewInstructions;

		using BuilderTy = IRBuilder<TargetFolder, IRBuilderCallbackInserter>;
		BuilderTy Builder;

		Negator(LLVMContext &C, const DataLayout &DL);

		#if LLVM_ENABLE_STATS
		unsigned NumValuesVisitedInThisNegator = 0;
		~Negator();
		#endif

		using Result = std::pair<ArrayRef<Instruction *> /*NewInstructions*/,
		Value * /*NegatedRoot*/>;

		LLVM_NODISCARD Value *visit(Value *V, unsigned Depth);

		/// Recurse depth-first and attempt to sink the negation.
		/// FIXME: use worklist?
		LLVM_NODISCARD Optional<Result> run(Value *Root);

		Negator(const Negator &) = delete;
		Negator(Negator &&) = delete;
		Negator &operator=(const Negator &) = delete;
		Negator &operator=(Negator &&) = delete;

		public:
		/// Attempt to negate \p Root. Retuns nullptr if negation can't be performed,
		/// otherwise returns negated value.
		LLVM_NODISCARD static Value *Negate(Value *Root, InstCombiner &IC);
		};

} // end namespace llvm		} // end namespace llvm

#undef DEBUG_TYPE		#undef DEBUG_TYPE

#endif // LLVM_LIB_TRANSFORMS_INSTCOMBINE_INSTCOMBINEINTERNAL_H		#endif // LLVM_LIB_TRANSFORMS_INSTCOMBINE_INSTCOMBINEINTERNAL_H
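To make the shape of the `Negator` interface easier to follow, here is a toy model of the same idea on a tiny expression tree: try to produce `-e` without adding a separate negation, recurse depth-first, bail out early, and give up past a depth limit. This is a sketch under simplifying assumptions, not the LLVM implementation: `Expr`, `ToyNegator` and `eval` are hypothetical, use counts are ignored, and the select condition is passed in from outside.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical toy expression node; not an LLVM type.
struct Expr {
  enum Kind { Const, Var, Sub, Select } kind;
  long value;    // used by Const only
  const Expr *a; // Sub: lhs, Select: true arm
  const Expr *b; // Sub: rhs, Select: false arm
};

class ToyNegator {
  std::vector<std::unique_ptr<Expr>> pool; // owns the nodes created while negating
  unsigned maxDepth;

  const Expr *make(Expr e) {
    pool.push_back(std::make_unique<Expr>(e));
    return pool.back().get();
  }

public:
  explicit ToyNegator(unsigned limit = 32) : maxDepth(limit) {}

  // Returns an expression computing -e, or nullptr if the negation can't be sunk.
  const Expr *visit(const Expr *e, unsigned depth = 0) {
    switch (e->kind) {
    case Expr::Const: // constants can be negated directly
      return make({Expr::Const, -e->value, nullptr, nullptr});
    case Expr::Sub:   // -(a - b) == b - a, no recursion needed
      return make({Expr::Sub, 0, e->b, e->a});
    case Expr::Var:   // nothing to sink the negation into
      return nullptr;
    case Expr::Select: {
      if (depth > maxDepth) // the recursive cases respect the depth limit
        return nullptr;
      const Expr *na = visit(e->a, depth + 1);
      if (!na) // early bailout
        return nullptr;
      const Expr *nb = visit(e->b, depth + 1);
      if (!nb)
        return nullptr;
      return make({Expr::Select, 0, na, nb});
    }
    }
    return nullptr;
  }
};

// Evaluate a toy expression for a given variable value and select condition.
inline long eval(const Expr *e, long x, bool cond) {
  switch (e->kind) {
  case Expr::Const:  return e->value;
  case Expr::Var:    return x;
  case Expr::Sub:    return eval(e->a, x, cond) - eval(e->b, x, cond);
  case Expr::Select: return cond ? eval(e->a, x, cond) : eval(e->b, x, cond);
  }
  return 0;
}
```

A `nullptr` result plays the role that a failed `Negator::Negate` plays in the patch: the caller keeps the original `sub` untouched.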

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp

This file was added.

				//===- InstCombineNegator.cpp -----------------------------------*- C++ -*-===//
				//
				efriedma: InstCombineAddSub.cpp?
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements sinking of negation into expression trees,
				// as long as that can be done without increasing instruction count.
				//
				//===----------------------------------------------------------------------===//

				#include "InstCombineInternal.h"
				#include "llvm/ADT/APInt.h"
				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/None.h"
				#include "llvm/ADT/Optional.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/ADT/Twine.h"
				#include "llvm/ADT/iterator_range.h"
				#include "llvm/Analysis/TargetFolder.h"
				#include "llvm/Analysis/ValueTracking.h"
				#include "llvm/IR/Constant.h"
				#include "llvm/IR/Constants.h"
				#include "llvm/IR/DataLayout.h"
				#include "llvm/IR/DebugLoc.h"
				#include "llvm/IR/DerivedTypes.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/Instruction.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/LLVMContext.h"
				#include "llvm/IR/PatternMatch.h"
				#include "llvm/IR/Type.h"
				#include "llvm/IR/Value.h"
				#include "llvm/Support/Casting.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/Compiler.h"
				#include "llvm/Support/DebugCounter.h"
				#include "llvm/Support/ErrorHandling.h"
				#include "llvm/Support/raw_ostream.h"
				#include <functional>
				#include <utility>

				using namespace llvm;

				#define DEBUG_TYPE "instcombine"

				STATISTIC(NegatorTotalNegationsAttempted,
				"Negator: Number of negations attempted to be sinked");
				STATISTIC(NegatorNumTreesNegated,
				"Negator: Number of negations successfully sinked");
				STATISTIC(NegatorMaxDepthVisited, "Negator: Maximal traversal depth ever "
				"reached while attempting to sink negation");
				STATISTIC(NegatorTimesDepthLimitReached,
				"Negator: How many times did the traversal depth limit was reached "
				"during sinking");
				STATISTIC(
				NegatorNumValuesVisited,
				"Negator: Total number of values visited during attempts to sink negation");
				STATISTIC(NegatorMaxTotalValuesVisited,
				"Negator: Maximal number of values ever visited while attempting to "
				"sink negation");
				STATISTIC(NegatorNumInstructionsCreatedTotal,
				"Negator: Number of new negated instructions created, total");
				STATISTIC(NegatorMaxInstructionsCreated,
				"Negator: Maximal number of new instructions created during negation "
				"attempt");
				STATISTIC(NegatorNumInstructionsNegatedSuccess,
				"Negator: Number of new negated instructions created in successful "
				"negation sinking attempts");

				DEBUG_COUNTER(NegatorCounter, "instcombine-negator",
				"Controls Negator transformations in InstCombine pass");

				static cl::opt<bool>
				NegatorEnabled("instcombine-negator-enabled", cl::init(true),
				cl::desc("Should we attempt to sink negations?"));

				static cl::opt<unsigned>
				NegatorMaxDepth("instcombine-negator-max-depth",
				cl::init(NegatorDefaultMaxDepth),
				cl::desc("What is the maximal lookup depth when trying to "
				"check for viability of negation sinking."));

				Negator::Negator(LLVMContext &C, const DataLayout &DL)
				: Builder(C, TargetFolder(DL),
				IRBuilderCallbackInserter([&](Instruction *I) {
				++NegatorNumInstructionsCreatedTotal;
				NewInstructions.push_back(I);
				})) {}

				#if LLVM_ENABLE_STATS
				Negator::~Negator() {
				NegatorMaxTotalValuesVisited.updateMax(NumValuesVisitedInThisNegator);
				}
				#endif

				// FIXME: can this be reworked into a worklist-based algorithm while preserving
				// the depth-first, early bailout traversal?
				LLVM_NODISCARD Value *Negator::visit(Value *V, unsigned Depth) {
				NegatorMaxDepthVisited.updateMax(Depth);
				++NegatorNumValuesVisited;

				#if LLVM_ENABLE_STATS
				++NumValuesVisitedInThisNegator;
				#endif

				Value *X;

				// -(-(X)) -> X.
				if (match(V, m_Neg(m_Value(X))))
				return X;

				// Integral constants can be freely negated.
				if (match(V, m_AnyIntegralConstant()))
				return ConstantExpr::getNeg(cast<Constant>(V), /*HasNUW=*/false,
				/*HasNSW=*/false);

				// If we have a non-instruction, or it has other uses, then give up.
				if (!isa<Instruction>(V) || !V->hasOneUse())
				return nullptr;

				auto *I = cast<Instruction>(V);
				unsigned BitWidth = I->getType()->getScalarSizeInBits();
				const APInt *Op1Val;

				// We must preserve the insertion point and debug info that is set in the
				// builder at the time this function is called.
				InstCombiner::BuilderTy::InsertPointGuard Guard(Builder);
				// And since we are trying to negate instruction I, that tells us about the
				spatel: Remove/update comment.
				// insertion point and the debug info that we need to keep.
				Builder.SetInsertPoint(I);

				// In some cases we can give the answer without further recursion.
				switch (I->getOpcode()) {
				case Instruction::PHI:
				// `phi` is negatible if all the incoming values are negatible. We'd need to
				// ensure that we won't deadloop (pr12338.ll), so let's not bother for now.
				return nullptr; // FIXME: handle PHI!
				case Instruction::Add:
				// `inc` is always negatible.
				if (match(I->getOperand(1), m_One()))
				return Builder.CreateNot(I->getOperand(0), I->getName() + ".neg");
				break;
				case Instruction::Sub:
				// `sub` is always negatible.
				return Builder.CreateSub(I->getOperand(1), I->getOperand(0),
				I->getName() + ".neg");
				case Instruction::Xor:
				// `not` is always negatible.
				if (match(I, m_Not(m_Value(X))))
				return Builder.CreateAdd(X, ConstantInt::get(X->getType(), 1),
				I->getName() + ".neg");
				break;
				case Instruction::AShr:
				case Instruction::LShr:
				// Right-shift sign bit smear is negatible.
				if (match(I->getOperand(1), m_APInt(Op1Val)) && *Op1Val == BitWidth - 1) {
				Value *BO = I->getOpcode() == Instruction::AShr
				? Builder.CreateLShr(I->getOperand(0), I->getOperand(1))
				: Builder.CreateAShr(I->getOperand(0), I->getOperand(1));
				if (auto *NewInstr = dyn_cast<Instruction>(BO)) {
				NewInstr->copyIRFlags(I);
				NewInstr->setName(I->getName() + ".neg");
				}
				spatel: it's -> its
				return BO;
				}
				break;
				case Instruction::SDiv:
				// `sdiv` is negatible if divisor is not undef/INT_MIN/1.
				if (auto *Op1C = dyn_cast<Constant>(I->getOperand(1))) {
				if (!Op1C->containsUndefElement() && Op1C->isNotMinSignedValue() &&
				Op1C->isNotOneValue()) {
				Value *BO =
				Builder.CreateSDiv(I->getOperand(0), ConstantExpr::getNeg(Op1C),
				I->getName() + ".neg");
				if (auto *NewInstr = dyn_cast<Instruction>(BO))
				nikic: Noticed while looking through the tramp3d-v4 diff: This should be behind a one-use check, to avoid duplicating expensive division instructions.
				lebedev.ri: I was hesitant about this one indeed, it isn't a typo that one-use check has gone away here, because we generally consider only instruction count. @spatel thoughts? But the main, bigger question this touches is: "but what if all the uses would get negated by us? In future, can we somehow sanely model the whole negatible tree, not giving up at non-single-use instructions, but defer that to after we've finished building new tree?"
				spatel: I missed that logic difference, and I'm not getting a reviewable diff of the attached files with llvm-diff or other apps. Can you create an IR example/regression test for that?
				lebedev.ri: @spatel To be clear, the "bigger question" is pretty rhetorical, or at least not for this review. The actual question here is whether we should consider `sdiv` special, and even if we can negate it without increasing instruction count, we should only do so if there are no other uses of old `sdiv`.
				spatel: There is precedent for this kind of special treatment. In other words, not all opcodes are equal in terms of analysis (and secondary concern of codegen), and we will even increase instruction count to avoid some like div/rem (although those transforms are probably currently not safe with respect to poison). If it would be NFC-ish to keep the one-use check, then we should do that. Then, remove the limitation as a follow-up if that can be shown useful?
				lebedev.ri: Alright, i'll add one-use check just to get this moving :)
				NewInstr->setIsExact(I->isExact());
				return BO;
				spatel: it's -> its
				}
				}
				break;
				case Instruction::SExt:
				// `sext` of i1 is always negatible
				if (I->getOperand(0)->getType()->isIntOrIntVectorTy(1))
				return Builder.CreateZExt(I->getOperand(0), I->getType(),
				I->getName() + ".neg");
				break;
				case Instruction::ZExt:
				spatel: negatible one -> negatible if one
				spatel: it's -> its
				// `zext` of i1 is always negatible
				if (I->getOperand(0)->getType()->isIntOrIntVectorTy(1))
				return Builder.CreateSExt(I->getOperand(0), I->getType(),
				I->getName() + ".neg");
				break;
				default:
				break; // Other instructions require recursive reasoning.
				}

				// Rest of the logic is recursive, so if it's time to give up then it's time.
				if (Depth > NegatorMaxDepth) {
				LLVM_DEBUG(dbgs() << "Negator: reached maximal allowed traversal depth in "
				<< *V << ". Giving up.\n");
				++NegatorTimesDepthLimitReached;
				return nullptr;
				}

				switch (I->getOpcode()) {
				case Instruction::Select: {
				{
				// `abs`/`nabs` is always negatible.
				Value *LHS, *RHS;
				SelectPatternFlavor SPF =
				matchSelectPattern(I, LHS, RHS, /*CastOp=*/nullptr, Depth).Flavor;
				if (SPF == SPF_ABS || SPF == SPF_NABS) {
				auto *NewSelect = cast<SelectInst>(I->clone());
				// Just swap the operands of the select.
				NewSelect->swapValues();
				// Don't swap prof metadata, we didn't change the branch behavior.
				NewSelect->setName(I->getName() + ".neg");
				Builder.Insert(NewSelect);
				return NewSelect;
				}
				}
				// `select` is negatible if both hands of `select` are negatible.
				Value *NegOp1 = visit(I->getOperand(1), Depth + 1);
				if (!NegOp1) // Early return.
				return nullptr;
				Value *NegOp2 = visit(I->getOperand(2), Depth + 1);
				if (!NegOp2)
				return nullptr;
				// Do preserve the metadata!
				return Builder.CreateSelect(I->getOperand(0), NegOp1, NegOp2,
				I->getName() + ".neg", /*MDFrom=*/I);
				}
				case Instruction::Trunc: {
				// `trunc` is negatible if its operand is negatible.
				Value *NegOp = visit(I->getOperand(0), Depth + 1);
				if (!NegOp) // Early return.
				return nullptr;
				return Builder.CreateTrunc(NegOp, I->getType(), I->getName() + ".neg");
				spatel: typo: temporarily
				}
				case Instruction::Shl: {
				// `shl` is negatible if the first operand is negatible.
				Value *NegOp0 = visit(I->getOperand(0), Depth + 1);
				if (!NegOp0) // Early return.
				return nullptr;
				return Builder.CreateShl(NegOp0, I->getOperand(1), I->getName() + ".neg");
				}
				case Instruction::Add: {
				// `add` is negatible if both of its operands are negatible.
				Value *NegOp0 = visit(I->getOperand(0), Depth + 1);
				if (!NegOp0) // Early return.
				return nullptr;
				Value *NegOp1 = visit(I->getOperand(1), Depth + 1);
				if (!NegOp1)
				return nullptr;
				return Builder.CreateAdd(NegOp0, NegOp1, I->getName() + ".neg");
				}
				case Instruction::Mul: {
				// `mul` is negatible if one of its operands is negatible.
				Value *NegatedOp, *OtherOp;
				if (Value *NegOp0 = visit(I->getOperand(0), Depth + 1)) {
				NegatedOp = NegOp0;
				OtherOp = I->getOperand(1);
				} else if (Value *NegOp1 = visit(I->getOperand(1), Depth + 1)) {
				NegatedOp = NegOp1;
				OtherOp = I->getOperand(0);
				} else // Can't negate either of them.
				return nullptr;
				return Builder.CreateMul(NegatedOp, OtherOp, I->getName() + ".neg");
				}
				default:
				return nullptr; // Don't know, likely not negatible for free.
				}

				llvm_unreachable("Can't get here. We always return from switch.");
				};

				LLVM_NODISCARD Optional<Negator::Result> Negator::run(Value *Root) {
				Value *Negated = visit(Root, /*Depth=*/0);
				if (!Negated) {
				// We must cleanup newly-inserted instructions, to avoid any potential
				// endless combine looping.
				llvm::for_each(llvm::reverse(NewInstructions),
				[&](Instruction *I) { I->eraseFromParent(); });
				return llvm::None;
				}
				return std::make_pair(ArrayRef<Instruction *>(NewInstructions), Negated);
				};

				LLVM_NODISCARD Value *Negator::Negate(Value *Root, InstCombiner &IC) {
				++NegatorTotalNegationsAttempted;
				LLVM_DEBUG(dbgs() << "Negator: attempting to sink negation into " << *Root
				<< "\n");

				if (!NegatorEnabled || !DebugCounter::shouldExecute(NegatorCounter))
				return nullptr;

				Negator N(Root->getContext(), IC.getDataLayout());
				Optional<Result> Res = N.run(Root);
				if (!Res) { // Negation failed.
				LLVM_DEBUG(dbgs() << "Negator: failed to sink negation into " << *Root
				<< "\n");
				return nullptr;
				}

				LLVM_DEBUG(dbgs() << "Negator: successfully sunk negation into " << *Root
				<< "\n NEW: " << *Res->second << "\n");
				++NegatorNumTreesNegated;

				// We must temporarily unset the 'current' insertion point and DebugLoc of the
				// InstCombine's IRBuilder so that it won't interfere with the ones we have
				// already specified when producing negated instructions.
				InstCombiner::BuilderTy::InsertPointGuard Guard(IC.Builder);
				IC.Builder.ClearInsertionPoint();
				IC.Builder.SetCurrentDebugLocation(DebugLoc());

				// And finally, we must add newly-created instructions into the InstCombine's
				// worklist (in a proper order!) so it can attempt to combine them.
				LLVM_DEBUG(dbgs() << "Negator: Propagating " << Res->first.size()
				<< " instrs to InstCombine\n");
				NegatorMaxInstructionsCreated.updateMax(Res->first.size());
				NegatorNumInstructionsNegatedSuccess += Res->first.size();

				// They are in def-use order, so nothing fancy, just insert them in order.
				llvm::for_each(Res->first, [&](Instruction *I) { IC.Builder.Insert(I); });

				// And return the new root.
				return Res->second;
				};
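
To make the recursion above easier to follow, here is a minimal, self-contained sketch of the same idea on a toy expression tree. It is not the patch's code: `Expr`, `Arena`, `tryNegate` and `eval` are hypothetical names invented for illustration, and only constants, existing negations, `add` and `mul` are modelled. As in the patch, the cheap non-recursive cases are tried first, then the depth limit is checked, and only then are the operands visited.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical toy expression node (not an LLVM type). Arithmetic wraps
// modulo 2^64, like LLVM integer ops without nsw/nuw.
struct Expr {
  enum Kind { Const, Var, Neg, Add, Mul };
  Kind kind;
  uint64_t value;   // Const only
  std::size_t var;  // Var only: index into the environment
  const Expr *lhs;  // Neg/Add/Mul
  const Expr *rhs;  // Add/Mul
};

// Owns every node so the raw pointers above stay valid.
struct Arena {
  std::vector<std::unique_ptr<Expr>> nodes;
  const Expr *make(Expr::Kind k, uint64_t v, std::size_t var, const Expr *l,
                   const Expr *r) {
    nodes.push_back(std::unique_ptr<Expr>(new Expr{k, v, var, l, r}));
    return nodes.back().get();
  }
  const Expr *constant(uint64_t v) { return make(Expr::Const, v, 0, nullptr, nullptr); }
  const Expr *variable(std::size_t i) { return make(Expr::Var, 0, i, nullptr, nullptr); }
  const Expr *neg(const Expr *x) { return make(Expr::Neg, 0, 0, x, nullptr); }
  const Expr *add(const Expr *l, const Expr *r) { return make(Expr::Add, 0, 0, l, r); }
  const Expr *mul(const Expr *l, const Expr *r) { return make(Expr::Mul, 0, 0, l, r); }
};

// Returns an expression equal to -e that needs no extra negation node,
// or nullptr if the negation cannot be sunk for free.
const Expr *tryNegate(const Expr *e, Arena &a, unsigned depth,
                      unsigned maxDepth) {
  assert(e && "expected a non-null expression");
  // Non-recursive cases first.
  switch (e->kind) {
  case Expr::Const: return a.constant(uint64_t(0) - e->value);
  case Expr::Neg:   return e->lhs;   // -(-x) == x
  case Expr::Var:   return nullptr;  // would need a brand new negation
  default:          break;
  }
  // The rest is recursive, so give up once the depth budget is exhausted.
  if (depth > maxDepth)
    return nullptr;
  if (e->kind == Expr::Add) {
    // add is negatible if both operands are negatible.
    const Expr *l = tryNegate(e->lhs, a, depth + 1, maxDepth);
    if (!l) return nullptr;
    const Expr *r = tryNegate(e->rhs, a, depth + 1, maxDepth);
    if (!r) return nullptr;
    return a.add(l, r);
  }
  // mul is negatible if either operand is negatible.
  if (const Expr *l = tryNegate(e->lhs, a, depth + 1, maxDepth))
    return a.mul(l, e->rhs);
  if (const Expr *r = tryNegate(e->rhs, a, depth + 1, maxDepth))
    return a.mul(e->lhs, r);
  return nullptr;
}

// Evaluates an expression with wrapping 64-bit arithmetic.
uint64_t eval(const Expr *e, const std::vector<uint64_t> &env) {
  switch (e->kind) {
  case Expr::Const: return e->value;
  case Expr::Var:   return env.at(e->var);
  case Expr::Neg:   return uint64_t(0) - eval(e->lhs, env);
  case Expr::Add:   return eval(e->lhs, env) + eval(e->rhs, env);
  case Expr::Mul:   return eval(e->lhs, env) * eval(e->rhs, env);
  }
  return 0;
}
```

One difference worth noting: on failure the toy simply leaves any half-built nodes in the arena, whereas `Negator::run` above erases the newly inserted instructions so that they cannot trigger further combine iterations.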

llvm/test/Transforms/InstCombine/ARM/strcmp.ll

Show All 10 Lines

declare i32 @strcmp(i8*, i8*)		declare i32 @strcmp(i8*, i8*)

; strcmp("", x) -> -*x		; strcmp("", x) -> -*x
define arm_aapcscc i32 @test1(i8* %str2) {		define arm_aapcscc i32 @test1(i8* %str2) {
; CHECK-LABEL: @test1(		; CHECK-LABEL: @test1(
; CHECK-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1		; CHECK-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1
; CHECK-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32		; CHECK-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32
; CHECK-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]		; CHECK-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
; CHECK-NEXT: ret i32 [[TMP2]]		; CHECK-NEXT: ret i32 [[TMP2]]
;		;

%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0		%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0
%temp1 = call arm_apcscc i32 @strcmp(i8* %str1, i8* %str2)		%temp1 = call arm_apcscc i32 @strcmp(i8* %str1, i8* %str2)
ret i32 %temp1		ret i32 %temp1

}		}
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	;
ret i32 %temp1		ret i32 %temp1
}		}

; strcmp("", x) -> -*x		; strcmp("", x) -> -*x
define arm_aapcs_vfpcc i32 @test1_vfp(i8* %str2) {		define arm_aapcs_vfpcc i32 @test1_vfp(i8* %str2) {
; CHECK-LABEL: @test1_vfp(		; CHECK-LABEL: @test1_vfp(
; CHECK-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1		; CHECK-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1
; CHECK-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32		; CHECK-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32
; CHECK-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]		; CHECK-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
; CHECK-NEXT: ret i32 [[TMP2]]		; CHECK-NEXT: ret i32 [[TMP2]]
;		;

%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0		%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0
%temp1 = call arm_aapcs_vfpcc i32 @strcmp(i8* %str1, i8* %str2)		%temp1 = call arm_aapcs_vfpcc i32 @strcmp(i8* %str1, i8* %str2)
ret i32 %temp1		ret i32 %temp1

}		}
▲ Show 20 Lines • Show All 63 Lines • Show Last 20 Lines

llvm/test/Transforms/InstCombine/abs-1.ll

Show First 20 Lines • Show All 114 Lines • ▼ Show 20 Lines	;
%abs = select i1 %cmp, i8 %neg, i8 %x		%abs = select i1 %cmp, i8 %neg, i8 %x
ret i8 %abs		ret i8 %abs
}		}

define i32 @abs_canonical_5(i8 %x) {		define i32 @abs_canonical_5(i8 %x) {
; CHECK-LABEL: @abs_canonical_5(		; CHECK-LABEL: @abs_canonical_5(
; CHECK-NEXT: [[CMP:%.*]] = icmp slt i8 [[X:%.*]], 0		; CHECK-NEXT: [[CMP:%.*]] = icmp slt i8 [[X:%.*]], 0
; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[X]] to i32		; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[X]] to i32
; CHECK-NEXT: [[NEG:%.*]] = sub nsw i32 0, [[CONV]]		; CHECK-NEXT: [[NEG:%.*]] = sub i32 0, [[CONV]]
; CHECK-NEXT: [[ABS:%.*]] = select i1 [[CMP]], i32 [[NEG]], i32 [[CONV]]		; CHECK-NEXT: [[ABS:%.*]] = select i1 [[CMP]], i32 [[NEG]], i32 [[CONV]]
; CHECK-NEXT: ret i32 [[ABS]]		; CHECK-NEXT: ret i32 [[ABS]]
;		;
%cmp = icmp sgt i8 %x, 0		%cmp = icmp sgt i8 %x, 0
%conv = sext i8 %x to i32		%conv = sext i8 %x to i32
%neg = sub i32 0, %conv		%neg = sub i32 0, %conv
%abs = select i1 %cmp, i32 %conv, i32 %neg		%abs = select i1 %cmp, i32 %conv, i32 %neg
ret i32 %abs		ret i32 %abs
▲ Show 20 Lines • Show All 147 Lines • ▼ Show 20 Lines	;
%abs = select i1 %cmp, i8 %x, i8 %neg		%abs = select i1 %cmp, i8 %x, i8 %neg
ret i8 %abs		ret i8 %abs
}		}

define i32 @nabs_canonical_5(i8 %x) {		define i32 @nabs_canonical_5(i8 %x) {
; CHECK-LABEL: @nabs_canonical_5(		; CHECK-LABEL: @nabs_canonical_5(
; CHECK-NEXT: [[CMP:%.*]] = icmp slt i8 [[X:%.*]], 0		; CHECK-NEXT: [[CMP:%.*]] = icmp slt i8 [[X:%.*]], 0
; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[X]] to i32		; CHECK-NEXT: [[CONV:%.*]] = sext i8 [[X]] to i32
; CHECK-NEXT: [[NEG:%.*]] = sub nsw i32 0, [[CONV]]		; CHECK-NEXT: [[NEG:%.*]] = sub i32 0, [[CONV]]
; CHECK-NEXT: [[ABS:%.*]] = select i1 [[CMP]], i32 [[CONV]], i32 [[NEG]]		; CHECK-NEXT: [[ABS:%.*]] = select i1 [[CMP]], i32 [[CONV]], i32 [[NEG]]
; CHECK-NEXT: ret i32 [[ABS]]		; CHECK-NEXT: ret i32 [[ABS]]
;		;
%cmp = icmp sgt i8 %x, 0		%cmp = icmp sgt i8 %x, 0
%conv = sext i8 %x to i32		%conv = sext i8 %x to i32
%neg = sub i32 0, %conv		%neg = sub i32 0, %conv
%abs = select i1 %cmp, i32 %neg, i32 %conv		%abs = select i1 %cmp, i32 %neg, i32 %conv
ret i32 %abs		ret i32 %abs
▲ Show 20 Lines • Show All 233 Lines • ▼ Show 20 Lines	;
%r = sub nsw nuw i12 %xor, %sh		%r = sub nsw nuw i12 %xor, %sh
ret i12 %r		ret i12 %r
}		}

define i8 @negate_abs(i8 %x) {		define i8 @negate_abs(i8 %x) {
; CHECK-LABEL: @negate_abs(		; CHECK-LABEL: @negate_abs(
; CHECK-NEXT: [[N:%.*]] = sub i8 0, [[X:%.*]]		; CHECK-NEXT: [[N:%.*]] = sub i8 0, [[X:%.*]]
; CHECK-NEXT: [[C:%.*]] = icmp slt i8 [[X]], 0		; CHECK-NEXT: [[C:%.*]] = icmp slt i8 [[X]], 0
; CHECK-NEXT: [[S:%.*]] = select i1 [[C]], i8 [[X]], i8 [[N]]		; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[C]], i8 [[X]], i8 [[N]]
; CHECK-NEXT: ret i8 [[S]]		; CHECK-NEXT: ret i8 [[TMP1]]
;		;
%n = sub i8 0, %x		%n = sub i8 0, %x
%c = icmp slt i8 %x, 0		%c = icmp slt i8 %x, 0
%s = select i1 %c, i8 %n, i8 %x		%s = select i1 %c, i8 %n, i8 %x
%r = sub i8 0, %s		%r = sub i8 0, %s
ret i8 %r		ret i8 %r
}		}

define <2 x i8> @negate_nabs(<2 x i8> %x) {		define <2 x i8> @negate_nabs(<2 x i8> %x) {
; CHECK-LABEL: @negate_nabs(		; CHECK-LABEL: @negate_nabs(
; CHECK-NEXT: [[N:%.*]] = sub <2 x i8> zeroinitializer, [[X:%.*]]		; CHECK-NEXT: [[N:%.*]] = sub <2 x i8> zeroinitializer, [[X:%.*]]
; CHECK-NEXT: [[C:%.*]] = icmp slt <2 x i8> [[X]], zeroinitializer		; CHECK-NEXT: [[C:%.*]] = icmp slt <2 x i8> [[X]], zeroinitializer
; CHECK-NEXT: [[S:%.*]] = select <2 x i1> [[C]], <2 x i8> [[N]], <2 x i8> [[X]]		; CHECK-NEXT: [[TMP1:%.*]] = select <2 x i1> [[C]], <2 x i8> [[N]], <2 x i8> [[X]]
; CHECK-NEXT: ret <2 x i8> [[S]]		; CHECK-NEXT: ret <2 x i8> [[TMP1]]
;		;
%n = sub <2 x i8> zeroinitializer, %x		%n = sub <2 x i8> zeroinitializer, %x
%c = icmp slt <2 x i8> %x, zeroinitializer		%c = icmp slt <2 x i8> %x, zeroinitializer
%s = select <2 x i1> %c, <2 x i8> %x, <2 x i8> %n		%s = select <2 x i1> %c, <2 x i8> %x, <2 x i8> %n
%r = sub <2 x i8> zeroinitializer, %s		%r = sub <2 x i8> zeroinitializer, %s
ret <2 x i8> %r		ret <2 x i8> %r
}		}
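
The `negate_abs` / `negate_nabs` expectations rely on the fact that negating an `abs`-style select equals the same select with its arms swapped. A small sketch of that identity on 8-bit values follows; the helper names (`negWrap`, `abs8`, `nabs8`, `checkSwapIdentity`) are made up for this illustration, and wrapping negation is modelled through unsigned arithmetic on the assumption of a two's-complement target.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping negation on i8 (so negWrap(-128) == -128), computed in unsigned
// arithmetic to avoid signed-overflow UB.
inline int8_t negWrap(int8_t x) {
  return static_cast<int8_t>(static_cast<uint8_t>(0u - static_cast<uint8_t>(x)));
}

// select (x < 0), -x, x
inline int8_t abs8(int8_t x) { return x < 0 ? negWrap(x) : x; }

// select (x < 0), x, -x   (same condition, arms swapped)
inline int8_t nabs8(int8_t x) { return x < 0 ? x : negWrap(x); }

// Exhaustively checks that 0 - abs(x) == nabs(x) and 0 - nabs(x) == abs(x).
inline void checkSwapIdentity() {
  for (int v = -128; v <= 127; ++v) {
    const int8_t x = static_cast<int8_t>(v);
    assert(negWrap(abs8(x)) == nabs8(x));
    assert(negWrap(nabs8(x)) == abs8(x));
  }
}
```

Because the identity holds for every input, including the minimum value, swapping the select operands is enough and no new `sub` is needed.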

Show All 11 Lines

llvm/test/Transforms/InstCombine/and-or-icmps.ll

	Show First 20 Lines • Show All 202 Lines • ▼ Show 20 Lines
	; we'd get into foldAndOfICmps() without running InstSimplify			; we'd get into foldAndOfICmps() without running InstSimplify
	; on an 'and' that should have been killed. It's not obvious			; on an 'and' that should have been killed. It's not obvious
	; why, but removing anything hides the bug, hence the long test.			; why, but removing anything hides the bug, hence the long test.

	define void @simplify_before_foldAndOfICmps() {			define void @simplify_before_foldAndOfICmps() {
	; CHECK-LABEL: @simplify_before_foldAndOfICmps(			; CHECK-LABEL: @simplify_before_foldAndOfICmps(
	; CHECK-NEXT: [[A8:%.*]] = alloca i16, align 2			; CHECK-NEXT: [[A8:%.*]] = alloca i16, align 2
	; CHECK-NEXT: [[L7:%.*]] = load i16, i16* [[A8]], align 2			; CHECK-NEXT: [[L7:%.*]] = load i16, i16* [[A8]], align 2
	; CHECK-NEXT: [[C10:%.*]] = icmp ult i16 [[L7]], 2			; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i16 [[L7]], -1
				; CHECK-NEXT: [[B11:%.*]] = zext i1 [[TMP1]] to i16
				; CHECK-NEXT: [[C10:%.*]] = icmp ugt i16 [[L7]], [[B11]]
				; CHECK-NEXT: [[C5:%.*]] = icmp slt i16 [[L7]], 1
				; CHECK-NEXT: [[C11:%.*]] = icmp ne i16 [[L7]], 0
	; CHECK-NEXT: [[C7:%.*]] = icmp slt i16 [[L7]], 0			; CHECK-NEXT: [[C7:%.*]] = icmp slt i16 [[L7]], 0
	; CHECK-NEXT: [[C18:%.*]] = or i1 [[C7]], [[C10]]			; CHECK-NEXT: [[B15:%.*]] = xor i1 [[C7]], [[C10]]
	; CHECK-NEXT: [[L7_LOBIT:%.*]] = ashr i16 [[L7]], 15			; CHECK-NEXT: [[B19:%.*]] = xor i1 [[C11]], [[B15]]
	; CHECK-NEXT: [[TMP1:%.*]] = sext i16 [[L7_LOBIT]] to i64			; CHECK-NEXT: [[TMP2:%.*]] = and i1 [[C10]], [[C5]]
	; CHECK-NEXT: [[G26:%.*]] = getelementptr i1, i1* null, i64 [[TMP1]]			; CHECK-NEXT: [[C3:%.*]] = and i1 [[B19]], [[TMP2]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i1 [[C10]], true
				; CHECK-NEXT: [[C18:%.*]] = or i1 [[C7]], [[TMP3]]
				; CHECK-NEXT: [[TMP4:%.*]] = sext i1 [[C3]] to i64
				; CHECK-NEXT: [[G26:%.*]] = getelementptr i1, i1* null, i64 [[TMP4]]
	; CHECK-NEXT: store i16 [[L7]], i16* undef, align 2			; CHECK-NEXT: store i16 [[L7]], i16* undef, align 2
	; CHECK-NEXT: store i1 [[C18]], i1* undef, align 1			; CHECK-NEXT: store i1 [[C18]], i1* undef, align 1
	; CHECK-NEXT: store i1* [[G26]], i1** undef, align 8			; CHECK-NEXT: store i1* [[G26]], i1** undef, align 8
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	%A8 = alloca i16			%A8 = alloca i16
	%L7 = load i16, i16* %A8			%L7 = load i16, i16* %A8
	%G21 = getelementptr i16, i16* %A8, i8 -1			%G21 = getelementptr i16, i16* %A8, i8 -1
	▲ Show 20 Lines • Show All 146 Lines • Show Last 20 Lines

llvm/test/Transforms/InstCombine/high-bit-signmask-with-trunc.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt %s -instcombine -S | FileCheck %s			; RUN: opt %s -instcombine -S | FileCheck %s

	define i32 @t0(i64 %x) {			define i32 @t0(i64 %x) {
	; CHECK-LABEL: @t0(			; CHECK-LABEL: @t0(
	; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[X:%.*]], 63
	; CHECK-NEXT: [[R:%.*]] = trunc i64 [[TMP1]] to i32			; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[TMP1]] to i32
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[TMP2]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}
	define i32 @t1_exact(i64 %x) {			define i32 @t1_exact(i64 %x) {
	; CHECK-LABEL: @t1_exact(			; CHECK-LABEL: @t1_exact(
	; CHECK-NEXT: [[TMP1:%.*]] = ashr exact i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = ashr exact i64 [[X:%.*]], 63
	; CHECK-NEXT: [[R:%.*]] = trunc i64 [[TMP1]] to i32			; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[TMP1]] to i32
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[TMP2]]
	;			;
	%t0 = lshr exact i64 %x, 63			%t0 = lshr exact i64 %x, 63
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}
	define i32 @t2(i64 %x) {			define i32 @t2(i64 %x) {
	; CHECK-LABEL: @t2(			; CHECK-LABEL: @t2(
	; CHECK-NEXT: [[TMP1:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: [[R:%.*]] = trunc i64 [[TMP1]] to i32			; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[TMP1]] to i32
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[TMP2]]
	;			;
	%t0 = ashr i64 %x, 63			%t0 = ashr i64 %x, 63
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}
	define i32 @t3_exact(i64 %x) {			define i32 @t3_exact(i64 %x) {
	; CHECK-LABEL: @t3_exact(			; CHECK-LABEL: @t3_exact(
	; CHECK-NEXT: [[TMP1:%.*]] = lshr exact i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = lshr exact i64 [[X:%.*]], 63
	; CHECK-NEXT: [[R:%.*]] = trunc i64 [[TMP1]] to i32			; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[TMP1]] to i32
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[TMP2]]
	;			;
	%t0 = ashr exact i64 %x, 63			%t0 = ashr exact i64 %x, 63
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}

	define <2 x i32> @t4(<2 x i64> %x) {			define <2 x i32> @t4(<2 x i64> %x) {
	; CHECK-LABEL: @t4(			; CHECK-LABEL: @t4(
	; CHECK-NEXT: [[TMP1:%.*]] = ashr <2 x i64> [[X:%.*]], <i64 63, i64 63>			; CHECK-NEXT: [[TMP1:%.*]] = ashr <2 x i64> [[X:%.*]], <i64 63, i64 63>
	; CHECK-NEXT: [[R:%.*]] = trunc <2 x i64> [[TMP1]] to <2 x i32>			; CHECK-NEXT: [[TMP2:%.*]] = trunc <2 x i64> [[TMP1]] to <2 x i32>
	; CHECK-NEXT: ret <2 x i32> [[R]]			; CHECK-NEXT: ret <2 x i32> [[TMP2]]
	;			;
	%t0 = lshr <2 x i64> %x, <i64 63, i64 63>			%t0 = lshr <2 x i64> %x, <i64 63, i64 63>
	%t1 = trunc <2 x i64> %t0 to <2 x i32>			%t1 = trunc <2 x i64> %t0 to <2 x i32>
	%r = sub <2 x i32> zeroinitializer, %t1			%r = sub <2 x i32> zeroinitializer, %t1
	ret <2 x i32> %r			ret <2 x i32> %r
	}			}

	define <2 x i32> @t5(<2 x i64> %x) {			define <2 x i32> @t5(<2 x i64> %x) {
	Show All 11 Lines

	declare void @use64(i64)			declare void @use64(i64)
	declare void @use32(i32)			declare void @use32(i32)

	define i32 @t6(i64 %x) {			define i32 @t6(i64 %x) {
	; CHECK-LABEL: @t6(			; CHECK-LABEL: @t6(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: call void @use64(i64 [[T0]])			; CHECK-NEXT: call void @use64(i64 [[T0]])
	; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[X]], 63			; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32
	; CHECK-NEXT: [[R:%.*]] = trunc i64 [[TMP1]] to i32			; CHECK-NEXT: [[R:%.*]] = sub i32 0, [[T1]]
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	call void @use64(i64 %t0)			call void @use64(i64 %t0)
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}

	define i32 @n7(i64 %x) {			define i32 @n7(i64 %x) {
	; CHECK-LABEL: @n7(			; CHECK-LABEL: @n7(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32			; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32
	; CHECK-NEXT: call void @use32(i32 [[T1]])			; CHECK-NEXT: call void @use32(i32 [[T1]])
	; CHECK-NEXT: [[R:%.*]] = sub nsw i32 0, [[T1]]			; CHECK-NEXT: [[R:%.*]] = sub i32 0, [[T1]]
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	call void @use32(i32 %t1)			call void @use32(i32 %t1)
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}

	define i32 @n8(i64 %x) {			define i32 @n8(i64 %x) {
	; CHECK-LABEL: @n8(			; CHECK-LABEL: @n8(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: call void @use64(i64 [[T0]])			; CHECK-NEXT: call void @use64(i64 [[T0]])
	; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32			; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32
	; CHECK-NEXT: call void @use32(i32 [[T1]])			; CHECK-NEXT: call void @use32(i32 [[T1]])
	; CHECK-NEXT: [[R:%.*]] = sub nsw i32 0, [[T1]]			; CHECK-NEXT: [[R:%.*]] = sub i32 0, [[T1]]
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	call void @use64(i64 %t0)			call void @use64(i64 %t0)
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	call void @use32(i32 %t1)			call void @use32(i32 %t1)
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}

	define i32 @n9(i64 %x) {			define i32 @n9(i64 %x) {
	; CHECK-LABEL: @n9(			; CHECK-LABEL: @n9(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 62			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 62
	; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32			; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32
	; CHECK-NEXT: [[R:%.*]] = sub nsw i32 0, [[T1]]			; CHECK-NEXT: [[R:%.*]] = sub i32 0, [[T1]]
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[R]]
	;			;
	%t0 = lshr i64 %x, 62			%t0 = lshr i64 %x, 62
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	%r = sub i32 0, %t1			%r = sub i32 0, %t1
	ret i32 %r			ret i32 %r
	}			}

	define i32 @n10(i64 %x) {			define i32 @n10(i64 %x) {
	; CHECK-LABEL: @n10(			; CHECK-LABEL: @n10(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[X:%.*]], 63
	; CHECK-NEXT: [[T1:%.*]] = trunc i64 [[T0]] to i32			; CHECK-NEXT: [[TMP2:%.*]] = trunc i64 [[TMP1]] to i32
	; CHECK-NEXT: [[R:%.*]] = xor i32 [[T1]], 1			; CHECK-NEXT: [[R:%.*]] = add i32 [[TMP2]], 1
	; CHECK-NEXT: ret i32 [[R]]			; CHECK-NEXT: ret i32 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	%t1 = trunc i64 %t0 to i32			%t1 = trunc i64 %t0 to i32
	%r = sub i32 1, %t1			%r = sub i32 1, %t1
	ret i32 %r			ret i32 %r
	}			}

llvm/test/Transforms/InstCombine/high-bit-signmask.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt %s -instcombine -S | FileCheck %s			; RUN: opt %s -instcombine -S | FileCheck %s

	define i64 @t0(i64 %x) {			define i64 @t0(i64 %x) {
	; CHECK-LABEL: @t0(			; CHECK-LABEL: @t0(
	; CHECK-NEXT: [[R:%.*]] = ashr i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[X:%.*]], 63
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[TMP1]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}
	define i64 @t0_exact(i64 %x) {			define i64 @t0_exact(i64 %x) {
	; CHECK-LABEL: @t0_exact(			; CHECK-LABEL: @t0_exact(
	; CHECK-NEXT: [[R:%.*]] = ashr exact i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = ashr exact i64 [[X:%.*]], 63
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[TMP1]]
	;			;
	%t0 = lshr exact i64 %x, 63			%t0 = lshr exact i64 %x, 63
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}
	define i64 @t2(i64 %x) {			define i64 @t2(i64 %x) {
	; CHECK-LABEL: @t2(			; CHECK-LABEL: @t2(
	; CHECK-NEXT: [[R:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[TMP1]]
	;			;
	%t0 = ashr i64 %x, 63			%t0 = ashr i64 %x, 63
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}
	define i64 @t3_exact(i64 %x) {			define i64 @t3_exact(i64 %x) {
	; CHECK-LABEL: @t3_exact(			; CHECK-LABEL: @t3_exact(
	; CHECK-NEXT: [[R:%.*]] = lshr exact i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = lshr exact i64 [[X:%.*]], 63
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[TMP1]]
	;			;
	%t0 = ashr exact i64 %x, 63			%t0 = ashr exact i64 %x, 63
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}

	define <2 x i64> @t4(<2 x i64> %x) {			define <2 x i64> @t4(<2 x i64> %x) {
	; CHECK-LABEL: @t4(			; CHECK-LABEL: @t4(
	; CHECK-NEXT: [[R:%.*]] = ashr <2 x i64> [[X:%.*]], <i64 63, i64 63>			; CHECK-NEXT: [[TMP1:%.*]] = ashr <2 x i64> [[X:%.*]], <i64 63, i64 63>
	; CHECK-NEXT: ret <2 x i64> [[R]]			; CHECK-NEXT: ret <2 x i64> [[TMP1]]
	;			;
	%t0 = lshr <2 x i64> %x, <i64 63, i64 63>			%t0 = lshr <2 x i64> %x, <i64 63, i64 63>
	%r = sub <2 x i64> zeroinitializer, %t0			%r = sub <2 x i64> zeroinitializer, %t0
	ret <2 x i64> %r			ret <2 x i64> %r
	}			}

	define <2 x i64> @t5(<2 x i64> %x) {			define <2 x i64> @t5(<2 x i64> %x) {
	; CHECK-LABEL: @t5(			; CHECK-LABEL: @t5(
	; CHECK-NEXT: [[T0:%.*]] = lshr <2 x i64> [[X:%.*]], <i64 63, i64 undef>			; CHECK-NEXT: [[T0:%.*]] = lshr <2 x i64> [[X:%.*]], <i64 63, i64 undef>
	; CHECK-NEXT: [[R:%.*]] = sub <2 x i64> <i64 0, i64 undef>, [[T0]]			; CHECK-NEXT: [[R:%.*]] = sub <2 x i64> <i64 0, i64 undef>, [[T0]]
	; CHECK-NEXT: ret <2 x i64> [[R]]			; CHECK-NEXT: ret <2 x i64> [[R]]
	;			;
	%t0 = lshr <2 x i64> %x, <i64 63, i64 undef>			%t0 = lshr <2 x i64> %x, <i64 63, i64 undef>
	%r = sub <2 x i64> <i64 0, i64 undef>, %t0			%r = sub <2 x i64> <i64 0, i64 undef>, %t0
	ret <2 x i64> %r			ret <2 x i64> %r
	}			}

	declare void @use64(i64)			declare void @use64(i64)
	declare void @use32(i64)			declare void @use32(i64)

	define i64 @t6(i64 %x) {			define i64 @t6(i64 %x) {
	; CHECK-LABEL: @t6(			; CHECK-LABEL: @t6(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: call void @use64(i64 [[T0]])			; CHECK-NEXT: call void @use64(i64 [[T0]])
	; CHECK-NEXT: [[R:%.*]] = ashr i64 [[X]], 63			; CHECK-NEXT: [[R:%.*]] = sub i64 0, [[T0]]
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	call void @use64(i64 %t0)			call void @use64(i64 %t0)
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}

	define i64 @n7(i64 %x) {			define i64 @n7(i64 %x) {
	; CHECK-LABEL: @n7(			; CHECK-LABEL: @n7(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: call void @use32(i64 [[T0]])			; CHECK-NEXT: call void @use32(i64 [[T0]])
	; CHECK-NEXT: [[R:%.*]] = ashr i64 [[X]], 63			; CHECK-NEXT: [[R:%.*]] = sub i64 0, [[T0]]
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	call void @use32(i64 %t0)			call void @use32(i64 %t0)
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}

	define i64 @n8(i64 %x) {			define i64 @n8(i64 %x) {
	; CHECK-LABEL: @n8(			; CHECK-LABEL: @n8(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63
	; CHECK-NEXT: call void @use64(i64 [[T0]])			; CHECK-NEXT: call void @use64(i64 [[T0]])
	; CHECK-NEXT: call void @use32(i64 [[T0]])			; CHECK-NEXT: call void @use32(i64 [[T0]])
	; CHECK-NEXT: [[R:%.*]] = ashr i64 [[X]], 63			; CHECK-NEXT: [[R:%.*]] = sub i64 0, [[T0]]
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	call void @use64(i64 %t0)			call void @use64(i64 %t0)
	call void @use32(i64 %t0)			call void @use32(i64 %t0)
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}

	define i64 @n9(i64 %x) {			define i64 @n9(i64 %x) {
	; CHECK-LABEL: @n9(			; CHECK-LABEL: @n9(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 62			; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 62
	; CHECK-NEXT: [[R:%.*]] = sub nsw i64 0, [[T0]]			; CHECK-NEXT: [[R:%.*]] = sub i64 0, [[T0]]
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[R]]
	;			;
	%t0 = lshr i64 %x, 62			%t0 = lshr i64 %x, 62
	%r = sub i64 0, %t0			%r = sub i64 0, %t0
	ret i64 %r			ret i64 %r
	}			}

	define i64 @n10(i64 %x) {			define i64 @n10(i64 %x) {
	; CHECK-LABEL: @n10(			; CHECK-LABEL: @n10(
	; CHECK-NEXT: [[T0:%.*]] = lshr i64 [[X:%.*]], 63			; CHECK-NEXT: [[TMP1:%.*]] = ashr i64 [[X:%.*]], 63
	; CHECK-NEXT: [[R:%.*]] = xor i64 [[T0]], 1			; CHECK-NEXT: [[R:%.*]] = add nsw i64 [[TMP1]], 1
	; CHECK-NEXT: ret i64 [[R]]			; CHECK-NEXT: ret i64 [[R]]
	;			;
	%t0 = lshr i64 %x, 63			%t0 = lshr i64 %x, 63
	%r = sub i64 1, %t0			%r = sub i64 1, %t0
	ret i64 %r			ret i64 %r
	}			}
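
The sign-mask tests in this file and in high-bit-signmask-with-trunc.ll come down to three small arithmetic identities on a 64-bit value. The sketch below spells them out; `lshr63`, `ashr63` and `signMaskIdentitiesHold` are hypothetical helper names, and the arithmetic shift is written out explicitly so the code does not depend on how the compiler shifts negative signed values.

```cpp
#include <cstdint>

// Logical shift right by 63: yields 0 or 1 (the sign bit).
inline uint64_t lshr63(uint64_t x) { return x >> 63; }

// Arithmetic shift right by 63: yields 0 or all-ones (the sign bit splatted).
inline uint64_t ashr63(uint64_t x) {
  return (x >> 63) ? ~uint64_t(0) : uint64_t(0);
}

// Checks the identities the tests exercise, with wrapping arithmetic:
//   0 - lshr(x, 63) == ashr(x, 63)      (t0, t4)
//   0 - ashr(x, 63) == lshr(x, 63)      (t2, t3_exact)
//   1 - lshr(x, 63) == ashr(x, 63) + 1  (n10)
inline bool signMaskIdentitiesHold(uint64_t x) {
  const uint64_t zero = 0, one = 1;
  return zero - lshr63(x) == ashr63(x) &&
         zero - ashr63(x) == lshr63(x) &&
         one - lshr63(x) == ashr63(x) + one;
}
```

The identities depend on the shift amount being bitwidth minus one, so that the shifted value is either 0 or 1 (or 0 or -1); that is why `n9`, which shifts by 62, keeps its `sub`.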

llvm/test/Transforms/InstCombine/icmp.ll

Show First 20 Lines • Show All 92 Lines • ▼ Show 20 Lines
; CHECK-NEXT: ret <2 x i1> undef		; CHECK-NEXT: ret <2 x i1> undef
;		;
%V = icmp eq <2 x i64> zeroinitializer, undef		%V = icmp eq <2 x i64> zeroinitializer, undef
ret <2 x i1> %V		ret <2 x i1> %V
}		}

define i32 @test6(i32 %a, i32 %b) {		define i32 @test6(i32 %a, i32 %b) {
; CHECK-LABEL: @test6(		; CHECK-LABEL: @test6(
; CHECK-NEXT: [[E:%.*]] = ashr i32 [[A:%.*]], 31		; CHECK-NEXT: [[TMP1:%.*]] = ashr i32 [[A:%.*]], 31
; CHECK-NEXT: [[F:%.*]] = and i32 [[E]], [[B:%.*]]		; CHECK-NEXT: [[F:%.*]] = and i32 [[TMP1]], [[B:%.*]]
; CHECK-NEXT: ret i32 [[F]]		; CHECK-NEXT: ret i32 [[F]]
;		;
%c = icmp sle i32 %a, -1		%c = icmp sle i32 %a, -1
%d = zext i1 %c to i32		%d = zext i1 %c to i32
%e = sub i32 0, %d		%e = sub i32 0, %d
%f = and i32 %e, %b		%f = and i32 %e, %b
ret i32 %f		ret i32 %f
}		}
▲ Show 20 Lines • Show All 396 Lines • ▼ Show 20 Lines	;
%cmp = icmp eq i32* %p1, getelementptr inbounds ([1000 x i32], [1000 x i32]* @X, i64 1, i64 0)		%cmp = icmp eq i32* %p1, getelementptr inbounds ([1000 x i32], [1000 x i32]* @X, i64 1, i64 0)
ret i1 %cmp		ret i1 %cmp
}		}

; Note: offs can be negative, LLVM used to make an incorrect assumption that		; Note: offs can be negative, LLVM used to make an incorrect assumption that
; unsigned overflow does not happen during offset computation		; unsigned overflow does not happen during offset computation
define i1 @test24_neg_offs(i32* %p, i64 %offs) {		define i1 @test24_neg_offs(i32* %p, i64 %offs) {
; CHECK-LABEL: @test24_neg_offs(		; CHECK-LABEL: @test24_neg_offs(
; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[OFFS:%.*]], -2		; CHECK-NEXT: [[TMP1:%.*]] = mul i64 [[OFFS:%.*]], -4
		; CHECK-NEXT: [[CMP:%.*]] = icmp eq i64 [[TMP1]], 8
; CHECK-NEXT: ret i1 [[CMP]]		; CHECK-NEXT: ret i1 [[CMP]]
;		;
%p1 = getelementptr inbounds i32, i32* %p, i64 %offs		%p1 = getelementptr inbounds i32, i32* %p, i64 %offs
%conv1 = ptrtoint i32* %p to i64		%conv1 = ptrtoint i32* %p to i64
%conv2 = ptrtoint i32* %p1 to i64		%conv2 = ptrtoint i32* %p1 to i64
%delta = sub i64 %conv1, %conv2		%delta = sub i64 %conv1, %conv2
%cmp = icmp eq i64 %delta, 8		%cmp = icmp eq i64 %delta, 8
ret i1 %cmp		ret i1 %cmp
▲ Show 20 Lines • Show All 2,377 Lines • ▼ Show 20 Lines

; Note: fptosi is used in various tests below to ensure that operand complexity		; Note: fptosi is used in various tests below to ensure that operand complexity
; canonicalization does not kick in, which would make some of the tests		; canonicalization does not kick in, which would make some of the tests
; equivalent to one another.		; equivalent to one another.

define i1 @cmp_sgt_rhs_dec(float %x, i32 %i) {		define i1 @cmp_sgt_rhs_dec(float %x, i32 %i) {
; CHECK-LABEL: @cmp_sgt_rhs_dec(		; CHECK-LABEL: @cmp_sgt_rhs_dec(
; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i32		; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i32
; CHECK-NEXT: [[CMP:%.*]] = icmp sge i32 [[CONV]], [[I:%.*]]		; CHECK-NEXT: [[DEC:%.*]] = add i32 [[I:%.*]], -1
		; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[DEC]], [[CONV]]
; CHECK-NEXT: ret i1 [[CMP]]		; CHECK-NEXT: ret i1 [[CMP]]
;		;
%conv = fptosi float %x to i32		%conv = fptosi float %x to i32
%dec = sub nsw i32 %i, 1		%dec = sub nsw i32 %i, 1
%cmp = icmp sgt i32 %conv, %dec		%cmp = icmp sgt i32 %conv, %dec
ret i1 %cmp		ret i1 %cmp
}		}

define i1 @cmp_sle_rhs_dec(float %x, i32 %i) {		define i1 @cmp_sle_rhs_dec(float %x, i32 %i) {
; CHECK-LABEL: @cmp_sle_rhs_dec(		; CHECK-LABEL: @cmp_sle_rhs_dec(
; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i32		; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i32
; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[CONV]], [[I:%.*]]		; CHECK-NEXT: [[DEC:%.*]] = add i32 [[I:%.*]], -1
		; CHECK-NEXT: [[CMP:%.*]] = icmp sge i32 [[DEC]], [[CONV]]
; CHECK-NEXT: ret i1 [[CMP]]		; CHECK-NEXT: ret i1 [[CMP]]
;		;
%conv = fptosi float %x to i32		%conv = fptosi float %x to i32
%dec = sub nsw i32 %i, 1		%dec = sub nsw i32 %i, 1
%cmp = icmp sle i32 %conv, %dec		%cmp = icmp sle i32 %conv, %dec
ret i1 %cmp		ret i1 %cmp
}		}

▲ Show 20 Lines • Show All 205 Lines • ▼ Show 20 Lines
;		;
%inc = add nuw i32 %x, 1		%inc = add nuw i32 %x, 1
%cmp = icmp uge i32 %inc, %y		%cmp = icmp uge i32 %inc, %y
ret i1 %cmp		ret i1 %cmp
}		}

define i1 @cmp_sgt_lhs_dec(i32 %x, i32 %y) {		define i1 @cmp_sgt_lhs_dec(i32 %x, i32 %y) {
; CHECK-LABEL: @cmp_sgt_lhs_dec(		; CHECK-LABEL: @cmp_sgt_lhs_dec(
; CHECK-NEXT: [[DEC:%.*]] = add nsw i32 [[X:%.*]], -1		; CHECK-NEXT: [[DEC:%.*]] = add i32 [[X:%.*]], -1
; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[DEC]], [[Y:%.*]]		; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[DEC]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]		; CHECK-NEXT: ret i1 [[CMP]]
;		;
%dec = sub nsw i32 %x, 1		%dec = sub nsw i32 %x, 1
%cmp = icmp sgt i32 %dec, %y		%cmp = icmp sgt i32 %dec, %y
ret i1 %cmp		ret i1 %cmp
}		}

Show All 32 Lines	;
%inc = add nuw i32 %y, 1		%inc = add nuw i32 %y, 1
%cmp = icmp ule i32 %conv, %inc		%cmp = icmp ule i32 %conv, %inc
ret i1 %cmp		ret i1 %cmp
}		}

define i1 @cmp_slt_rhs_dec(float %x, i32 %y) {		define i1 @cmp_slt_rhs_dec(float %x, i32 %y) {
; CHECK-LABEL: @cmp_slt_rhs_dec(		; CHECK-LABEL: @cmp_slt_rhs_dec(
; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i32		; CHECK-NEXT: [[CONV:%.*]] = fptosi float [[X:%.*]] to i32
; CHECK-NEXT: [[DEC:%.*]] = add nsw i32 [[Y:%.*]], -1		; CHECK-NEXT: [[DEC:%.*]] = add i32 [[Y:%.*]], -1
; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[DEC]], [[CONV]]		; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[DEC]], [[CONV]]
; CHECK-NEXT: ret i1 [[CMP]]		; CHECK-NEXT: ret i1 [[CMP]]
;		;
%conv = fptosi float %x to i32		%conv = fptosi float %x to i32
%dec = sub nsw i32 %y, 1		%dec = sub nsw i32 %y, 1
%cmp = icmp slt i32 %conv, %dec		%cmp = icmp slt i32 %conv, %dec
ret i1 %cmp		ret i1 %cmp
}		}
▲ Show 20 Lines • Show All 324 Lines • Show Last 20 Lines

llvm/test/Transforms/InstCombine/mul.ll

Show First 20 Lines • Show All 450 Lines • ▼ Show 20 Lines
;		;
%neg = sub i32 0, %x		%neg = sub i32 0, %x
%mul = mul i32 %neg, %y		%mul = mul i32 %neg, %y
ret i32 %mul		ret i32 %mul
}		}

define i32 @test_mul_canonicalize_op1(i32 %x, i32 %z) {		define i32 @test_mul_canonicalize_op1(i32 %x, i32 %z) {
; CHECK-LABEL: @test_mul_canonicalize_op1(		; CHECK-LABEL: @test_mul_canonicalize_op1(
; CHECK-NEXT: [[Y:%.*]] = mul i32 [[Z:%.*]], 3		; CHECK-NEXT: [[TMP1:%.*]] = mul i32 [[Z:%.*]], -3
; CHECK-NEXT: [[TMP1:%.*]] = mul i32 [[Y]], [[X:%.*]]		; CHECK-NEXT: [[TMP2:%.*]] = mul i32 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: [[MUL:%.*]] = sub i32 0, [[TMP1]]		; CHECK-NEXT: ret i32 [[TMP2]]
; CHECK-NEXT: ret i32 [[MUL]]
;		;
%y = mul i32 %z, 3		%y = mul i32 %z, 3
%neg = sub i32 0, %x		%neg = sub i32 0, %x
%mul = mul i32 %y, %neg		%mul = mul i32 %y, %neg
ret i32 %mul		ret i32 %mul
}		}

define i32 @test_mul_canonicalize_nsw(i32 %x, i32 %y) {		define i32 @test_mul_canonicalize_nsw(i32 %x, i32 %y) {
▲ Show 20 Lines • Show All 67 Lines • ▼ Show 20 Lines	;
%sel = select i1 %cond, i32 1, i32 -1		%sel = select i1 %cond, i32 1, i32 -1
%r = mul i32 %sel, %x		%r = mul i32 %sel, %x
ret i32 %r		ret i32 %r
}		}

define <2 x i8> @negate_if_true_commute(<2 x i8> %px, i1 %cond) {		define <2 x i8> @negate_if_true_commute(<2 x i8> %px, i1 %cond) {
; CHECK-LABEL: @negate_if_true_commute(		; CHECK-LABEL: @negate_if_true_commute(
; CHECK-NEXT: [[X:%.*]] = sdiv <2 x i8> <i8 42, i8 42>, [[PX:%.*]]		; CHECK-NEXT: [[X:%.*]] = sdiv <2 x i8> <i8 42, i8 42>, [[PX:%.*]]
; CHECK-NEXT: [[TMP1:%.*]] = sub nsw <2 x i8> zeroinitializer, [[X]]		; CHECK-NEXT: [[TMP1:%.*]] = sub <2 x i8> zeroinitializer, [[X]]
; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[COND:%.*]], <2 x i8> [[TMP1]], <2 x i8> [[X]]		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[COND:%.*]], <2 x i8> [[TMP1]], <2 x i8> [[X]]
; CHECK-NEXT: ret <2 x i8> [[TMP2]]		; CHECK-NEXT: ret <2 x i8> [[TMP2]]
;		;
%x = sdiv <2 x i8> <i8 42, i8 42>, %px ; thwart complexity-based canonicalization		%x = sdiv <2 x i8> <i8 42, i8 42>, %px ; thwart complexity-based canonicalization
%sel = select i1 %cond, <2 x i8> <i8 -1, i8 -1>, <2 x i8> <i8 1, i8 1>		%sel = select i1 %cond, <2 x i8> <i8 -1, i8 -1>, <2 x i8> <i8 1, i8 1>
%r = mul <2 x i8> %x, %sel		%r = mul <2 x i8> %x, %sel
ret <2 x i8> %r		ret <2 x i8> %r
}		}
▲ Show 20 Lines • Show All 55 Lines • Show Last 20 Lines
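
As a sanity check on the `@test_mul_canonicalize_op1` expectation above, here is a minimal Python sketch that models wrapping integer arithmetic and compares the original IR with the post-patch CHECK lines. It is not part of the patch: the helper names are hypothetical, and it uses 8-bit integers instead of `i32` so that the check can be exhaustive.

```python
# Hypothetical model, not LLVM code: 8-bit two's complement arithmetic.
BITS = 8
MASK = (1 << BITS) - 1


def wrap(v):
    """Reduce a Python int to its BITS-wide unsigned representation."""
    return v & MASK


def mul_of_neg(x, z):
    # Input IR: %y = mul %z, 3 ; %neg = sub 0, %x ; %mul = mul %y, %neg
    y = wrap(z * 3)
    neg = wrap(0 - x)
    return wrap(y * neg)


def mul_negated_constant(x, z):
    # Post-patch CHECK lines: %tmp1 = mul %z, -3 ; %tmp2 = mul %tmp1, %x
    return wrap(wrap(z * -3) * x)


def check_all():
    """Compare both forms for every pair of 8-bit inputs."""
    return all(
        mul_of_neg(x, z) == mul_negated_constant(x, z)
        for x in range(1 << BITS)
        for z in range(1 << BITS)
    )
```

The two forms agree because negation distributes over multiplication modulo 2^N, so the negation of `%x` can be absorbed into the constant `3`.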

llvm/test/Transforms/InstCombine/sadd-with-overflow.ll

	Show First 20 Lines • Show All 109 Lines • ▼ Show 20 Lines
	;			;
	%a = add i32 %x, 12			%a = add i32 %x, 12
	%b = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 30, i32 %a)			%b = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 30, i32 %a)
	ret { i32, i1 } %b			ret { i32, i1 } %b
	}			}

	define { i32, i1 } @fold_sub_simple(i32 %x) {			define { i32, i1 } @fold_sub_simple(i32 %x) {
	; CHECK-LABEL: @fold_sub_simple(			; CHECK-LABEL: @fold_sub_simple(
	; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[X:%.*]], i32 42)			; CHECK-NEXT: [[A:%.*]] = add i32 [[X:%.*]], 12
	; CHECK-NEXT: ret { i32, i1 } [[TMP1]]			; CHECK-NEXT: [[B:%.*]] = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A]], i32 30)
				; CHECK-NEXT: ret { i32, i1 } [[B]]
	;			;
	%a = sub nsw i32 %x, -12			%a = sub nsw i32 %x, -12
	%b = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 30)			%b = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 30)
	ret { i32, i1 } %b			ret { i32, i1 } %b
	}			}

llvm/test/Transforms/InstCombine/ssub-with-overflow.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt < %s -instcombine -S \| FileCheck %s			; RUN: opt < %s -instcombine -S \| FileCheck %s

	declare { <2 x i32>, <2 x i1> } @llvm.ssub.with.overflow.v2i32(<2 x i32>, <2 x i32>)			declare { <2 x i32>, <2 x i1> } @llvm.ssub.with.overflow.v2i32(<2 x i32>, <2 x i32>)

	declare { <2 x i8>, <2 x i1> } @llvm.ssub.with.overflow.v2i8(<2 x i8>, <2 x i8>)			declare { <2 x i8>, <2 x i1> } @llvm.ssub.with.overflow.v2i8(<2 x i8>, <2 x i8>)

	declare { i32, i1 } @llvm.ssub.with.overflow.i32(i32, i32)			declare { i32, i1 } @llvm.ssub.with.overflow.i32(i32, i32)

	declare { i8, i1 } @llvm.ssub.with.overflow.i8(i8, i8)			declare { i8, i1 } @llvm.ssub.with.overflow.i8(i8, i8)

	define { i32, i1 } @simple_fold(i32 %x) {			define { i32, i1 } @simple_fold(i32 %x) {
	; CHECK-LABEL: @simple_fold(			; CHECK-LABEL: @simple_fold(
	; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[X:%.*]], i32 -20)			; CHECK-NEXT: [[A:%.*]] = add i32 [[X:%.*]], -7
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A]], i32 -13)
	; CHECK-NEXT: ret { i32, i1 } [[TMP1]]			; CHECK-NEXT: ret { i32, i1 } [[TMP1]]
	;			;
	%a = sub nsw i32 %x, 7			%a = sub nsw i32 %x, 7
	%b = tail call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %a, i32 13)			%b = tail call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %a, i32 13)
	ret { i32, i1 } %b			ret { i32, i1 } %b
	}			}

	define { i32, i1 } @fold_mixed_signs(i32 %x) {			define { i32, i1 } @fold_mixed_signs(i32 %x) {
	; CHECK-LABEL: @fold_mixed_signs(			; CHECK-LABEL: @fold_mixed_signs(
	; CHECK-NEXT: [[B:%.*]] = add nsw i32 [[X:%.*]], -6			; CHECK-NEXT: [[A:%.*]] = add i32 [[X:%.*]], -13
	; CHECK-NEXT: [[TMP1:%.*]] = insertvalue { i32, i1 } { i32 undef, i1 false }, i32 [[B]], 0			; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A]], i32 7)
	; CHECK-NEXT: ret { i32, i1 } [[TMP1]]			; CHECK-NEXT: ret { i32, i1 } [[TMP1]]
	;			;
	%a = sub nsw i32 %x, 13			%a = sub nsw i32 %x, 13
	%b = tail call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %a, i32 -7)			%b = tail call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %a, i32 -7)
	ret { i32, i1 } %b			ret { i32, i1 } %b
	}			}

	define { i8, i1 } @fold_on_constant_sub_no_overflow(i8 %x) {			define { i8, i1 } @fold_on_constant_sub_no_overflow(i8 %x) {
	; CHECK-LABEL: @fold_on_constant_sub_no_overflow(			; CHECK-LABEL: @fold_on_constant_sub_no_overflow(
	; CHECK-NEXT: [[TMP1:%.*]] = call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 [[X:%.*]], i8 -128)			; CHECK-NEXT: [[A:%.*]] = add i8 [[X:%.*]], -100
				; CHECK-NEXT: [[TMP1:%.*]] = call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 [[A]], i8 -28)
	; CHECK-NEXT: ret { i8, i1 } [[TMP1]]			; CHECK-NEXT: ret { i8, i1 } [[TMP1]]
	;			;
	%a = sub nsw i8 %x, 100			%a = sub nsw i8 %x, 100
	%b = tail call { i8, i1 } @llvm.ssub.with.overflow.i8(i8 %a, i8 28)			%b = tail call { i8, i1 } @llvm.ssub.with.overflow.i8(i8 %a, i8 28)
	ret { i8, i1 } %b			ret { i8, i1 } %b
	}			}

	define { i8, i1 } @no_fold_on_constant_sub_overflow(i8 %x) {			define { i8, i1 } @no_fold_on_constant_sub_overflow(i8 %x) {
	; CHECK-LABEL: @no_fold_on_constant_sub_overflow(			; CHECK-LABEL: @no_fold_on_constant_sub_overflow(
	; CHECK-NEXT: [[A:%.*]] = add nsw i8 [[X:%.*]], -100			; CHECK-NEXT: [[A:%.*]] = add i8 [[X:%.*]], -100
	; CHECK-NEXT: [[TMP1:%.*]] = call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 [[A]], i8 -29)			; CHECK-NEXT: [[TMP1:%.*]] = call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 [[A]], i8 -29)
	; CHECK-NEXT: ret { i8, i1 } [[TMP1]]			; CHECK-NEXT: ret { i8, i1 } [[TMP1]]
	;			;
	%a = sub nsw i8 %x, 100			%a = sub nsw i8 %x, 100
	%b = tail call { i8, i1 } @llvm.ssub.with.overflow.i8(i8 %a, i8 29)			%b = tail call { i8, i1 } @llvm.ssub.with.overflow.i8(i8 %a, i8 29)
	ret { i8, i1 } %b			ret { i8, i1 } %b
	}			}

	define { <2 x i32>, <2 x i1> } @fold_simple_splat_constant(<2 x i32> %x) {			define { <2 x i32>, <2 x i1> } @fold_simple_splat_constant(<2 x i32> %x) {
	; CHECK-LABEL: @fold_simple_splat_constant(			; CHECK-LABEL: @fold_simple_splat_constant(
	; CHECK-NEXT: [[TMP1:%.*]] = call { <2 x i32>, <2 x i1> } @llvm.sadd.with.overflow.v2i32(<2 x i32> [[X:%.*]], <2 x i32> <i32 -42, i32 -42>)			; CHECK-NEXT: [[A:%.*]] = add <2 x i32> [[X:%.*]], <i32 -12, i32 -12>
				; CHECK-NEXT: [[TMP1:%.*]] = call { <2 x i32>, <2 x i1> } @llvm.sadd.with.overflow.v2i32(<2 x i32> [[A]], <2 x i32> <i32 -30, i32 -30>)
	; CHECK-NEXT: ret { <2 x i32>, <2 x i1> } [[TMP1]]			; CHECK-NEXT: ret { <2 x i32>, <2 x i1> } [[TMP1]]
	;			;
	%a = sub nsw <2 x i32> %x, <i32 12, i32 12>			%a = sub nsw <2 x i32> %x, <i32 12, i32 12>
	%b = tail call { <2 x i32>, <2 x i1> } @llvm.ssub.with.overflow.v2i32(<2 x i32> %a, <2 x i32> <i32 30, i32 30>)			%b = tail call { <2 x i32>, <2 x i1> } @llvm.ssub.with.overflow.v2i32(<2 x i32> %a, <2 x i32> <i32 30, i32 30>)
	ret { <2 x i32>, <2 x i1> } %b			ret { <2 x i32>, <2 x i1> } %b
	}			}

	define { <2 x i32>, <2 x i1> } @no_fold_splat_undef_constant(<2 x i32> %x) {			define { <2 x i32>, <2 x i1> } @no_fold_splat_undef_constant(<2 x i32> %x) {
	Show All 15 Lines
	;			;
	%a = sub nsw <2 x i32> %x, %y			%a = sub nsw <2 x i32> %x, %y
	%b = tail call { <2 x i32>, <2 x i1> } @llvm.ssub.with.overflow.v2i32(<2 x i32> %a, <2 x i32> <i32 30, i32 30>)			%b = tail call { <2 x i32>, <2 x i1> } @llvm.ssub.with.overflow.v2i32(<2 x i32> %a, <2 x i32> <i32 30, i32 30>)
	ret { <2 x i32>, <2 x i1> } %b			ret { <2 x i32>, <2 x i1> } %b
	}			}

	define { i32, i1 } @fold_nuwnsw(i32 %x) {			define { i32, i1 } @fold_nuwnsw(i32 %x) {
	; CHECK-LABEL: @fold_nuwnsw(			; CHECK-LABEL: @fold_nuwnsw(
	; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[X:%.*]], i32 -42)			; CHECK-NEXT: [[A:%.*]] = add i32 [[X:%.*]], -12
				; CHECK-NEXT: [[TMP1:%.*]] = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 [[A]], i32 -30)
	; CHECK-NEXT: ret { i32, i1 } [[TMP1]]			; CHECK-NEXT: ret { i32, i1 } [[TMP1]]
	;			;
	%a = sub nuw nsw i32 %x, 12			%a = sub nuw nsw i32 %x, 12
	%b = tail call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %a, i32 30)			%b = tail call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %a, i32 30)
	ret { i32, i1 } %b			ret { i32, i1 } %b
	}			}

	define { i32, i1 } @no_fold_nuw(i32 %x) {			define { i32, i1 } @no_fold_nuw(i32 %x) {
	▲ Show 20 Lines • Show All 66 Lines • Show Last 20 Lines
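
The left-hand (pre-patch) CHECK lines in this file merge `sub nsw %x, C1` followed by `ssub.with.overflow(..., C2)` into a single `sadd.with.overflow(%x, -(C1 + C2))`. The right-hand lines no longer do so, apparently because the inner `add` no longer carries `nsw`. The following Python sketch models only the test expectations, not LLVM's implementation. All function names are hypothetical, and inputs for which the inner `sub nsw` would overflow are skipped on the assumption that they yield poison.

```python
def in_range(v, bits=8):
    """True if v fits in a signed integer of the given width."""
    return -(1 << (bits - 1)) <= v < (1 << (bits - 1))


def to_signed(v, bits=8):
    """Wrap an arbitrary int into the signed range of the given width."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >= 1 << (bits - 1) else v


def sadd_with_overflow(a, b, bits=8):
    s = a + b
    return to_signed(s, bits), not in_range(s, bits)


def ssub_with_overflow(a, b, bits=8):
    d = a - b
    return to_signed(d, bits), not in_range(d, bits)


def fold_matches(c1, c2, bits=8):
    """Check ssub.with.overflow(x -nsw c1, c2) against sadd.with.overflow(x, -(c1 + c2))."""
    merged = -(c1 + c2)
    if not in_range(merged, bits):
        return False  # merged constant is not representable, so no fold
    for x in range(-(1 << (bits - 1)), 1 << (bits - 1)):
        a = x - c1
        if not in_range(a, bits):
            continue  # 'nsw' violated: assumed poison, any result is acceptable
        if ssub_with_overflow(a, c2, bits) != sadd_with_overflow(x, merged, bits):
            return False
    return True
```

With the `i8` constants from `@fold_on_constant_sub_no_overflow` (100 and 28) the merged constant is -128 and the check passes; with those from `@no_fold_on_constant_sub_overflow` (100 and 29) the merged constant would be -129, which does not fit in `i8`.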

llvm/test/Transforms/InstCombine/strcmp-1.ll

	Show All 16 Lines
	; CHECK-LABEL: @test1(			; CHECK-LABEL: @test1(
	; CHECK: %strcmpload = load i8, i8* %str			; CHECK: %strcmpload = load i8, i8* %str
	; CHECK: %1 = zext i8 %strcmpload to i32			; CHECK: %1 = zext i8 %strcmpload to i32
	; CHECK: %2 = sub nsw i32 0, %1			; CHECK: %2 = sub nsw i32 0, %1
	; CHECK: ret i32 %2			; CHECK: ret i32 %2
	; NOBCMP-LABEL: @test1(			; NOBCMP-LABEL: @test1(
	; NOBCMP-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1			; NOBCMP-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1
	; NOBCMP-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32			; NOBCMP-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32
	; NOBCMP-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]			; NOBCMP-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
	; NOBCMP-NEXT: ret i32 [[TMP2]]			; NOBCMP-NEXT: ret i32 [[TMP2]]
	;			;
	; BCMP-LABEL: @test1(			; BCMP-LABEL: @test1(
	; BCMP-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1			; BCMP-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1
	; BCMP-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32			; BCMP-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32
	; BCMP-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]			; BCMP-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
	; BCMP-NEXT: ret i32 [[TMP2]]			; BCMP-NEXT: ret i32 [[TMP2]]
	;			;
	%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0			%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0
	%temp1 = call i32 @strcmp(i8* %str1, i8* %str2)			%temp1 = call i32 @strcmp(i8* %str1, i8* %str2)
	ret i32 %temp1			ret i32 %temp1

	}			}

	▲ Show 20 Lines • Show All 114 Lines • Show Last 20 Lines

llvm/test/Transforms/InstCombine/strncmp-1.ll

	Show All 10 Lines

	declare i32 @strncmp(i8*, i8*, i32)			declare i32 @strncmp(i8*, i8*, i32)

	; strncmp("", x, n) -> -*x			; strncmp("", x, n) -> -*x
	define i32 @test1(i8* %str2) {			define i32 @test1(i8* %str2) {
	; CHECK-LABEL: @test1(			; CHECK-LABEL: @test1(
	; CHECK-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1			; CHECK-NEXT: [[STRCMPLOAD:%.*]] = load i8, i8* [[STR2:%.*]], align 1
	; CHECK-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32			; CHECK-NEXT: [[TMP1:%.*]] = zext i8 [[STRCMPLOAD]] to i32
	; CHECK-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]			; CHECK-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
	; CHECK-NEXT: ret i32 [[TMP2]]			; CHECK-NEXT: ret i32 [[TMP2]]
	;			;

	%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0			%str1 = getelementptr inbounds [1 x i8], [1 x i8]* @null, i32 0, i32 0
	%temp1 = call i32 @strncmp(i8* %str1, i8* %str2, i32 10)			%temp1 = call i32 @strncmp(i8* %str1, i8* %str2, i32 10)
	ret i32 %temp1			ret i32 %temp1
	}			}

	▲ Show 20 Lines • Show All 123 Lines • Show Last 20 Lines

llvm/test/Transforms/InstCombine/sub-of-negatible.ll

Show All 12 Lines	;
ret i8 %t0		ret i8 %t0
}		}

; Negation can be negated for free		; Negation can be negated for free
define i8 @t1(i8 %x, i8 %y) {		define i8 @t1(i8 %x, i8 %y) {
; CHECK-LABEL: @t1(		; CHECK-LABEL: @t1(
; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Y:%.*]]		; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Y:%.*]]
; CHECK-NEXT: call void @use8(i8 [[T0]])		; CHECK-NEXT: call void @use8(i8 [[T0]])
; CHECK-NEXT: [[T1:%.*]] = add i8 [[X:%.*]], [[Y]]		; CHECK-NEXT: [[T1:%.*]] = add i8 [[Y]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T1]]		; CHECK-NEXT: ret i8 [[T1]]
;		;
%t0 = sub i8 0, %y		%t0 = sub i8 0, %y
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = sub i8 %x, %t0		%t1 = sub i8 %x, %t0
ret i8 %t1		ret i8 %t1
}		}

; Shift-left can be negated if all uses can be updated		; Shift-left can be negated if all uses can be updated
define i8 @t2(i8 %x, i8 %y) {		define i8 @t2(i8 %x, i8 %y) {
; CHECK-LABEL: @t2(		; CHECK-LABEL: @t2(
; CHECK-NEXT: [[T0:%.*]] = shl i8 -42, [[Y:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = shl i8 42, [[Y:%.*]]
; CHECK-NEXT: [[T1:%.*]] = sub i8 [[X:%.*]], [[T0]]		; CHECK-NEXT: [[T1:%.*]] = add i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T1]]		; CHECK-NEXT: ret i8 [[T1]]
;		;
%t0 = shl i8 -42, %y		%t0 = shl i8 -42, %y
%t1 = sub i8 %x, %t0		%t1 = sub i8 %x, %t0
ret i8 %t1		ret i8 %t1
}		}
define i8 @n2(i8 %x, i8 %y) {		define i8 @n2(i8 %x, i8 %y) {
; CHECK-LABEL: @n2(		; CHECK-LABEL: @n2(
; CHECK-NEXT: [[T0:%.*]] = shl i8 -42, [[Y:%.*]]		; CHECK-NEXT: [[T0:%.*]] = shl i8 -42, [[Y:%.*]]
; CHECK-NEXT: call void @use8(i8 [[T0]])		; CHECK-NEXT: call void @use8(i8 [[T0]])
; CHECK-NEXT: [[T1:%.*]] = sub i8 [[X:%.*]], [[T0]]		; CHECK-NEXT: [[T1:%.*]] = sub i8 [[X:%.*]], [[T0]]
; CHECK-NEXT: ret i8 [[T1]]		; CHECK-NEXT: ret i8 [[T1]]
;		;
%t0 = shl i8 -42, %y		%t0 = shl i8 -42, %y
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = sub i8 %x, %t0		%t1 = sub i8 %x, %t0
ret i8 %t1		ret i8 %t1
}		}
define i8 @t3(i8 %x, i8 %y, i8 %z) {		define i8 @t3(i8 %x, i8 %y, i8 %z) {
; CHECK-LABEL: @t3(		; CHECK-LABEL: @t3(
; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Z:%.*]]		; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Z:%.*]]
; CHECK-NEXT: call void @use8(i8 [[T0]])		; CHECK-NEXT: call void @use8(i8 [[T0]])
; CHECK-NEXT: [[T1:%.*]] = shl i8 [[T0]], [[Y:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = shl i8 [[Z]], [[Y:%.*]]
; CHECK-NEXT: [[T2:%.*]] = sub i8 [[X:%.*]], [[T1]]		; CHECK-NEXT: [[T2:%.*]] = add i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T2]]		; CHECK-NEXT: ret i8 [[T2]]
;		;
%t0 = sub i8 0, %z		%t0 = sub i8 0, %z
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = shl i8 %t0, %y		%t1 = shl i8 %t0, %y
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}
Show All 12 Lines	;
call void @use8(i8 %t1)		call void @use8(i8 %t1)
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}

; Select can be negated if all it's operands can be negated and all the users of select can be updated		; Select can be negated if all it's operands can be negated and all the users of select can be updated
define i8 @t4(i8 %x, i1 %y) {		define i8 @t4(i8 %x, i1 %y) {
; CHECK-LABEL: @t4(		; CHECK-LABEL: @t4(
; CHECK-NEXT: [[T0:%.*]] = select i1 [[Y:%.*]], i8 -42, i8 44		; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[Y:%.*]], i8 42, i8 -44
; CHECK-NEXT: [[T1:%.*]] = sub i8 [[X:%.*]], [[T0]]		; CHECK-NEXT: [[T1:%.*]] = add i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T1]]		; CHECK-NEXT: ret i8 [[T1]]
;		;
%t0 = select i1 %y, i8 -42, i8 44		%t0 = select i1 %y, i8 -42, i8 44
%t1 = sub i8 %x, %t0		%t1 = sub i8 %x, %t0
ret i8 %t1		ret i8 %t1
}		}
define i8 @n4(i8 %x, i1 %y) {		define i8 @n4(i8 %x, i1 %y) {
; CHECK-LABEL: @n4(		; CHECK-LABEL: @n4(
Show All 16 Lines	;
%t0 = select i1 %y, i8 -42, i8 %z		%t0 = select i1 %y, i8 -42, i8 %z
%t1 = sub i8 %x, %t0		%t1 = sub i8 %x, %t0
ret i8 %t1		ret i8 %t1
}		}
define i8 @t6(i8 %x, i1 %y, i8 %z) {		define i8 @t6(i8 %x, i1 %y, i8 %z) {
; CHECK-LABEL: @t6(		; CHECK-LABEL: @t6(
; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Z:%.*]]		; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Z:%.*]]
; CHECK-NEXT: call void @use8(i8 [[T0]])		; CHECK-NEXT: call void @use8(i8 [[T0]])
; CHECK-NEXT: [[T1:%.*]] = select i1 [[Y:%.*]], i8 -42, i8 [[T0]]		; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[Y:%.*]], i8 42, i8 [[Z]]
; CHECK-NEXT: [[T2:%.*]] = sub i8 [[X:%.*]], [[T1]]		; CHECK-NEXT: [[T2:%.*]] = add i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T2]]		; CHECK-NEXT: ret i8 [[T2]]
;		;
%t0 = sub i8 0, %z		%t0 = sub i8 0, %z
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = select i1 %y, i8 -42, i8 %t0		%t1 = select i1 %y, i8 -42, i8 %t0
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}
define i8 @t7(i8 %x, i1 %y, i8 %z) {		define i8 @t7(i8 %x, i1 %y, i8 %z) {
; CHECK-LABEL: @t7(		; CHECK-LABEL: @t7(
; CHECK-NEXT: [[T0:%.*]] = shl i8 1, [[Z:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = shl i8 -1, [[Z:%.*]]
; CHECK-NEXT: [[T1:%.*]] = select i1 [[Y:%.*]], i8 0, i8 [[T0]]		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[Y:%.*]], i8 0, i8 [[TMP1]]
; CHECK-NEXT: [[T2:%.*]] = sub i8 [[X:%.*]], [[T1]]		; CHECK-NEXT: [[T2:%.*]] = add i8 [[TMP2]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T2]]		; CHECK-NEXT: ret i8 [[T2]]
;		;
%t0 = shl i8 1, %z		%t0 = shl i8 1, %z
%t1 = select i1 %y, i8 0, i8 %t0		%t1 = select i1 %y, i8 0, i8 %t0
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}
define i8 @n8(i8 %x, i1 %y, i8 %z) {		define i8 @n8(i8 %x, i1 %y, i8 %z) {
; CHECK-LABEL: @n8(		; CHECK-LABEL: @n8(
; CHECK-NEXT: [[T0:%.*]] = shl i8 1, [[Z:%.*]]		; CHECK-NEXT: [[T0:%.*]] = shl i8 1, [[Z:%.*]]
; CHECK-NEXT: call void @use8(i8 [[T0]])		; CHECK-NEXT: call void @use8(i8 [[T0]])
; CHECK-NEXT: [[T1:%.*]] = select i1 [[Y:%.*]], i8 0, i8 [[T0]]		; CHECK-NEXT: [[T1:%.*]] = select i1 [[Y:%.*]], i8 0, i8 [[T0]]
; CHECK-NEXT: [[T2:%.*]] = sub i8 [[X:%.*]], [[T1]]		; CHECK-NEXT: [[T2:%.*]] = sub i8 [[X:%.*]], [[T1]]
; CHECK-NEXT: ret i8 [[T2]]		; CHECK-NEXT: ret i8 [[T2]]
;		;
%t0 = shl i8 1, %z		%t0 = shl i8 1, %z
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = select i1 %y, i8 0, i8 %t0		%t1 = select i1 %y, i8 0, i8 %t0
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}

; Subtraction can be negated by swapping its operands.		; Subtraction can be negated by swapping its operands.
; x - (y - z) -> x - y + z -> x + (z - y)		; x - (y - z) -> x - y + z -> x + (z - y)
define i8 @t9(i8 %x, i8 %y) {		define i8 @t9(i8 %x, i8 %y) {
		spatel: I didn't follow the diffs here - was one of these tests redundant? The code comment didn't match before, but it still doesn't?
		lebedev.ri (Author): The test was too complicated and was checking more than the minimal pattern - subtraction can be freely negated by swapping its operands.
; CHECK-LABEL: @t9(		; CHECK-LABEL: @t9(
; CHECK-NEXT: [[T01:%.*]] = sub i8 [[X:%.*]], [[Y:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = sub i8 [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i8 [[T01]]		; CHECK-NEXT: ret i8 [[TMP1]]
;		;
%t0 = sub i8 %y, %x		%t0 = sub i8 %y, %x
%t1 = sub i8 0, %t0		%t1 = sub i8 0, %t0
ret i8 %t1		ret i8 %t1
}		}
define i8 @n10(i8 %x, i8 %y, i8 %z) {		define i8 @n10(i8 %x, i8 %y, i8 %z) {
; CHECK-LABEL: @n10(		; CHECK-LABEL: @n10(
; CHECK-NEXT: [[T0:%.*]] = sub i8 [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[T0:%.*]] = sub i8 [[Y:%.*]], [[X:%.*]]
Show All 26 Lines	;
%t2 = add i8 %t0, %t1		%t2 = add i8 %t0, %t1
%t3 = sub i8 %x, %t2		%t3 = sub i8 %x, %t2
ret i8 %t3		ret i8 %t3
}		}
define i8 @n13(i8 %x, i8 %y, i8 %z) {		define i8 @n13(i8 %x, i8 %y, i8 %z) {
; CHECK-LABEL: @n13(		; CHECK-LABEL: @n13(
; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Y:%.*]]		; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Y:%.*]]
; CHECK-NEXT: call void @use8(i8 [[T0]])		; CHECK-NEXT: call void @use8(i8 [[T0]])
; CHECK-NEXT: [[T11:%.*]] = sub i8 [[Y]], [[Z:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = sub i8 [[Y]], [[Z:%.*]]
; CHECK-NEXT: [[T2:%.*]] = add i8 [[T11]], [[X:%.*]]		; CHECK-NEXT: [[T2:%.*]] = add i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T2]]		; CHECK-NEXT: ret i8 [[T2]]
;		;
%t0 = sub i8 0, %y		%t0 = sub i8 0, %y
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = add i8 %t0, %z		%t1 = add i8 %t0, %z
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}
Show All 20 Lines
}		}

; Multiplication can be negated if either one of operands can be negated		; Multiplication can be negated if either one of operands can be negated
; x - (y * z) -> x + ((-y) * z) or x + ((-z) * y)		; x - (y * z) -> x + ((-y) * z) or x + ((-z) * y)
define i8 @t15(i8 %x, i8 %y, i8 %z) {		define i8 @t15(i8 %x, i8 %y, i8 %z) {
; CHECK-LABEL: @t15(		; CHECK-LABEL: @t15(
; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Y:%.*]]		; CHECK-NEXT: [[T0:%.*]] = sub i8 0, [[Y:%.*]]
; CHECK-NEXT: call void @use8(i8 [[T0]])		; CHECK-NEXT: call void @use8(i8 [[T0]])
; CHECK-NEXT: [[TMP1:%.*]] = mul i8 [[Z:%.*]], [[Y]]		; CHECK-NEXT: [[TMP1:%.*]] = mul i8 [[Y]], [[Z:%.*]]
; CHECK-NEXT: [[T2:%.*]] = add i8 [[TMP1]], [[X:%.*]]		; CHECK-NEXT: [[T2:%.*]] = add i8 [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T2]]		; CHECK-NEXT: ret i8 [[T2]]
;		;
%t0 = sub i8 0, %y		%t0 = sub i8 0, %y
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = mul i8 %t0, %z		%t1 = mul i8 %t0, %z
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
Show All 10 Lines	;
%t0 = sub i8 0, %y		%t0 = sub i8 0, %y
call void @use8(i8 %t0)		call void @use8(i8 %t0)
%t1 = mul i8 %t0, %z		%t1 = mul i8 %t0, %z
call void @use8(i8 %t1)		call void @use8(i8 %t1)
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}

; Phi can be negated if all incoming values can be negated		; Phi can be negated if all incoming values can be negated
define i8 @t16(i1 %c, i8 %x) {		define i8 @t16(i1 %c, i8 %x) {
		spatel: Add these tests with baseline results as pre-commit?
; CHECK-LABEL: @t16(		; CHECK-LABEL: @t16(
; CHECK-NEXT: begin:		; CHECK-NEXT: begin:
; CHECK-NEXT: br i1 [[C:%.*]], label [[THEN:%.*]], label [[ELSE:%.*]]		; CHECK-NEXT: br i1 [[C:%.*]], label [[THEN:%.*]], label [[ELSE:%.*]]
; CHECK: then:		; CHECK: then:
		; CHECK-NEXT: [[Y:%.*]] = sub i8 0, [[X:%.*]]
; CHECK-NEXT: br label [[END:%.*]]		; CHECK-NEXT: br label [[END:%.*]]
; CHECK: else:		; CHECK: else:
; CHECK-NEXT: br label [[END]]		; CHECK-NEXT: br label [[END]]
; CHECK: end:		; CHECK: end:
; CHECK-NEXT: [[Z:%.*]] = phi i8 [ [[X:%.*]], [[THEN]] ], [ 42, [[ELSE]] ]		; CHECK-NEXT: [[Z:%.*]] = phi i8 [ [[Y]], [[THEN]] ], [ -42, [[ELSE]] ]
; CHECK-NEXT: ret i8 [[Z]]		; CHECK-NEXT: [[N:%.*]] = sub i8 0, [[Z]]
		; CHECK-NEXT: ret i8 [[N]]
;		;
begin:		begin:
br i1 %c, label %then, label %else		br i1 %c, label %then, label %else
then:		then:
%y = sub i8 0, %x		%y = sub i8 0, %x
br label %end		br label %end
else:		else:
br label %end		br label %end
▲ Show 20 Lines • Show All 50 Lines • ▼ Show 20 Lines	then:
%z = sub i8 0, %x		%z = sub i8 0, %x
br label %end		br label %end
else:		else:
br label %end		br label %end
end:		end:
%r = phi i8 [ %z, %then], [ %y, %else ]		%r = phi i8 [ %z, %then], [ %y, %else ]
%n = sub i8 0, %r		%n = sub i8 0, %r
ret i8 %n		ret i8 %n
}		}
		spatel: it's -> its

; truncation can be negated if it's operand can be negated		; truncation can be negated if it's operand can be negated
define i8 @t20(i8 %x, i16 %y) {		define i8 @t20(i8 %x, i16 %y) {
; CHECK-LABEL: @t20(		; CHECK-LABEL: @t20(
; CHECK-NEXT: [[T0:%.*]] = shl i16 -42, [[Y:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = shl i16 42, [[Y:%.*]]
; CHECK-NEXT: [[T1:%.*]] = trunc i16 [[T0]] to i8		; CHECK-NEXT: [[TMP2:%.*]] = trunc i16 [[TMP1]] to i8
; CHECK-NEXT: [[T2:%.*]] = sub i8 [[X:%.*]], [[T1]]		; CHECK-NEXT: [[T2:%.*]] = add i8 [[TMP2]], [[X:%.*]]
; CHECK-NEXT: ret i8 [[T2]]		; CHECK-NEXT: ret i8 [[T2]]
;		;
%t0 = shl i16 -42, %y		%t0 = shl i16 -42, %y
%t1 = trunc i16 %t0 to i8		%t1 = trunc i16 %t0 to i8
%t2 = sub i8 %x, %t1		%t2 = sub i8 %x, %t1
ret i8 %t2		ret i8 %t2
}		}
define i8 @n21(i8 %x, i16 %y) {		define i8 @n21(i8 %x, i16 %y) {
Show All 13 Lines
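
The test comments in `sub-of-negatible.ll` state several "can be negated" rules. A minimal Python sketch follows, assuming wrapping `i8` arithmetic (and `i16` for the truncation case), that brute-forces the corresponding identities. The helper names are hypothetical, and the sketch checks only the arithmetic, not the use-count conditions that the `n*` tests exercise.

```python
BITS = 8
MASK = (1 << BITS) - 1


def wrap(v):
    """Reduce to an 8-bit unsigned representation (two's complement)."""
    return v & MASK


def neg(v):
    return wrap(-v)


def check_identities():
    vals = range(1 << BITS)
    for x in vals:
        for y in vals:
            # t1: x - (0 - y) == x + y
            assert wrap(x - neg(y)) == wrap(x + y)
            # t9: 0 - (y - x) == x - y (swap the operands of the inner sub)
            assert neg(wrap(y - x)) == wrap(x - y)
            # t15: x - ((0 - y) * z) == x + (y * z), for a few sample z
            for z in (0, 1, 3, 127, 128, 255):
                assert wrap(x - wrap(neg(y) * z)) == wrap(x + wrap(y * z))
        for s in range(BITS):
            # t2: x - (-42 << s) == x + (42 << s)
            assert wrap(x - wrap(wrap(-42) << s)) == wrap(x + wrap(42 << s))
        for s in range(16):
            # t20: negation commutes with truncation from i16 to i8
            t0 = (-42 << s) & 0xFFFF
            n0 = (42 << s) & 0xFFFF
            assert wrap(x - wrap(t0)) == wrap(x + wrap(n0))
        for cond in (False, True):
            # t4: negate both select arms, then add instead of subtract
            sel = wrap(-42) if cond else 44
            nsel = 42 if cond else wrap(-44)
            assert wrap(x - sel) == wrap(x + nsel)
    return True
```

Each assertion mirrors one `t*` test: the left side is the input IR, the right side is the post-patch form in which the negation has been sunk into the operand and the outer `sub` has become an `add`.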

llvm/test/Transforms/InstCombine/sub.ll

Show First 20 Lines • Show All 46 Lines • ▼ Show 20 Lines
; CHECK-NEXT: ret <4 x i32> [[R]]		; CHECK-NEXT: ret <4 x i32> [[R]]
;		;
%r = sub <4 x i32> %x, bitcast (i128 ptrtoint (i32* @g to i128) to <4 x i32>)		%r = sub <4 x i32> %x, bitcast (i128 ptrtoint (i32* @g to i128) to <4 x i32>)
ret <4 x i32> %r		ret <4 x i32> %r
}		}

define i32 @neg_sub(i32 %x, i32 %y) {		define i32 @neg_sub(i32 %x, i32 %y) {
; CHECK-LABEL: @neg_sub(		; CHECK-LABEL: @neg_sub(
; CHECK-NEXT: [[R:%.*]] = add i32 [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add i32 [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i32 [[R]]		; CHECK-NEXT: ret i32 [[R]]
;		;
%neg = sub i32 0, %x		%neg = sub i32 0, %x
%r = sub i32 %y, %neg		%r = sub i32 %y, %neg
ret i32 %r		ret i32 %r
}		}

define i32 @neg_nsw_sub(i32 %x, i32 %y) {		define i32 @neg_nsw_sub(i32 %x, i32 %y) {
; CHECK-LABEL: @neg_nsw_sub(		; CHECK-LABEL: @neg_nsw_sub(
; CHECK-NEXT: [[R:%.*]] = add i32 [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add i32 [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i32 [[R]]		; CHECK-NEXT: ret i32 [[R]]
;		;
%neg = sub nsw i32 0, %x		%neg = sub nsw i32 0, %x
%r = sub i32 %y, %neg		%r = sub i32 %y, %neg
ret i32 %r		ret i32 %r
}		}

define i32 @neg_sub_nsw(i32 %x, i32 %y) {		define i32 @neg_sub_nsw(i32 %x, i32 %y) {
; CHECK-LABEL: @neg_sub_nsw(		; CHECK-LABEL: @neg_sub_nsw(
; CHECK-NEXT: [[R:%.*]] = add i32 [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add i32 [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i32 [[R]]		; CHECK-NEXT: ret i32 [[R]]
;		;
%neg = sub i32 0, %x		%neg = sub i32 0, %x
%r = sub nsw i32 %y, %neg		%r = sub nsw i32 %y, %neg
ret i32 %r		ret i32 %r
}		}

define i32 @neg_nsw_sub_nsw(i32 %x, i32 %y) {		define i32 @neg_nsw_sub_nsw(i32 %x, i32 %y) {
; CHECK-LABEL: @neg_nsw_sub_nsw(		; CHECK-LABEL: @neg_nsw_sub_nsw(
; CHECK-NEXT: [[R:%.*]] = add nsw i32 [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add i32 [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i32 [[R]]		; CHECK-NEXT: ret i32 [[R]]
;		;
%neg = sub nsw i32 0, %x		%neg = sub nsw i32 0, %x
%r = sub nsw i32 %y, %neg		%r = sub nsw i32 %y, %neg
ret i32 %r		ret i32 %r
}		}

define <2 x i32> @neg_sub_vec(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_sub_vec(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_sub_vec(		; CHECK-LABEL: @neg_sub_vec(
; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub <2 x i32> zeroinitializer, %x		%neg = sub <2 x i32> zeroinitializer, %x
%r = sub <2 x i32> %y, %neg		%r = sub <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

define <2 x i32> @neg_nsw_sub_vec(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_nsw_sub_vec(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_nsw_sub_vec(		; CHECK-LABEL: @neg_nsw_sub_vec(
; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub nsw <2 x i32> zeroinitializer, %x		%neg = sub nsw <2 x i32> zeroinitializer, %x
%r = sub <2 x i32> %y, %neg		%r = sub <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

define <2 x i32> @neg_sub_nsw_vec(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_sub_nsw_vec(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_sub_nsw_vec(		; CHECK-LABEL: @neg_sub_nsw_vec(
; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub <2 x i32> zeroinitializer, %x		%neg = sub <2 x i32> zeroinitializer, %x
%r = sub nsw <2 x i32> %y, %neg		%r = sub nsw <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

define <2 x i32> @neg_nsw_sub_nsw_vec(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_nsw_sub_nsw_vec(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_nsw_sub_nsw_vec(		; CHECK-LABEL: @neg_nsw_sub_nsw_vec(
; CHECK-NEXT: [[R:%.*]] = add nsw <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub nsw <2 x i32> zeroinitializer, %x		%neg = sub nsw <2 x i32> zeroinitializer, %x
%r = sub nsw <2 x i32> %y, %neg		%r = sub nsw <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

define <2 x i32> @neg_sub_vec_undef(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_sub_vec_undef(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_sub_vec_undef(		; CHECK-LABEL: @neg_sub_vec_undef(
; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub <2 x i32> <i32 0, i32 undef>, %x		%neg = sub <2 x i32> <i32 0, i32 undef>, %x
%r = sub <2 x i32> %y, %neg		%r = sub <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

define <2 x i32> @neg_nsw_sub_vec_undef(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_nsw_sub_vec_undef(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_nsw_sub_vec_undef(		; CHECK-LABEL: @neg_nsw_sub_vec_undef(
; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub nsw <2 x i32> <i32 undef, i32 0>, %x		%neg = sub nsw <2 x i32> <i32 undef, i32 0>, %x
%r = sub <2 x i32> %y, %neg		%r = sub <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

define <2 x i32> @neg_sub_nsw_vec_undef(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_sub_nsw_vec_undef(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_sub_nsw_vec_undef(		; CHECK-LABEL: @neg_sub_nsw_vec_undef(
; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub <2 x i32> <i32 undef, i32 0>, %x		%neg = sub <2 x i32> <i32 undef, i32 0>, %x
%r = sub nsw <2 x i32> %y, %neg		%r = sub nsw <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

; This should not drop 'nsw'.		; This should not drop 'nsw'.

define <2 x i32> @neg_nsw_sub_nsw_vec_undef(<2 x i32> %x, <2 x i32> %y) {		define <2 x i32> @neg_nsw_sub_nsw_vec_undef(<2 x i32> %x, <2 x i32> %y) {
; CHECK-LABEL: @neg_nsw_sub_nsw_vec_undef(		; CHECK-LABEL: @neg_nsw_sub_nsw_vec_undef(
; CHECK-NEXT: [[R:%.*]] = add nsw <2 x i32> [[Y:%.*]], [[X:%.*]]		; CHECK-NEXT: [[R:%.*]] = add <2 x i32> [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret <2 x i32> [[R]]		; CHECK-NEXT: ret <2 x i32> [[R]]
;		;
%neg = sub nsw <2 x i32> <i32 0, i32 undef>, %x		%neg = sub nsw <2 x i32> <i32 0, i32 undef>, %x
%r = sub nsw <2 x i32> %y, %neg		%r = sub nsw <2 x i32> %y, %neg
ret <2 x i32> %r		ret <2 x i32> %r
}		}

; (~X) - (~Y) --> Y - X		; (~X) - (~Y) --> Y - X
[... 37 lines not shown ...]
;		;
%nx = xor <2 x i8> %x, <i8 undef, i8 -1>		%nx = xor <2 x i8> %x, <i8 undef, i8 -1>
%ny = xor <2 x i8> %y, <i8 -1, i8 undef>		%ny = xor <2 x i8> %y, <i8 -1, i8 undef>
%sub = sub <2 x i8> %nx, %ny		%sub = sub <2 x i8> %nx, %ny
ret <2 x i8> %sub		ret <2 x i8> %sub
}		}

define i32 @test5(i32 %A, i32 %B, i32 %C) {		define i32 @test5(i32 %A, i32 %B, i32 %C) {
; CHECK-LABEL: @test5(		; CHECK-LABEL: @test5(
; CHECK-NEXT: [[D1:%.*]] = sub i32 [[C:%.*]], [[B:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[C:%.*]], [[B:%.*]]
; CHECK-NEXT: [[E:%.*]] = add i32 [[D1]], [[A:%.*]]		; CHECK-NEXT: [[E:%.*]] = add i32 [[TMP1]], [[A:%.*]]
; CHECK-NEXT: ret i32 [[E]]		; CHECK-NEXT: ret i32 [[E]]
;		;
%D = sub i32 %B, %C		%D = sub i32 %B, %C
%E = sub i32 %A, %D		%E = sub i32 %A, %D
ret i32 %E		ret i32 %E
}		}

define i32 @test6(i32 %A, i32 %B) {		define i32 @test6(i32 %A, i32 %B) {
[... 64 lines not shown ...]
;		;
%C = sub <2 x i8> %A, %B		%C = sub <2 x i8> %A, %B
%D = icmp ne <2 x i8> %C, zeroinitializer		%D = icmp ne <2 x i8> %C, zeroinitializer
ret <2 x i1> %D		ret <2 x i1> %D
}		}

define i32 @test12(i32 %A) {		define i32 @test12(i32 %A) {
; CHECK-LABEL: @test12(		; CHECK-LABEL: @test12(
; CHECK-NEXT: [[C:%.*]] = lshr i32 [[A:%.*]], 31		; CHECK-NEXT: [[TMP1:%.*]] = lshr i32 [[A:%.*]], 31
; CHECK-NEXT: ret i32 [[C]]		; CHECK-NEXT: ret i32 [[TMP1]]
;		;
%B = ashr i32 %A, 31		%B = ashr i32 %A, 31
%C = sub i32 0, %B		%C = sub i32 0, %B
ret i32 %C		ret i32 %C
}		}

define i32 @test13(i32 %A) {		define i32 @test13(i32 %A) {
; CHECK-LABEL: @test13(		; CHECK-LABEL: @test13(
; CHECK-NEXT: [[C:%.*]] = ashr i32 [[A:%.*]], 31		; CHECK-NEXT: [[TMP1:%.*]] = ashr i32 [[A:%.*]], 31
; CHECK-NEXT: ret i32 [[C]]		; CHECK-NEXT: ret i32 [[TMP1]]
;		;
%B = lshr i32 %A, 31		%B = lshr i32 %A, 31
%C = sub i32 0, %B		%C = sub i32 0, %B
ret i32 %C		ret i32 %C
}		}

define <2 x i32> @test12vec(<2 x i32> %A) {		define <2 x i32> @test12vec(<2 x i32> %A) {
; CHECK-LABEL: @test12vec(		; CHECK-LABEL: @test12vec(
; CHECK-NEXT: [[C:%.*]] = lshr <2 x i32> [[A:%.*]], <i32 31, i32 31>		; CHECK-NEXT: [[TMP1:%.*]] = lshr <2 x i32> [[A:%.*]], <i32 31, i32 31>
; CHECK-NEXT: ret <2 x i32> [[C]]		; CHECK-NEXT: ret <2 x i32> [[TMP1]]
;		;
%B = ashr <2 x i32> %A, <i32 31, i32 31>		%B = ashr <2 x i32> %A, <i32 31, i32 31>
%C = sub <2 x i32> zeroinitializer, %B		%C = sub <2 x i32> zeroinitializer, %B
ret <2 x i32> %C		ret <2 x i32> %C
}		}

define <2 x i32> @test13vec(<2 x i32> %A) {		define <2 x i32> @test13vec(<2 x i32> %A) {
; CHECK-LABEL: @test13vec(		; CHECK-LABEL: @test13vec(
; CHECK-NEXT: [[C:%.*]] = ashr <2 x i32> [[A:%.*]], <i32 31, i32 31>		; CHECK-NEXT: [[TMP1:%.*]] = ashr <2 x i32> [[A:%.*]], <i32 31, i32 31>
; CHECK-NEXT: ret <2 x i32> [[C]]		; CHECK-NEXT: ret <2 x i32> [[TMP1]]
;		;
%B = lshr <2 x i32> %A, <i32 31, i32 31>		%B = lshr <2 x i32> %A, <i32 31, i32 31>
%C = sub <2 x i32> zeroinitializer, %B		%C = sub <2 x i32> zeroinitializer, %B
ret <2 x i32> %C		ret <2 x i32> %C
}		}

define i32 @test15(i32 %A, i32 %B) {		define i32 @test15(i32 %A, i32 %B) {
; CHECK-LABEL: @test15(		; CHECK-LABEL: @test15(
; CHECK-NEXT: [[C:%.*]] = sub i32 0, [[A:%.*]]		; CHECK-NEXT: [[C:%.*]] = sub i32 0, [[A:%.*]]
; CHECK-NEXT: [[D:%.*]] = srem i32 [[B:%.*]], [[C]]		; CHECK-NEXT: [[D:%.*]] = srem i32 [[B:%.*]], [[C]]
; CHECK-NEXT: ret i32 [[D]]		; CHECK-NEXT: ret i32 [[D]]
;		;
%C = sub i32 0, %A		%C = sub i32 0, %A
%D = srem i32 %B, %C		%D = srem i32 %B, %C
ret i32 %D		ret i32 %D
}		}

define i32 @test16(i32 %A) {		define i32 @test16(i32 %A) {
; CHECK-LABEL: @test16(		; CHECK-LABEL: @test16(
; CHECK-NEXT: [[Y:%.*]] = sdiv i32 [[A:%.*]], -1123		; CHECK-NEXT: [[TMP1:%.*]] = sdiv i32 [[A:%.*]], -1123
; CHECK-NEXT: ret i32 [[Y]]		; CHECK-NEXT: ret i32 [[TMP1]]
;		;
%X = sdiv i32 %A, 1123		%X = sdiv i32 %A, 1123
%Y = sub i32 0, %X		%Y = sub i32 0, %X
ret i32 %Y		ret i32 %Y
}		}

; Can't fold subtract here because the negation might overflow.		; Can't fold subtract here because the negation might overflow.
; PR3142		; PR3142
[... 138 lines not shown ...]
;		;
%G = sub i64 %C, ptrtoint ([42 x i16]* @Arr to i64)		%G = sub i64 %C, ptrtoint ([42 x i16]* @Arr to i64)
ret i64 %G		ret i64 %G
}		}


define i64 @test25(i8* %P, i64 %A){		define i64 @test25(i8* %P, i64 %A){
; CHECK-LABEL: @test25(		; CHECK-LABEL: @test25(
; CHECK-NEXT: [[B_IDX:%.*]] = shl nsw i64 [[A:%.*]], 1		; CHECK-NEXT: [[B_IDX:%.*]] = shl nsw i64 [[A:%.*]], 1
; CHECK-NEXT: [[DIFF_NEG:%.*]] = add i64 [[B_IDX]], -84		; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[B_IDX]], -84
; CHECK-NEXT: ret i64 [[DIFF_NEG]]		; CHECK-NEXT: ret i64 [[TMP1]]
;		;
%B = getelementptr inbounds [42 x i16], [42 x i16]* @Arr, i64 0, i64 %A		%B = getelementptr inbounds [42 x i16], [42 x i16]* @Arr, i64 0, i64 %A
%C = ptrtoint i16* %B to i64		%C = ptrtoint i16* %B to i64
%G = sub i64 %C, ptrtoint (i16* getelementptr ([42 x i16], [42 x i16]* @Arr, i64 1, i64 0) to i64)		%G = sub i64 %C, ptrtoint (i16* getelementptr ([42 x i16], [42 x i16]* @Arr, i64 1, i64 0) to i64)
ret i64 %G		ret i64 %G
}		}

@Arr_as1 = external addrspace(1) global [42 x i16]		@Arr_as1 = external addrspace(1) global [42 x i16]

define i16 @test25_as1(i8 addrspace(1)* %P, i64 %A) {		define i16 @test25_as1(i8 addrspace(1)* %P, i64 %A) {
; CHECK-LABEL: @test25_as1(		; CHECK-LABEL: @test25_as1(
; CHECK-NEXT: [[TMP1:%.*]] = trunc i64 [[A:%.*]] to i16		; CHECK-NEXT: [[TMP1:%.*]] = trunc i64 [[A:%.*]] to i16
; CHECK-NEXT: [[B_IDX:%.*]] = shl nsw i16 [[TMP1]], 1		; CHECK-NEXT: [[B_IDX:%.*]] = shl nsw i16 [[TMP1]], 1
; CHECK-NEXT: [[DIFF_NEG:%.*]] = add i16 [[B_IDX]], -84		; CHECK-NEXT: [[TMP2:%.*]] = add i16 [[B_IDX]], -84
; CHECK-NEXT: ret i16 [[DIFF_NEG]]		; CHECK-NEXT: ret i16 [[TMP2]]
;		;
%B = getelementptr inbounds [42 x i16], [42 x i16] addrspace(1)* @Arr_as1, i64 0, i64 %A		%B = getelementptr inbounds [42 x i16], [42 x i16] addrspace(1)* @Arr_as1, i64 0, i64 %A
%C = ptrtoint i16 addrspace(1)* %B to i16		%C = ptrtoint i16 addrspace(1)* %B to i16
%G = sub i16 %C, ptrtoint (i16 addrspace(1)* getelementptr ([42 x i16], [42 x i16] addrspace(1)* @Arr_as1, i64 1, i64 0) to i16)		%G = sub i16 %C, ptrtoint (i16 addrspace(1)* getelementptr ([42 x i16], [42 x i16] addrspace(1)* @Arr_as1, i64 1, i64 0) to i16)
ret i16 %G		ret i16 %G
}		}

define i32 @test26(i32 %x) {		define i32 @test26(i32 %x) {
; CHECK-LABEL: @test26(		; CHECK-LABEL: @test26(
; CHECK-NEXT: [[NEG:%.*]] = shl i32 -3, [[X:%.*]]		; CHECK-NEXT: [[TMP1:%.*]] = shl i32 -3, [[X:%.*]]
; CHECK-NEXT: ret i32 [[NEG]]		; CHECK-NEXT: ret i32 [[TMP1]]
;		;
%shl = shl i32 3, %x		%shl = shl i32 3, %x
%neg = sub i32 0, %shl		%neg = sub i32 0, %shl
ret i32 %neg		ret i32 %neg
}		}

define i32 @test27(i32 %x, i32 %y) {		define i32 @test27(i32 %x, i32 %y) {
; CHECK-LABEL: @test27(		; CHECK-LABEL: @test27(
[... 188 lines not shown ...]
;		;
%shl = shl <2 x i64> %A, <i64 3, i64 4>		%shl = shl <2 x i64> %A, <i64 3, i64 4>
%sub = sub <2 x i64> %shl, %A		%sub = sub <2 x i64> %shl, %A
ret <2 x i64> %sub		ret <2 x i64> %sub
}		}

define <2 x i32> @test37(<2 x i32> %A) {		define <2 x i32> @test37(<2 x i32> %A) {
; CHECK-LABEL: @test37(		; CHECK-LABEL: @test37(
; CHECK-NEXT: [[TMP1:%.*]] = icmp eq <2 x i32> [[A:%.*]], <i32 -2147483648, i32 -2147483648>		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq <2 x i32> [[A:%.*]], <i32 -2147483648, i32 -2147483648>
; CHECK-NEXT: [[SUB:%.*]] = sext <2 x i1> [[TMP1]] to <2 x i32>		; CHECK-NEXT: [[TMP2:%.*]] = sext <2 x i1> [[TMP1]] to <2 x i32>
; CHECK-NEXT: ret <2 x i32> [[SUB]]		; CHECK-NEXT: ret <2 x i32> [[TMP2]]
;		;
%div = sdiv <2 x i32> %A, <i32 -2147483648, i32 -2147483648>		%div = sdiv <2 x i32> %A, <i32 -2147483648, i32 -2147483648>
%sub = sub nsw <2 x i32> zeroinitializer, %div		%sub = sub nsw <2 x i32> zeroinitializer, %div
ret <2 x i32> %sub		ret <2 x i32> %sub
}		}

define i32 @test38(i32 %A) {		define i32 @test38(i32 %A) {
; CHECK-LABEL: @test38(		; CHECK-LABEL: @test38(
; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[A:%.*]], -2147483648		; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i32 [[A:%.*]], -2147483648
; CHECK-NEXT: [[SUB:%.*]] = sext i1 [[TMP1]] to i32		; CHECK-NEXT: [[TMP2:%.*]] = sext i1 [[TMP1]] to i32
; CHECK-NEXT: ret i32 [[SUB]]		; CHECK-NEXT: ret i32 [[TMP2]]
;		;
%div = sdiv i32 %A, -2147483648		%div = sdiv i32 %A, -2147483648
%sub = sub nsw i32 0, %div		%sub = sub nsw i32 0, %div
ret i32 %sub		ret i32 %sub
}		}

define i16 @test40(i16 %a, i16 %b) {		define i16 @test40(i16 %a, i16 %b) {
; CHECK-LABEL: @test40(		; CHECK-LABEL: @test40(
[... 44 lines not shown ...]
;		;
%a = or i4 %x, -8		%a = or i4 %x, -8
%b = and i4 %y, 7		%b = and i4 %y, 7
%c = sub i4 %a, %b		%c = sub i4 %a, %b
ret i4 %c		ret i4 %c
}		}

define i32 @test44(i32 %x) {		define i32 @test44(i32 %x) {
; CHECK-LABEL: @test44(		; CHECK-LABEL: @test44(
; CHECK-NEXT: [[SUB:%.*]] = add nsw i32 [[X:%.*]], -32768		; CHECK-NEXT: [[SUB:%.*]] = add i32 [[X:%.*]], -32768
; CHECK-NEXT: ret i32 [[SUB]]		; CHECK-NEXT: ret i32 [[SUB]]
;		;
%sub = sub nsw i32 %x, 32768		%sub = sub nsw i32 %x, 32768
ret i32 %sub		ret i32 %sub
}		}

define i32 @test45(i32 %x, i32 %y) {		define i32 @test45(i32 %x, i32 %y) {
; CHECK-LABEL: @test45(		; CHECK-LABEL: @test45(
[... 351 lines not shown ...]
; CHECK-NEXT: ret <2 x i32> [[B]]		; CHECK-NEXT: ret <2 x i32> [[B]]
;		;
%B = sub <2 x i32> <i32 1, i32 1>, %A		%B = sub <2 x i32> <i32 1, i32 1>, %A
%C = shl <2 x i32> %B, <i32 1, i32 1>		%C = shl <2 x i32> %B, <i32 1, i32 1>
%D = sub <2 x i32> <i32 2, i32 2>, %C		%D = sub <2 x i32> <i32 2, i32 2>, %C
ret <2 x i32> %D		ret <2 x i32> %D
}		}

; FIXME: Transform (neg (max ~X, C)) -> ((min X, ~C) + 1). Same for min.		; FIXME: Transform (neg (max ~X, C)) -> ((min X, ~C) + 1). Same for min.
		xbolva00: Remove FIXME?
define i32 @test64(i32 %x) {		define i32 @test64(i32 %x) {
; CHECK-LABEL: @test64(		; CHECK-LABEL: @test64(
; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[X:%.*]], 255		; CHECK-NEXT: [[TMP1:%.*]] = icmp slt i32 [[X:%.*]], 255
; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 255		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 255
; CHECK-NEXT: [[RES:%.*]] = add nsw i32 [[TMP2]], 1		; CHECK-NEXT: [[TMP3:%.*]] = add nsw i32 [[TMP2]], 1
; CHECK-NEXT: ret i32 [[RES]]		; CHECK-NEXT: ret i32 [[TMP3]]
;		;
%1 = xor i32 %x, -1		%1 = xor i32 %x, -1
%2 = icmp sgt i32 %1, -256		%2 = icmp sgt i32 %1, -256
%3 = select i1 %2, i32 %1, i32 -256		%3 = select i1 %2, i32 %1, i32 -256
%res = sub i32 0, %3		%res = sub i32 0, %3
ret i32 %res		ret i32 %res
}		}

define i32 @test65(i32 %x) {		define i32 @test65(i32 %x) {
; CHECK-LABEL: @test65(		; CHECK-LABEL: @test65(
; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[X:%.*]], -256		; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[X:%.*]], -256
; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 -256		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 -256
; CHECK-NEXT: [[RES:%.*]] = add i32 [[TMP2]], 1		; CHECK-NEXT: [[TMP3:%.*]] = add i32 [[TMP2]], 1
; CHECK-NEXT: ret i32 [[RES]]		; CHECK-NEXT: ret i32 [[TMP3]]
;		;
%1 = xor i32 %x, -1		%1 = xor i32 %x, -1
%2 = icmp slt i32 %1, 255		%2 = icmp slt i32 %1, 255
%3 = select i1 %2, i32 %1, i32 255		%3 = select i1 %2, i32 %1, i32 255
%res = sub i32 0, %3		%res = sub i32 0, %3
ret i32 %res		ret i32 %res
}		}

define i32 @test66(i32 %x) {		define i32 @test66(i32 %x) {
; CHECK-LABEL: @test66(		; CHECK-LABEL: @test66(
; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[X:%.*]], -101		; CHECK-NEXT: [[TMP1:%.*]] = icmp ult i32 [[X:%.*]], -101
; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 -101		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 -101
; CHECK-NEXT: [[RES:%.*]] = add nuw i32 [[TMP2]], 1		; CHECK-NEXT: [[TMP3:%.*]] = add nuw i32 [[TMP2]], 1
; CHECK-NEXT: ret i32 [[RES]]		; CHECK-NEXT: ret i32 [[TMP3]]
;		;
%1 = xor i32 %x, -1		%1 = xor i32 %x, -1
%2 = icmp ugt i32 %1, 100		%2 = icmp ugt i32 %1, 100
%3 = select i1 %2, i32 %1, i32 100		%3 = select i1 %2, i32 %1, i32 100
%res = sub i32 0, %3		%res = sub i32 0, %3
ret i32 %res		ret i32 %res
}		}

define i32 @test67(i32 %x) {		define i32 @test67(i32 %x) {
; CHECK-LABEL: @test67(		; CHECK-LABEL: @test67(
; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[X:%.*]], 100		; CHECK-NEXT: [[TMP1:%.*]] = icmp ugt i32 [[X:%.*]], 100
; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 100		; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], i32 [[X]], i32 100
; CHECK-NEXT: [[RES:%.*]] = add i32 [[TMP2]], 1		; CHECK-NEXT: [[TMP3:%.*]] = add i32 [[TMP2]], 1
; CHECK-NEXT: ret i32 [[RES]]		; CHECK-NEXT: ret i32 [[TMP3]]
;		;
%1 = xor i32 %x, -1		%1 = xor i32 %x, -1
%2 = icmp ult i32 %1, -101		%2 = icmp ult i32 %1, -101
%3 = select i1 %2, i32 %1, i32 -101		%3 = select i1 %2, i32 %1, i32 -101
%res = sub i32 0, %3		%res = sub i32 0, %3
ret i32 %res		ret i32 %res
}		}

; Check splat vectors too		; Check splat vectors too
define <2 x i32> @test68(<2 x i32> %x) {		define <2 x i32> @test68(<2 x i32> %x) {
; CHECK-LABEL: @test68(		; CHECK-LABEL: @test68(
; CHECK-NEXT: [[TMP1:%.*]] = icmp slt <2 x i32> [[X:%.*]], <i32 255, i32 255>		; CHECK-NEXT: [[TMP1:%.*]] = icmp slt <2 x i32> [[X:%.*]], <i32 255, i32 255>
; CHECK-NEXT: [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i32> [[X]], <2 x i32> <i32 255, i32 255>		; CHECK-NEXT: [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i32> [[X]], <2 x i32> <i32 255, i32 255>
; CHECK-NEXT: [[RES:%.*]] = add nsw <2 x i32> [[TMP2]], <i32 1, i32 1>		; CHECK-NEXT: [[TMP3:%.*]] = add nsw <2 x i32> [[TMP2]], <i32 1, i32 1>
; CHECK-NEXT: ret <2 x i32> [[RES]]		; CHECK-NEXT: ret <2 x i32> [[TMP3]]
;		;
%1 = xor <2 x i32> %x, <i32 -1, i32 -1>		%1 = xor <2 x i32> %x, <i32 -1, i32 -1>
%2 = icmp sgt <2 x i32> %1, <i32 -256, i32 -256>		%2 = icmp sgt <2 x i32> %1, <i32 -256, i32 -256>
%3 = select <2 x i1> %2, <2 x i32> %1, <2 x i32> <i32 -256, i32 -256>		%3 = select <2 x i1> %2, <2 x i32> %1, <2 x i32> <i32 -256, i32 -256>
%res = sub <2 x i32> zeroinitializer, %3		%res = sub <2 x i32> zeroinitializer, %3
ret <2 x i32> %res		ret <2 x i32> %res
}		}

; And non-splat constant vectors.		; And non-splat constant vectors.
define <2 x i32> @test69(<2 x i32> %x) {		define <2 x i32> @test69(<2 x i32> %x) {
; CHECK-LABEL: @test69(		; CHECK-LABEL: @test69(
; CHECK-NEXT: [[TMP1:%.*]] = icmp slt <2 x i32> [[X:%.*]], <i32 255, i32 127>		; CHECK-NEXT: [[TMP1:%.*]] = icmp slt <2 x i32> [[X:%.*]], <i32 255, i32 127>
; CHECK-NEXT: [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i32> [[X]], <2 x i32> <i32 255, i32 127>		; CHECK-NEXT: [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i32> [[X]], <2 x i32> <i32 255, i32 127>
; CHECK-NEXT: [[RES:%.*]] = add <2 x i32> [[TMP2]], <i32 1, i32 1>		; CHECK-NEXT: [[TMP3:%.*]] = add <2 x i32> [[TMP2]], <i32 1, i32 1>
; CHECK-NEXT: ret <2 x i32> [[RES]]		; CHECK-NEXT: ret <2 x i32> [[TMP3]]
;		;
%1 = xor <2 x i32> %x, <i32 -1, i32 -1>		%1 = xor <2 x i32> %x, <i32 -1, i32 -1>
%2 = icmp sgt <2 x i32> %1, <i32 -256, i32 -128>		%2 = icmp sgt <2 x i32> %1, <i32 -256, i32 -128>
%3 = select <2 x i1> %2, <2 x i32> %1, <2 x i32> <i32 -256, i32 -128>		%3 = select <2 x i1> %2, <2 x i32> %1, <2 x i32> <i32 -256, i32 -128>
%res = sub <2 x i32> zeroinitializer, %3		%res = sub <2 x i32> zeroinitializer, %3
ret <2 x i32> %res		ret <2 x i32> %res
}		}

[... 74 lines not shown ...]

llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll

[... 345 lines not shown ...]
;		;
%sub = add i32 %a, -1		%sub = add i32 %a, -1
%sel = select i1 %cmp, i32 %sub, i32 0		%sel = select i1 %cmp, i32 %sub, i32 0
ret i32 %sel		ret i32 %sel
}		}

define i32 @max_sub_ult_c2(i32 %a) {		define i32 @max_sub_ult_c2(i32 %a) {
; CHECK-LABEL: @max_sub_ult_c2(		; CHECK-LABEL: @max_sub_ult_c2(
; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.usub.sat.i32(i32 2, i32 [[A:%.*]])		; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.usub.sat.i32(i32 2, i32 [[A:%.*]])
; CHECK-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]		; CHECK-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
; CHECK-NEXT: ret i32 [[TMP2]]		; CHECK-NEXT: ret i32 [[TMP2]]
;		;
%cmp = icmp ult i32 %a, 2		%cmp = icmp ult i32 %a, 2
%sub = add i32 %a, -2		%sub = add i32 %a, -2
%sel = select i1 %cmp, i32 %sub, i32 0		%sel = select i1 %cmp, i32 %sub, i32 0
ret i32 %sel		ret i32 %sel
}		}

define i32 @max_sub_ult_c2_oneuseicmp(i32 %a) {		define i32 @max_sub_ult_c2_oneuseicmp(i32 %a) {
; CHECK-LABEL: @max_sub_ult_c2_oneuseicmp(		; CHECK-LABEL: @max_sub_ult_c2_oneuseicmp(
; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A:%.*]], 2		; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[A:%.*]], 2
; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.usub.sat.i32(i32 2, i32 [[A]])		; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.usub.sat.i32(i32 2, i32 [[A]])
; CHECK-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]		; CHECK-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
; CHECK-NEXT: call void @usei1(i1 [[CMP]])		; CHECK-NEXT: call void @usei1(i1 [[CMP]])
; CHECK-NEXT: ret i32 [[TMP2]]		; CHECK-NEXT: ret i32 [[TMP2]]
;		;
%cmp = icmp ult i32 %a, 2		%cmp = icmp ult i32 %a, 2
%sub = add i32 %a, -2		%sub = add i32 %a, -2
%sel = select i1 %cmp, i32 %sub, i32 0		%sel = select i1 %cmp, i32 %sub, i32 0
call void @usei1(i1 %cmp)		call void @usei1(i1 %cmp)
ret i32 %sel		ret i32 %sel
}		}

define i32 @max_sub_ult_c2_oneusesub(i32 %a) {		define i32 @max_sub_ult_c2_oneusesub(i32 %a) {
; CHECK-LABEL: @max_sub_ult_c2_oneusesub(		; CHECK-LABEL: @max_sub_ult_c2_oneusesub(
; CHECK-NEXT: [[SUB:%.*]] = add i32 [[A:%.*]], -2		; CHECK-NEXT: [[SUB:%.*]] = add i32 [[A:%.*]], -2
; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.usub.sat.i32(i32 2, i32 [[A]])		; CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.usub.sat.i32(i32 2, i32 [[A]])
; CHECK-NEXT: [[TMP2:%.*]] = sub nsw i32 0, [[TMP1]]		; CHECK-NEXT: [[TMP2:%.*]] = sub i32 0, [[TMP1]]
; CHECK-NEXT: call void @usei32(i32 [[SUB]])		; CHECK-NEXT: call void @usei32(i32 [[SUB]])
; CHECK-NEXT: ret i32 [[TMP2]]		; CHECK-NEXT: ret i32 [[TMP2]]
;		;
%cmp = icmp ult i32 %a, 2		%cmp = icmp ult i32 %a, 2
%sub = add i32 %a, -2		%sub = add i32 %a, -2
%sel = select i1 %cmp, i32 %sub, i32 0		%sel = select i1 %cmp, i32 %sub, i32 0
call void @usei32(i32 %sub)		call void @usei32(i32 %sub)
ret i32 %sel		ret i32 %sel
[... 63 lines not shown ...]

llvm/test/Transforms/InstCombine/zext-bool-add-sub.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -instcombine -S | FileCheck %s		; RUN: opt < %s -instcombine -S | FileCheck %s

; rdar://11748024		; rdar://11748024

define i32 @a(i1 zeroext %x, i1 zeroext %y) {		define i32 @a(i1 zeroext %x, i1 zeroext %y) {
; CHECK-LABEL: @a(		; CHECK-LABEL: @a(
; CHECK-NEXT: [[CONV3_NEG:%.*]] = sext i1 [[Y:%.*]] to i32		; CHECK-NEXT: [[TMP1:%.*]] = sext i1 [[Y:%.*]] to i32
; CHECK-NEXT: [[SUB:%.*]] = select i1 [[X:%.*]], i32 2, i32 1		; CHECK-NEXT: [[SUB:%.*]] = select i1 [[X:%.*]], i32 2, i32 1
; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[SUB]], [[CONV3_NEG]]		; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[SUB]], [[TMP1]]
; CHECK-NEXT: ret i32 [[ADD]]		; CHECK-NEXT: ret i32 [[ADD]]
;		;
%conv = zext i1 %x to i32		%conv = zext i1 %x to i32
%conv3 = zext i1 %y to i32		%conv3 = zext i1 %y to i32
%conv3.neg = sub i32 0, %conv3		%conv3.neg = sub i32 0, %conv3
%sub = add i32 %conv, 1		%sub = add i32 %conv, 1
%add = add i32 %sub, %conv3.neg		%add = add i32 %sub, %conv3.neg
ret i32 %add		ret i32 %add
[... 71 lines not shown ...]
;		;
%add = add <2 x i32> %zext, <i32 42, i32 23>		%add = add <2 x i32> %zext, <i32 42, i32 23>
ret <2 x i32> %add		ret <2 x i32> %add
}		}

declare void @use(i64)		declare void @use(i64)

define i64 @zext_negate(i1 %A) {		define i64 @zext_negate(i1 %A) {
; CHECK-LABEL: @zext_negate(		; CHECK-LABEL: @zext_negate(
; CHECK-NEXT: [[SUB:%.*]] = sext i1 [[A:%.*]] to i64		; CHECK-NEXT: [[TMP1:%.*]] = sext i1 [[A:%.*]] to i64
; CHECK-NEXT: ret i64 [[SUB]]		; CHECK-NEXT: ret i64 [[TMP1]]
;		;
%ext = zext i1 %A to i64		%ext = zext i1 %A to i64
%sub = sub i64 0, %ext		%sub = sub i64 0, %ext
ret i64 %sub		ret i64 %sub
}		}

define i64 @zext_negate_extra_use(i1 %A) {		define i64 @zext_negate_extra_use(i1 %A) {
; CHECK-LABEL: @zext_negate_extra_use(		; CHECK-LABEL: @zext_negate_extra_use(
; CHECK-NEXT: [[EXT:%.*]] = zext i1 [[A:%.*]] to i64		; CHECK-NEXT: [[EXT:%.*]] = zext i1 [[A:%.*]] to i64
; CHECK-NEXT: [[SUB:%.*]] = sext i1 [[A]] to i64		; CHECK-NEXT: [[SUB:%.*]] = sub i64 0, [[EXT]]
; CHECK-NEXT: call void @use(i64 [[EXT]])		; CHECK-NEXT: call void @use(i64 [[EXT]])
; CHECK-NEXT: ret i64 [[SUB]]		; CHECK-NEXT: ret i64 [[SUB]]
;		;
%ext = zext i1 %A to i64		%ext = zext i1 %A to i64
%sub = sub i64 0, %ext		%sub = sub i64 0, %ext
call void @use(i64 %ext)		call void @use(i64 %ext)
ret i64 %sub		ret i64 %sub
}		}

define <2 x i64> @zext_negate_vec(<2 x i1> %A) {		define <2 x i64> @zext_negate_vec(<2 x i1> %A) {
; CHECK-LABEL: @zext_negate_vec(		; CHECK-LABEL: @zext_negate_vec(
; CHECK-NEXT: [[SUB:%.*]] = sext <2 x i1> [[A:%.*]] to <2 x i64>		; CHECK-NEXT: [[TMP1:%.*]] = sext <2 x i1> [[A:%.*]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[SUB]]		; CHECK-NEXT: ret <2 x i64> [[TMP1]]
;		;
%ext = zext <2 x i1> %A to <2 x i64>		%ext = zext <2 x i1> %A to <2 x i64>
%sub = sub <2 x i64> zeroinitializer, %ext		%sub = sub <2 x i64> zeroinitializer, %ext
ret <2 x i64> %sub		ret <2 x i64> %sub
}		}

define <2 x i64> @zext_negate_vec_undef_elt(<2 x i1> %A) {		define <2 x i64> @zext_negate_vec_undef_elt(<2 x i1> %A) {
; CHECK-LABEL: @zext_negate_vec_undef_elt(		; CHECK-LABEL: @zext_negate_vec_undef_elt(
; CHECK-NEXT: [[SUB:%.*]] = sext <2 x i1> [[A:%.*]] to <2 x i64>		; CHECK-NEXT: [[TMP1:%.*]] = sext <2 x i1> [[A:%.*]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[SUB]]		; CHECK-NEXT: ret <2 x i64> [[TMP1]]
;		;
%ext = zext <2 x i1> %A to <2 x i64>		%ext = zext <2 x i1> %A to <2 x i64>
%sub = sub <2 x i64> <i64 0, i64 undef>, %ext		%sub = sub <2 x i64> <i64 0, i64 undef>, %ext
ret <2 x i64> %sub		ret <2 x i64> %sub
}		}

define i64 @zext_sub_const(i1 %A) {		define i64 @zext_sub_const(i1 %A) {
; CHECK-LABEL: @zext_sub_const(		; CHECK-LABEL: @zext_sub_const(
[... 35 lines not shown ...]
;		;
%ext = zext <2 x i1> %A to <2 x i64>		%ext = zext <2 x i1> %A to <2 x i64>
%sub = sub <2 x i64> <i64 42, i64 undef>, %ext		%sub = sub <2 x i64> <i64 42, i64 undef>, %ext
ret <2 x i64> %sub		ret <2 x i64> %sub
}		}

define i64 @sext_negate(i1 %A) {		define i64 @sext_negate(i1 %A) {
; CHECK-LABEL: @sext_negate(		; CHECK-LABEL: @sext_negate(
; CHECK-NEXT: [[SUB:%.*]] = zext i1 [[A:%.*]] to i64		; CHECK-NEXT: [[TMP1:%.*]] = zext i1 [[A:%.*]] to i64
; CHECK-NEXT: ret i64 [[SUB]]		; CHECK-NEXT: ret i64 [[TMP1]]
;		;
%ext = sext i1 %A to i64		%ext = sext i1 %A to i64
%sub = sub i64 0, %ext		%sub = sub i64 0, %ext
ret i64 %sub		ret i64 %sub
}		}

define i64 @sext_negate_extra_use(i1 %A) {		define i64 @sext_negate_extra_use(i1 %A) {
; CHECK-LABEL: @sext_negate_extra_use(		; CHECK-LABEL: @sext_negate_extra_use(
; CHECK-NEXT: [[EXT:%.*]] = sext i1 [[A:%.*]] to i64		; CHECK-NEXT: [[EXT:%.*]] = sext i1 [[A:%.*]] to i64
; CHECK-NEXT: [[SUB:%.*]] = zext i1 [[A]] to i64		; CHECK-NEXT: [[SUB:%.*]] = sub i64 0, [[EXT]]
; CHECK-NEXT: call void @use(i64 [[EXT]])		; CHECK-NEXT: call void @use(i64 [[EXT]])
; CHECK-NEXT: ret i64 [[SUB]]		; CHECK-NEXT: ret i64 [[SUB]]
;		;
%ext = sext i1 %A to i64		%ext = sext i1 %A to i64
%sub = sub i64 0, %ext		%sub = sub i64 0, %ext
call void @use(i64 %ext)		call void @use(i64 %ext)
ret i64 %sub		ret i64 %sub
}		}

define <2 x i64> @sext_negate_vec(<2 x i1> %A) {		define <2 x i64> @sext_negate_vec(<2 x i1> %A) {
; CHECK-LABEL: @sext_negate_vec(		; CHECK-LABEL: @sext_negate_vec(
; CHECK-NEXT: [[SUB:%.*]] = zext <2 x i1> [[A:%.*]] to <2 x i64>		; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i1> [[A:%.*]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[SUB]]		; CHECK-NEXT: ret <2 x i64> [[TMP1]]
;		;
%ext = sext <2 x i1> %A to <2 x i64>		%ext = sext <2 x i1> %A to <2 x i64>
%sub = sub <2 x i64> zeroinitializer, %ext		%sub = sub <2 x i64> zeroinitializer, %ext
ret <2 x i64> %sub		ret <2 x i64> %sub
}		}

define <2 x i64> @sext_negate_vec_undef_elt(<2 x i1> %A) {		define <2 x i64> @sext_negate_vec_undef_elt(<2 x i1> %A) {
; CHECK-LABEL: @sext_negate_vec_undef_elt(		; CHECK-LABEL: @sext_negate_vec_undef_elt(
; CHECK-NEXT: [[SUB:%.*]] = zext <2 x i1> [[A:%.*]] to <2 x i64>		; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i1> [[A:%.*]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[SUB]]		; CHECK-NEXT: ret <2 x i64> [[TMP1]]
;		;
%ext = sext <2 x i1> %A to <2 x i64>		%ext = sext <2 x i1> %A to <2 x i64>
%sub = sub <2 x i64> <i64 0, i64 undef>, %ext		%sub = sub <2 x i64> <i64 0, i64 undef>, %ext
ret <2 x i64> %sub		ret <2 x i64> %sub
}		}

define i64 @sext_sub_const(i1 %A) {		define i64 @sext_sub_const(i1 %A) {
; CHECK-LABEL: @sext_sub_const(		; CHECK-LABEL: @sext_sub_const(
[... 62 lines not shown ...]
;		;
ret <2 x i8> %sub		ret <2 x i8> %sub
}		}

; NSW is preserved.		; NSW is preserved.

define <2 x i8> @sext_sub_vec_nsw(<2 x i8> %x, <2 x i1> %y) {		define <2 x i8> @sext_sub_vec_nsw(<2 x i8> %x, <2 x i1> %y) {
; CHECK-LABEL: @sext_sub_vec_nsw(		; CHECK-LABEL: @sext_sub_vec_nsw(
; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i1> [[Y:%.*]] to <2 x i8>		; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i1> [[Y:%.*]] to <2 x i8>
; CHECK-NEXT: [[SUB:%.*]] = add nsw <2 x i8> [[TMP1]], [[X:%.*]]		; CHECK-NEXT: [[SUB:%.*]] = add <2 x i8> [[TMP1]], [[X:%.*]]
; CHECK-NEXT: ret <2 x i8> [[SUB]]		; CHECK-NEXT: ret <2 x i8> [[SUB]]
;		;
%sext = sext <2 x i1> %y to <2 x i8>		%sext = sext <2 x i1> %y to <2 x i8>
%sub = sub nsw <2 x i8> %x, %sext		%sub = sub nsw <2 x i8> %x, %sext
ret <2 x i8> %sub		ret <2 x i8> %sub
}		}

; We favor the canonical zext+add over keeping the NUW.		; We favor the canonical zext+add over keeping the NUW.
[... 99 lines not shown ...]