
[InstCombine] Negator - sink sinkable negations
ClosedPublic

Authored by lebedev.ri on Oct 3 2019, 10:27 AM.

Details

Summary

As we have discussed previously (e.g. in D63992 / D64090 / PR42457), the sub instruction
can almost be considered non-canonical. While we do convert sub %x, C -> add %x, -C,
we only sparsely do that for non-constants. But we should.

Here, I propose to interpret sub %x, %y as add (sub 0, %y), %x iff the negation can be sunk into %y.
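For intuition, the proposed interpretation is a plain two's-complement identity. A minimal non-LLVM sketch (using C++ unsigned wrapping arithmetic as a stand-in for the IR semantics):

```cpp
#include <cassert>
#include <cstdint>

// `sub %x, %y` and `add (sub 0, %y), %x` agree for every bit pattern,
// since unsigned (two's-complement) arithmetic wraps modulo 2^32.
uint32_t sub_direct(uint32_t x, uint32_t y) { return x - y; }

uint32_t add_of_negation(uint32_t x, uint32_t y) {
  uint32_t neg_y = 0u - y;  // the `sub 0, %y` that Negator tries to sink
  return neg_y + x;
}
```

The rewrite is only profitable when that `sub 0, %y` subsequently disappears, i.e. when the negation can be folded into `%y`'s defining instructions.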

This has some potential to cause endless combine loops (either around PHIs, or if there are some opposite transforms).
For the former, there's the -instcombine-negator-max-depth option to mitigate it, should this expose any such issues.
For the latter, if there are still any such opposing folds, we'd need to remove the colliding fold.
In any case, reproducers are welcome!
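The depth cutoff can be pictured as a bounded recursive walk; the sketch below is purely illustrative (the `Expr` type and `canSinkNegation` name are hypothetical, not the actual Negator API):

```cpp
#include <cassert>

// Hypothetical miniature of the idea behind -instcombine-negator-max-depth:
// recurse into an expression tree asking "can negation be sunk here?",
// and give up once the depth budget is exhausted, which bounds both the
// work done and the opportunity for pathological traversals.
struct Expr {
  enum Kind { Const, Var, Add, Sub } kind;
  const Expr *lhs = nullptr;
  const Expr *rhs = nullptr;
};

bool canSinkNegation(const Expr *e, unsigned depth, unsigned maxDepth) {
  if (depth > maxDepth)
    return false;  // budget exhausted: conservatively bail
  switch (e->kind) {
  case Expr::Const:
    return true;   // -C folds to another constant
  case Expr::Sub:
    return true;   // -(a - b) is just (b - a), no new instruction
  case Expr::Add:  // -(a + b) == (-a) + (-b): both operands must negate
    return canSinkNegation(e->lhs, depth + 1, maxDepth) &&
           canSinkNegation(e->rhs, depth + 1, maxDepth);
  case Expr::Var:
    return false;  // would require an explicit negation instruction
  }
  return false;
}
```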

Diff Detail

Event Timeline

lebedev.ri marked an inline comment as done.
lebedev.ri edited the summary of this revision. (Show Details)

Thank you for taking a look!
Also handle trunc, fix comments.

If that doesn't happen, then they are 'trivially' dead and deleted.

Deleted where, exactly? If you're expecting the instcombine main loop to delete them, you'll force instcombine into an infinite loop, I think.

Hmm, they are not added to instcombine's worklist unless we succeed in negating
the entire tree, and this passes the test-suite with no infinite looping.

You are saying that we should instead DCE instructions in *our* worklist if we fail, correct?

Erase newly-created/inserted instruction if negation failed.

@efriedma - do you want to continue reviewing?

I've just pointed out a few nits for now.

llvm/lib/Transforms/InstCombine/InstCombineInternal.h
1020

typo: attempt

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
169

it's -> its

183

it's -> its

193

negatible one -> negatible if one

193

it's -> its

244

typo: temporarily

llvm/test/Transforms/InstCombine/sub-of-negatible.ll
159–161

I didn't follow the diffs here - was one of these tests redundant? The code comment didn't match before, but it still doesn't?

272–273

Add these tests with baseline results as pre-commit?

352

it's -> its

lebedev.ri updated this revision to Diff 232090. Dec 4 2019, 5:00 AM
lebedev.ri marked 10 inline comments as done.

@efriedma - do you want to continue reviewing?

I've just pointed out a few nits for now.

Nits addressed.

There are some more patterns that aren't handled here.
The idea is to ideally move them all here.

llvm/test/Transforms/InstCombine/sub-of-negatible.ll
159–161

The test was too complicated; it was checking more than the minimal pattern - subtraction can be freely negated by swapping its operands.
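That "freely negated" claim is the wrap-around identity -(a - b) == b - a; a quick non-LLVM check in C++:

```cpp
#include <cassert>
#include <cstdint>

// A subtraction is "freely negatible": swapping the operands negates it
// without adding any instruction (holds for all inputs under wrapping).
uint32_t negated_sub(uint32_t a, uint32_t b) { return 0u - (a - b); }
uint32_t swapped_sub(uint32_t a, uint32_t b) { return b - a; }
```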

spatel added a comment. Dec 4 2019, 9:11 AM

Do you have stats on how often this fires? Impact on compile-time?

Remove more specific pattern-matching from InstCombiner::visitSub() simultaneously with adding this more general functionality, so we don't have redundancy (and limit compile-time impact)?

lebedev.ri updated this revision to Diff 232222. Dec 4 2019, 3:09 PM

NOT READY FOR REVIEW

Remove more specific pattern-matching from InstCombiner::visitSub() simultaneously with adding this more general functionality, so we don't have redundancy (and limit compile-time impact)?

I was hoping to do that in steps, but I guess that's one way to boost those stats :))
I think I have moved everything relevant from InstCombiner::visitSub() now.
Observations:

  • There's an artificial one-use restriction that needs to go away (when Depth=0 and we are looking at sub 0, %x)
  • We lose/do not propagate no-wrap flags
  • The InstCombiner::visitAdd() change is needed because of how good we get at sinking negations - else there are two opposite folds. It should be done beforehand, separately.
  • Same with the InstCombiner::foldAddWithConstant() change; it's not entirely related to this diff.
  • There are some other regressions.

Rebased, slightly better, but not there yet still.

lebedev.ri updated this revision to Diff 232532. Dec 6 2019, 4:25 AM

Still not for review

Some more unreachable code removed from InstCombiner::visitSub()

Maybe you can change the title to [WIP] or [NOT READY FOR REVIEW]?

xbolva00 added inline comments. Dec 6 2019, 4:41 AM
llvm/test/Transforms/InstCombine/sub.ll
1354–1355

Remove FIXME?

Rebased.

Finally understood the interfering transforms - this fundamentally interferes with D48754.
Yes, I see the PhaseOrdering regressions if we revert that.
Not yet sure how to deal with this; I see 4 options:

  1. don't do such canonicalization (what I've done here)
  2. don't implement Negator
  3. add an Abs IR instruction (I think not)
  4. don't try to sink negation when it's used by an abs/nabs pattern

This doesn't hang check-llvm/test-suite.

Handle PHIs, too.
Strangely, I distinctly recall seeing some preexisting test
causing an endless cycle, but I'm no longer observing that problem.

xbolva00 added inline comments. Apr 11 2020, 4:11 PM
llvm/test/Transforms/PhaseOrdering/min-max-abs-cse.ll
39 ↗(On Diff #256801)

Regression

lebedev.ri marked 2 inline comments as done.
lebedev.ri edited the summary of this revision. (Show Details)

And solve the potential endless combine cycle by not trying to sink negations
if the instruction has a user that looks like an abs/nabs pattern.

I think this is it for now, so review is welcome.

llvm/test/Transforms/PhaseOrdering/min-max-abs-cse.ll
39 ↗(On Diff #256801)
  1. add Abs IR instruction (i think not)

Is the answer the same for an intrinsic? The more we see these kinds of problems, the more I wish we had created intrinsics for abs/smin/smax/umin/umax long ago. We keep trying to work around this limitation in IR, but it doesn't seem worth it. We have abs() functions in source and an abs node in codegen, so we're creating intermediate ops that don't exist on either side of IR.

lebedev.ri added a comment (edited). Apr 12 2020, 8:36 AM
  1. add Abs IR instruction (i think not)

Is the answer the same for an intrinsic?

Ack, I meant instruction/intrinsic interchangeably here.

The more we see these kinds of problems, the more I wish we had created intrinsics for abs/smin/smax/umin/umax long ago. We keep trying to work around this limitation in IR, but it doesn't seem worth it. We have abs() functions in source and an abs node in codegen, so we're creating intermediate ops that don't exist on either side of IR.

Note that we similarly don't have saturating multiplication or overflowing/saturating left-shift.

  1. add Abs IR instruction (i think not)

Is the answer the same for an intrinsic?

Ack, I meant instruction/intrinsic interchangeably here.

We have grown more accepting of intrinsics (overflow, saturating, funnel, etc.) recently, so I'm not sure if the old arguments against an abs intrinsic still hold. Is there anything in particular about abs() that makes it different?

The more we see these kinds of problems, the more I wish we had created intrinsics for abs/smin/smax/umin/umax long ago. We keep trying to work around this limitation in IR, but it doesn't seem worth it. We have abs() functions in source and an abs node in codegen, so we're creating intermediate ops that don't exist on either side of IR.

Note that we similarly don't have saturating multiplication or overflowing/saturating left-shift.

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

nikic added a comment. Apr 13 2020, 1:50 PM

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

+1 for new abs/min/max intrinsics

It'd be great if we could separate the new-intrinsics discussion from this patch.

  1. add Abs IR instruction (i think not)

Is the answer the same for an intrinsic?

Ack, I meant instruction/intrinsic interchangeably here.

We have grown more accepting of intrinsics (overflow, saturating, funnel, etc.) recently, so I'm not sure if the old arguments against an abs intrinsic still hold. Is there anything in particular about abs() that makes it different?

No, I just don't want to condition this patch on the introduction of a whole new set of intrinsics :]

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

Sounds like general agreement for the people on this review (and sorry for going off-topic for this specific patch). I'll post something to llvm-dev after trying to dig up the earlier discussions for those intrinsics.

And looks like we have this month's infinite loop here :) -
https://bugs.llvm.org/show_bug.cgi?id=45539

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

Sounds like general agreement for the people on this review (and sorry for going off-topic for this specific patch). I'll post something to llvm-dev after trying to dig up the earlier discussions for those intrinsics.

Note that I've posted https://github.com/AliveToolkit/alive2/pull/353 with my vision of their modelling.
Notably, I don't think we should have nabs, and I believe abs should be similar to cttz/ctlz
in the sense that it should take a second param - an i1 NSW flag.
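A sketch of what such a flag-carrying abs could mean semantically (illustrative C++ only; `abs_with_flag` and the nullopt-as-poison modeling are my assumptions, mirroring how cttz/ctlz take an is_zero_poison i1):

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Model: abs(x, nsw). With the nsw flag set, negating INT32_MIN would
// overflow, so the result is treated as poison (modeled as std::nullopt).
// With nsw clear, it wraps: abs(INT32_MIN) stays INT32_MIN.
std::optional<int32_t> abs_with_flag(int32_t x, bool nsw) {
  if (x == INT32_MIN) {
    if (nsw)
      return std::nullopt;  // poison: -INT32_MIN is not representable
    return INT32_MIN;       // wrapping negation of INT32_MIN is itself
  }
  return x < 0 ? -x : x;
}
```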

Yes, there's no clear line that I know of. It's like IR canonicalization - we make the rules up as we go along and adapt the surrounding logic to work with the current format. I think we've overcome the limitation for this patch, so we don't need to gate this patch on a decision, but I think we still have the problem. The clearest sign is that -- within instcombine -- we have bailouts for otherwise accepted canonicalizations only if we "matchSelectPattern()".

FWIW I agree that we need to reevaluate this decision and at least introduce min/max intrinsics. These are special-cased in too many places for the current treatment as a simple compare and select to still make sense. Especially when we take into account that there seems to be a new min/max related infinite loop every other month, and that the icmp/select representation has ill-defined behavior when it comes to undef. I expect the effort for introducing these intrinsics will be pretty similar to saturating add/sub, and would be happy to chip in...

I think I'd like to try to handle that, since out of the people in this discussion
only I (well, and @xbolva00) haven't dealt with that before; it may be good to spread the knowledge.

Ping. What would it take to get this moving? :)

nikic added a comment. Apr 20 2020, 2:51 AM

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

The change does not seem to have a cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code size, so this transform is clearly doing (or enabling) something big there. It might be interesting (but not necessary) to take a quick look at what happens there.

I was actually trying to get that info, and I'm not sure what step I'm missing other than pushing the [[ https://github.com/LebedevRI/llvm-project/tree/perf/instcombine-negator | perf/* ]] branch?

The change does not seem to have a cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code size, so this transform is clearly doing (or enabling) something big there. It might be interesting (but not necessary) to take a quick look at what happens there.

Ah, interesting.
I actually expected this to have a measurable negative (bad) cost,
so it's a pleasant surprise to see beneficial numbers here :)

nikic added a comment. Apr 20 2020, 5:41 AM

I was actually trying to get that info, and I'm not sure what step I'm missing other than pushing the [[ https://github.com/LebedevRI/llvm-project/tree/perf/instcombine-negator | perf/* ]] branch?

I need to manually add your fork as a remote first, so branches get picked up (too many forks to listen to all of them). I've done that now.

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

The change does not seem to have a cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code size, so this transform is clearly doing (or enabling) something big there. It might be interesting (but not necessary) to take a quick look at what happens there.

Yes, it would be good to derive a regression test from that benchmark and/or invent some larger tests that show the greater optimization power of the new code. Unless I missed it, all of the current test diffs show that we do no harm, but if we can show that the added code/complexity buys us something immediately, that makes the benefit clear.

Compile-time numbers look good: http://llvm-compile-time-tracker.com/compare.php?from=f52e0507574b4fd84dc4674536f5dfbab396c0f6&to=0a009b654793dee8e335c053eb043e297071e0d1&stat=instructions

The change does not seem to have a cost above noise, apart from a -0.6% improvement on tramp3d-v4. There's a corresponding -0.74% reduction in code size, so this transform is clearly doing (or enabling) something big there.
It might be interesting (but not necessary) to take a quick look at what happens there.

Yes, it would be good to derive a regression test from that benchmark


The impact there is quite noticeable as per llvm-diff, which almost immediately crashes.
It appears a lot more inlining happens (functions now missing in new.ll),
and some more function 'specialization' (functions now appearing in new.ll).
I'm not sure I can distill/filter that into any reasonable test case.

and/or invent some larger tests that show the greater optimization power of the new code.
Unless I missed it, all of the current test diffs show that we do no harm, but if we can show that the added code/complexity buys us something immediately, that makes the benefit clear.

The claim I'm making in the patch's description is that we almost consider the sub instruction
non-canonical, and we should be trying to fold it away as much as possible.
These test cases show the common patterns, which we currently miss, where we can get rid of it.

This approach is what was requested in

This *is* unusual. Is this too ugly to live?

I'd prefer to actually transform the operation tree at the point we decide it's profitable, to make it clear we can't end up in an infinite loop or something like that. As it is, you're depending on some other transforms happening in a particular order, and it's not clear that will happen consistently. (Yes, it's a little more code, but I think that's okay.)

spatel accepted this revision. Apr 20 2020, 1:02 PM
spatel added a reviewer: xbolva00.

Given that we have compile-time data now, LGTM.
The implementation goes beyond my normal casual C++ usage (e.g., I'd never seen 'zip' before), so if someone else could take a 2nd/final look too, that would be great.

llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
134

Remove/update comment.

This revision is now accepted and ready to land. Apr 20 2020, 1:02 PM
nikic added inline comments. Apr 20 2020, 1:51 PM
llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181

Noticed while looking through the tramp3d-v4 diff: This should be behind a one-use check, to avoid duplicating expensive division instructions.

xbolva00 added inline comments. Apr 20 2020, 1:52 PM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
2005

Just wondering if we could use a better name for this lambda than “cleanup”.

lebedev.ri added inline comments. Apr 20 2020, 2:19 PM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
2005

I couldn't come up with a better one; any suggestions?
TryToNarrowDeduceFlags()?
This is where goto might make sense, but somehow I don't want to use it.

xbolva00 added inline comments. Apr 20 2020, 2:48 PM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
2005

Your idea is fine I think.

lebedev.ri added inline comments. Apr 20 2020, 2:52 PM
llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181

I was hesitant about this one indeed; it isn't a typo that the one-use check is gone here,
because we generally consider only instruction count.

@spatel thoughts?

But the main, bigger question this touches is:
"but what if all the uses would get negated by us?
In the future, can we somehow sanely model the whole negatible tree,
not giving up at non-single-use instructions,
but deferring that until after we've finished building the new tree?"

spatel added inline comments. Apr 21 2020, 5:52 AM
llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181

I missed that logic difference, and I'm not getting a reviewable diff of the attached files with llvm-diff or other apps. Can you create an IR example/regression test for that?

lebedev.ri added inline comments. Apr 21 2020, 6:06 AM
llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181

@spatel
To be clear, the "bigger question" is pretty rhetorical, or at least not for this review.

The actual question here is whether we should consider sdiv special,
and even if we can negate it without increasing the instruction count,
whether we should only do so when there are no other uses of the old sdiv.
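The sdiv case being debated: negating a division by a constant is instruction-count-free because you negate the constant instead; but if the original sdiv has other users, the old division stays alive and the rewrite duplicates an expensive op. A sketch of the identity, using C++ truncating division as a stand-in for sdiv (the INT64_MIN overflow corner is excluded):

```cpp
#include <cassert>
#include <cstdint>

// `0 - (x / C)` == `x / (-C)` under round-toward-zero division, for any
// nonzero C, as long as neither side overflows (x == INT64_MIN together
// with C == 1 or C == -1 are the problematic corners).
int64_t div_then_negate(int64_t x, int64_t c) { return 0 - (x / c); }
int64_t negate_divisor(int64_t x, int64_t c) { return x / (0 - c); }
```

So with a one-use check in place, the transform stays neutral on instruction count; without it, a second division can appear.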

spatel added inline comments. Apr 21 2020, 6:45 AM
llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181

There is precedent for this kind of special treatment. In other words, not all opcodes are equal in terms of analysis (and the secondary concern of codegen), and we will even increase instruction count to avoid some, like div/rem (although those transforms are probably currently not safe with respect to poison).

If it would be NFC-ish to keep the one-use check, then we should do that. Then remove the limitation as a follow-up if it can be shown to be useful?

lebedev.ri marked 11 inline comments as done.
lebedev.ri edited the summary of this revision. (Show Details)

Updated: adjust comments, lambda name, guard sdiv with an artificial one-use-check.

lebedev.ri added inline comments. Apr 21 2020, 10:50 AM
llvm/lib/Transforms/InstCombine/InstCombineNegator.cpp
181

Alright, I'll add the one-use check just to get this moving :)

This revision was automatically updated to reflect the committed changes.
kcc added a subscriber: kcc.

Hi,

This change causes a performance regression in tsan, as detected on our LLVM buildbot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49850/steps/tsan%20analyze/logs/stdio

The script that comes with tsan checks the number of PUSHes, etc., in some key tsan functions,
where each extra PUSH causes tsan to be slower.

With this change, the number of PUSHes went from 3 to 4.

Please take a look, this might be a performance regression for a wider set of targets.

Before your change:

read1 tot 484; size 1830; rsp 1; push 3; pop 15; call 2; load 24; store  9; sh  46; mov 106; lea   2; cmp  76

After your change:

read1 tot 515; size 1980; rsp 1; push 4; pop 4; call 2; load 24; store  9; sh  46; mov 113; lea   2; cmp  90

Script to reproduce (in the llvm-project root dir, with "build" subdir)

#!/bin/bash

compile() {

clang -c -O2  compiler-rt/lib/tsan/rtl/tsan_rtl.cpp -I compiler-rt/lib -Wall -std=c++14 -Wno-unused-parameter -O2 -g -DNDEBUG    -m64 -fno-lto -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -O3 -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -Wno-non-virtual-dtor -fPIE -fno-rtti -msse3 -Wframe-larger-than=530 -Wglobal-constructors

}

git checkout a13dce1d90cba6c55252dee0a2600eab37ffbc44
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1

git checkout 352fef3f11f5ccb2ddc8e16cecb7302a54721e9f
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1

In D68408#1998112, @kcc wrote:

Hi,

Hi.

This change causes a performance regression in tsan, as detected on our LLVM buildbot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49850/steps/tsan%20analyze/logs/stdio

Looks like the build was red already: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49808
That explains why I didn't see the new failure: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/49809

The script that comes with tsan checks the number of PUSHes, etc., in some key tsan functions,
where each extra PUSH causes tsan to be slower.

With this change, the number of PUSHes went from 3 to 4.

Please take a look, this might be a performance regression for a wider set of targets.

Before your change:

read1 tot 484; size 1830; rsp 1; push 3; pop 15; call 2; load 24; store  9; sh  46; mov 106; lea   2; cmp  76

After your change:

read1 tot 515; size 1980; rsp 1; push 4; pop 4; call 2; load 24; store  9; sh  46; mov 113; lea   2; cmp  90

Interesting. Not very unexpected - there's always the possibility of an avalanche effect with IR changes.

Unhelpful answer: wow, how things have regressed since rL342092 / D51985.

Script to reproduce (in the llvm-project root dir, with "build" subdir)

#!/bin/bash

compile() {

clang -c -O2  compiler-rt/lib/tsan/rtl/tsan_rtl.cpp -I compiler-rt/lib -Wall -std=c++14 -Wno-unused-parameter -O2 -g -DNDEBUG    -m64 -fno-lto -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -O3 -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -Wno-non-virtual-dtor -fPIE -fno-rtti -msse3 -Wframe-larger-than=530 -Wglobal-constructors

}

git checkout a13dce1d90cba6c55252dee0a2600eab37ffbc44
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1

git checkout 352fef3f11f5ccb2ddc8e16cecb7302a54721e9f
(cd build; ninja clang 2> /dev/null)
compile
compiler-rt/lib/tsan/analyze_libtsan.sh tsan_rtl.o | grep read1


The llvm-diff report is pretty large; IR instruction-wise, this appears to be a win overall (-639/+609 instructions).
Visually, I think I can spot only two IR patterns that we now fail to fold:

in function _ZN6__tsan10InitializeEPNS_11ThreadStateE:
  in block %if.then88.i:
    >   %22 = xor i64 %xor.i.i.i, -17592186044417
    >   %mul.i.i194.neg.i = add i64 %22, 1
    >   %sub.i = add i64 %mul.i.i194.neg.i, %mul.i.i203.i
    <   %mul.i.i194.i = xor i64 %xor.i.i.i, 17592186044416
    <   %sub.i = sub i64 %mul.i.i203.i, %mul.i.i194.i
  in block %if.then88.1.i:
    >   %35 = xor i64 %xor.i.i.1.i, -17592186044417
    >   %mul.i.i194.neg.1.i = or i64 %mul.i.i203.1.i, 1
    >   %sub.1.i = add i64 %mul.i.i194.neg.1.i, %35
    <   %mul.i.i194.1.i = xor i64 %xor.i.i.1.i, 17592186044416
    <   %sub.1.i = sub i64 %mul.i.i203.1.i, %mul.i.i194.1.i

Filed https://bugs.llvm.org/show_bug.cgi?id=45647

But I believe I'm supposed to look at the @__tsan_read1 function, right?
Then the relevant diff is:

in function __tsan_read1:
  in block %if.then.i1390.i.i.i:
    >   %sub.neg.i1385.i.i.i = sub nsw i64 %and.i19.i1382.i.i.i, %and.i.i1380.i.i.i
    >   %sub.neg.highbits.i1388.i.i.i = lshr i64 %sub.neg.i1385.i.i.i, %and.i.i.i1387.i.i.i
    >   %cmp7.i1389.i.i.i = icmp ne i64 %sub.neg.highbits.i1388.i.i.i, 0
    <   %sub6.i1387.i.i.i = sub nsw i64 0, %sub.i1383.i.i.i
    <   %sub6.highbits.i1388.i.i.i = lshr i64 %sub6.i1387.i.i.i, %and.i.i.i1386.i.i.i
    <   %cmp7.i1389.i.i.i = icmp ne i64 %sub6.highbits.i1388.i.i.i, 0
        %cmp.i1378.i.i.i = icmp ult i64 %xor.i1422.i.i.i, 1125899906842624
    >   %or.cond.i.i.i = or i1 %cmp.i1378.i.i.i, %cmp7.i1389.i.i.i
    >   br i1 %or.cond.i.i.i, label %do.body226.i.i.i, label %if.end86.i.i.i
    <   %or.cond.i.i.i = or i1 %cmp.i1378.i.i.i, %cmp7.i1389.i.i.i
    <   br i1 %or.cond.i.i.i, label %do.body226.i.i.i, label %if.end86.i.i.i

So we've traded 0 - %sub.i1383.i.i.i for %and.i19.i1382.i.i.i - %and.i.i1380.i.i.i.
That's it; [un]fortunately, there is nothing else going on.
But thankfully, that explains the problem well.
Pushed rG5a159ed2a8e5a9a6ced73f78e4c64b01d76d3493.
Thanks.