This is an archive of the discontinued LLVM Phabricator instance.

Differential D48828

[InstSimplify] fold extracting from std::pair (1/2)
Closed, Public

Authored by inouehrs on Jul 2 2018, 6:03 AM.

Details

Reviewers

• dberlin
spatel
efriedma
echristo
kbarton
nemanjai
lebedev.ri

Commits

rG02f79eae0689: [InstSimplify] fold extracting from std::pair (1/2)
rL338485: [InstSimplify] fold extracting from std::pair (1/2)

Summary

This patch intends to enable jump threading for a method whose return type is std::pair<int, bool> or std::pair<bool, int>.
For example, jump threading currently does not work for the if statement in func.

std::pair<int, bool> callee(int v) {
  int a = dummy(v);
  if (a) return std::make_pair(dummy(v), true);
  else return std::make_pair(v, v < 0);
}

int func(int v) {
  std::pair<int, bool> rc = callee(v);
  if (rc.second) {
    // do something
  }
  // ...
}

SROA, which runs before inlining, replaces std::pair with i64 without splitting it in both callee and func, because at that point SROA sees no access to the individual fields.
After inlining, jump threading fails to identify that the incoming value is a constant because of the additional instructions (such as or, and, trunc).

This patch series adds two patterns to InstructionSimplify that fold the extraction of std::pair members. To help jump threading, we actually need to optimize a code sequence spanning multiple basic blocks.
These patches do not handle phi nodes themselves, but the additional patterns help the NewGVN pass, which calls instsimplify to look for opportunities to simplify instructions over a phi, to apply its phi-of-ops optimization, which in turn lets jump threading succeed.

Later, the CFG simplification pass makes a similar code modification, but that is too late to help jump threading.
This patch series replaces my old patch D44626, which did a similar optimization in InstCombine, as suggested by reviewers.

The first patch in the series handles code sequences that merge two values using shl and or and then extract one value using lshr.

Alive proof for the shift case: https://rise4fun.com/Alive/epNB, mask Y case: https://rise4fun.com/Alive/vgH
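
The shift fold described in the summary can be checked on plain integers. The following is a minimal sketch in standalone C++, not the LLVM implementation: `packPair` and `extractHigh` are hypothetical helper names, and 64-bit unsigned arithmetic stands in for `i64` IR values. It assumes the `shl` is `nuw` (no set bit of X is shifted out) and that Y fits in the low A bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of `or (shl nuw X, A), Y` on i64 values.
// The preconditions mirror the fold: 0 < A < 64, the shift loses no bits
// of X (nuw), and Y occupies only the low A bits.
inline uint64_t packPair(uint64_t X, uint64_t Y, unsigned A) {
  assert(A > 0 && A < 64);
  assert((X >> (64 - A)) == 0 && "shl would not be nuw");
  assert((Y >> A) == 0 && "Y must fit in the low A bits");
  return (X << A) | Y;
}

// Hypothetical model of `lshr V, A`.
inline uint64_t extractHigh(uint64_t V, unsigned A) {
  assert(A < 64);
  return V >> A;
}
```

Under those preconditions, `extractHigh(packPair(X, Y, A), A)` gives back X, which is the value the fold returns without emitting any instruction.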

Diff Detail

Event Timeline

inouehrs created this revision. Jul 2 2018, 6:03 AM

Herald added a subscriber: Prazek. Jul 2 2018, 6:03 AM

Looks nice!

The InstSimplify change needs its own test set in test/Transforms/InstSimplify.

lib/Analysis/InstructionSimplify.cpp
1320–1324	I think this shouldn't talk about C++ here. Just talk about IR.
	// Given Op0 l>> Op1.
	// If Op1 is a constant, and Op0 is (X nuw<< Op1) | Y
	// If Y l>> Op1 == 0, we can extract `X` without extra instructions.
1327	Extra unneeded brace `()`
1327	What about commutativity? This should be `m_c_Or`.
1332	Why do we check this? I think `nuw` already implies that.

add more test cases
make the algorithm more general using m_c_Or instead of m_Or

inouehrs marked 3 inline comments as done. Jul 16 2018, 7:03 AM

inouehrs added inline comments.

lib/Analysis/InstructionSimplify.cpp
1320–1324	Is this better? I mention std::pair as an example.
1327	Modified. Thank you for pointing this out.
1332	Fixed.

lebedev.ri added inline comments. Jul 16 2018, 7:32 AM

lib/Analysis/InstructionSimplify.cpp
1320–1324	There are LLVM IR structures https://llvm.org/docs/LangRef.html#structure-types Is the comment talking about them? Huh, why does it operate on integers then? And other thoughts the readers will have later on. I'd concentrate on IR.
1328–1332	Can you add a comment explaining what/how this does? (Checks that `(%x << %op1) | %y` does not touch any bits in `%x`)
1331	But this isn't the width of `Y`, that's `Y->getType()->getScalarSizeInBits()`. Maybe effective width of `Y`.
test/Transforms/InstSimplify/pair.ll
1 ↗	(On Diff #155670)	Use `./utils/update_test_checks.py` please.

All of the tests for instsimplify are already folded if you run -instcombine. Could the motivating problem also be considered a pass ordering bug?

If this is going to be an instsimplify patch, then I agree with the previous feedback: the code comments should use IR examples and explain what is happening to analyze/transform the IR instructions. C++ examples just confuse things.

update comments and test cases

@lebedev.ri @spatel
Thank you so much for the advice. I have avoided mentioning C++ in the comment.

@lebedev.ri

Use ./utils/update_test_checks.py please.

I updated the test cases using update_test_checks.py and moved them into existing .ll files, since pair.ll is no longer a good file name.

@spatel
The simplified test cases can be optimized by the current instcombine pass. But to enable jump threading for std::pair, which is the original motivation of the patch, we must apply this folding over a phi node, as discussed in https://reviews.llvm.org/D44626.
Executing the jump threading pass again after CFG simplification may catch this opportunity, but I think it is better to do jump threading early to help other optimizers.

In D48828#1164928, @inouehrs wrote:

In D48828#1163506, @spatel wrote:

All of the tests for instsimplify are already folded if you run -instcombine. Could the motivating problem also be considered a pass ordering bug?

@spatel
The simplified test cases can be optimized by current instcombine pass.

I do agree that it is concerning that instcombine already handles this.
However this pattern really looks like something for instsimplify.

So i guess my question is, what in instcombine does this fold?
Is it very general, and this is only one of the cases it handles?
If not, maybe it should be refactored into instsimplify.

In D48828#1164928, @inouehrs wrote:

But to enable jump threading for std::pair, which is the original motivation of the patch,
we must apply this folding over a phi node as discussed in https://reviews.llvm.org/D44626.

Can you explain that in layman terms?
I don't see anything phi-related in instsimplify changes.

Executing jump threading pass again after CFG simplification may catch this opportunity. But I think it is better to do jump threading early to help other optimizers.

lib/Analysis/InstructionSimplify.cpp
1860	I do not understand. Why is this only handling the case where `Y` is `bool`?

inouehrs updated this revision to Diff 156241. Jul 19 2018, 5:22 AM

So i guess my question is, what in instcombine does this fold?
Is it very general, and this is only one of the cases it handles?
If not, maybe it should be refactored into instsimplify.

For the shl->or->lshr case, instcombine first identifies that the or is redundant and eliminates it. Then the shl/lshr pair is eliminated. I think that is not general compared with my code.
For the shl->or->and case, instcombine does a more general but more costly analysis in SimplifyDemandedInstructionBits. Doing the same analysis here seems too costly, but I have expanded the scope of my code to cover non-boolean cases.

Can you explain that in layman terms?
I don't see anything phi-related in instsimplify changes.

To help jump threading, we actually need to optimize a code sequence spanning multiple BBs. An example of the shl->or->and case looks like this:

BB1:
  %shl = shl nuw i64 %val, 32
  %or = or i64 %shl, 1
  br label %BB2
BB2:
  %phi = phi i64 [ %or, %BB1 ], ...
  %and = and i64 %phi, 1

The current instcombine cannot optimize such cases.
My instsimplify patch does not handle phi by itself. NewGVN calls instsimplify to look for opportunities to simplify instructions over a phi.

lib/Analysis/InstructionSimplify.cpp
1860	I made the code more generic. What we really need to check is that this AND op selects all bits of X or Y, and no bits from the other.

Some more comments.
Please enhance test coverage, and note the comment about a possible miscompile.

In D48828#1167921, @inouehrs wrote:

In D48828#1166791, @lebedev.ri wrote:

Can you explain that in layman terms?
I don't see anything phi-related in instsimplify changes.

To help jump threading, we actually need to optimize a code sequence spanning multiple BBs. An example of the shl->or->and case looks like this:

BB1:
  %shl = shl nuw i64 %val, 32
  %or = or i64 %shl, 1
  br label %BB2
BB2:
  %phi = phi i64 [ %or, %BB1 ], ...
  %and = and i64 %phi, 1

The current instcombine cannot optimize such cases.
My instsimplify patch does not handle phi by itself. NewGVN calls instsimplify to look for opportunities to simplify instructions over a phi.

Aha, that is the bit i was looking for. Please update the differential's description with that.

lib/Analysis/InstructionSimplify.cpp
1329–1330	Please add some negative test where this fails.
1860	"What we really need to check is that this AND op selects all bits of X or Y, and no bit from the another." That was exactly my point :)
1869	I think you can use `const APInt& Mask` here.
1871	But it seems like this only supports extraction of bit-wide values? Please at least add a `FIXME` comment then.
1878–1879	Aha. This will miscompile if you are operating on types wider than `i64`. Please add tests with wider types (`i128`, with elements of `i64`, e.g.), and use `APInt` in these calculations.
1889	This also needs some negative tests.
test/Transforms/InstSimplify/AndOrXor.ll
994 ↗	(On Diff #156241)	Please also add two tests with `i32 | i32` - test extracting low part, and test high part.
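
The concern about types wider than `i64` can be illustrated without LLVM. The sketch below is a model under stated assumptions, not code from the patch: `U128` and `lowBitsSet128` are hypothetical names, and the function only mimics what `APInt::getLowBitsSet(128, N)` would produce for a 128-bit value. It shows that a mask covering more than 64 bits cannot be held in a single 64-bit integer, which is presumably why the review asks for `APInt` arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit unsigned value, stored as two 64-bit words.
struct U128 {
  uint64_t Lo;
  uint64_t Hi;
};

// Models APInt::getLowBitsSet(128, N): the low N bits set, 0 <= N <= 128.
inline U128 lowBitsSet128(unsigned N) {
  assert(N <= 128);
  auto lowMask64 = [](unsigned K) -> uint64_t {
    // K is in [0, 64]; the K == 0 case avoids an undefined shift by 64.
    return K == 0 ? 0 : (~0ULL >> (64 - K));
  };
  if (N <= 64)
    return {lowMask64(N), 0};
  return {~0ULL, lowMask64(N - 64)};
}
```

For an `i128` built from two `i64` elements, the mask for the low element fills `Lo` completely, and any wider mask needs bits in `Hi` that a `uint64_t` computation would silently drop.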

fix a bug with an integer larger than 64 bit
add more test cases
remove an unnecessary check

inouehrs marked 11 inline comments as done. Jul 23 2018, 6:08 AM

inouehrs added inline comments.

lib/Analysis/InstructionSimplify.cpp
1871	Actually, this is an unnecessary check. I removed it to increase the optimization opportunities.
1878–1879	Right. Fixed using APInt.
test/Transforms/InstSimplify/AndOrXor.ll
994 ↗	(On Diff #156241)	More tests (including negative tests) added.

inouehrs marked 3 inline comments as done. Jul 23 2018, 6:09 AM

lebedev.ri added inline comments. Jul 23 2018, 6:12 AM

lib/Analysis/InstructionSimplify.cpp
1869	I was indeed specifically talking about `const APInt&`, not `const APInt`, note the `&`.

inouehrs updated this revision to Diff 156765. Jul 23 2018, 6:21 AM

inouehrs updated this revision to Diff 156767. Jul 23 2018, 6:24 AM

inouehrs added inline comments.

lib/Analysis/InstructionSimplify.cpp
1869	Fixed. Thank you for the repeated comments.

Hmm, this looks about right.
SimplifyRightShift() change looks good,
but i have hopefully a last portion of comments on the SimplifyAndInst()..

lib/Analysis/InstructionSimplify.cpp
1866–1868	Value *Y, *Shift;
	if (isa<ConstantInt>(Op1) &&
	    match(Op0, m_c_Or(m_CombineAnd(m_NUWShl(m_Value(X), m_APInt(ShAmt)),
	                                   m_Value(Shift)),
	                      m_Value(Y)))) {
	}
1878	Pedantically, i still somehow don't like these calculations :/ I think at least this should be:
	const APInt EffBitsY = APInt::getLowBitsSet(Width, EffWidthY);
	Not sure how to more nicely express the `EffBitsX`.
1882–1884	Please see if `APInt::intersects()` and `APInt::isSubsetOf()` could be used here.
1885–1888	return Shift;
test/Transforms/InstSimplify/AndOrXor.ll
973 ↗	(On Diff #156767)	Please give names to all these variables in all the tests, prefix the numeric id with tmp - `s/%/%tmp/`.
1078–1091 ↗	(On Diff #156767)	Could you please move this to before the negative tests?

addressed the comments from @lebedev.ri

lebedev.ri edited the summary of this revision. Jul 26 2018, 12:39 PM

lebedev.ri set the repository for this revision to rL LLVM.

lebedev.ri edited the summary of this revision. Jul 26 2018, 1:13 PM

Yeah, ok, i've convinced myself that this is ok. LGTM.
I do believe that the reasoning for this to be in instsimplify is sound (gvn needs it, running instcombine won't cut it.)

@inouehrs please commit tests now (as of the current trunk, so as not to anger the bots).
And please wait a bit (2 days?) before committing the transform itself (and the test changes themselves) in case @spatel / others want to comment.

lib/Analysis/InstructionSimplify.cpp
1881	Hmm, right, nice! Even fits within 80-char width :)
1884–1887	`isSubsetOf()`: "This operation checks that all bits set in this APInt are also set in RHS." So we check that the mask covers all the possibly-set bits of `X` / `Y`.
	`intersects()`: "This operation tests if there are any pairs of corresponding bits between this APInt and RHS that are both set." And we are checking that the mask does not cover any possibly-set bits of `Y` / `X`. Looks about right.
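
The two predicates discussed in this comment can be modelled on 64-bit words. This is a sketch under assumptions: `isSubsetOf` and `intersects` below are free functions written for illustration that mimic the documented behaviour of the `APInt` members of the same name, and `maskSelectsOnly` is a hypothetical helper, not a function from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models APInt::isSubsetOf: every bit set in A is also set in B.
inline bool isSubsetOf(uint64_t A, uint64_t B) { return (A & ~B) == 0; }

// Models APInt::intersects: A and B share at least one set bit.
inline bool intersects(uint64_t A, uint64_t B) { return (A & B) != 0; }

// Hypothetical helper: Mask keeps every possibly-set bit of the wanted
// operand and none of the possibly-set bits of the other operand.
inline bool maskSelectsOnly(uint64_t WantedBits, uint64_t OtherBits,
                            uint64_t Mask) {
  return isSubsetOf(WantedBits, Mask) && !intersects(OtherBits, Mask);
}
```

With X living in the high 32 bits and a one-bit Y in bit 0, a mask of `1` selects only Y, while a mask that also touches bit 32 does not pass the check.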

This revision is now accepted and ready to land. Jul 26 2018, 1:25 PM

inouehrs mentioned this in rL338107: [InstSimplify] tests for D48828: fold extraction from std::pair. Jul 27 2018, 12:21 AM

In D48828#1177277, @lebedev.ri wrote:

Yeah, ok, i've convinced myself that this is ok. LGTM.
I do believe that the reasoning for this to be in instsimplify is sound (gvn needs it, running instcombine won't cut it.)

@inouehrs please commit tests now (as of the current trunk, so as not to anger the bots).

Yes, I'd also like to see the tests get committed now with baseline CHECKs.
We've made a good argument for having this in instsimplify, but I don't think we answered a related question: does adding either or both of these to instsimplify allow us to remove code from instcombine? Hopefully, there are existing regression tests for instcombine to verify its transforms - would those tests pass when this patch is applied?
This is 2 independent patches in 1 review (1 transform for 'lshr'; 1 transform for 'and'). It's best if we split them into separate patches.
At least some of the simplify tests do not appear to be minimized. I would expect that all tests end with 'lshr' or 'and' since that is where the pattern matching starts.
Why does the code use isa<ConstantInt> rather than match(op, m_APInt(C))? Using the matcher would give us vector splat functionality for free IIUC.

inouehrs updated this revision to Diff 157850. Jul 28 2018, 6:10 AM

Yes, I'd also like to see the tests get committed now with baseline CHECKs.

I have committed unit tests in https://reviews.llvm.org/rL338107

We've made a good argument for having this in instsimplify, but I don't think we answered a related question: does adding either or both of these to instsimplify allow us to remove code from instcombine?

I have investigated related instcombine patterns, but so far I have not found anything redundant with this instsimplify patch.
The pattern for 'and' is handled by SimplifyDemandedInstructionBits in instcombine, which has much wider coverage than this pattern.
The pattern for 'lshr' is handled by a combination of multiple patterns for or and shifts in instcombine. My instsimplify patch fully covers neither of these instcombine patterns.

This is 2 independent patches in 1 review (1 transform for 'lshr'; 1 transform for 'and'). It's best if we split them into separate patches.

I will commit the transformations for lshr and for and separately.

At least some of the simplify tests do not appear to be minimized. I would expect that all tests end with 'lshr' or 'and' since that is where the pattern matching starts.

I fixed the test. I intend to make the generated code simpler.

Why does the code use isa<ConstantInt> rather than match(op, m_APInt(C))? Using the matcher would give us vector splat functionality for free IIUC.

Fixed. Thank you for the suggestion.

In D48828#1179343, @inouehrs wrote:

Yes, I'd also like to see the tests get committed now with baseline CHECKs.

I have committed unit tests in https://reviews.llvm.org/rL338107

We've made a good argument for having this in instsimplify, but I don't think we answered a related question: does adding either or both of these to instsimplify allow us to remove code from instcombine?

I have investigated related instcombine patterns, but so far I have not found anything redundant with this instsimplify patch.
The pattern for 'and' is handled by SimplifyDemandedInstructionBits in instcombine, which has much wider coverage than this pattern.
The pattern for 'lshr' is handled by a combination of multiple patterns for or and shifts in instcombine. My instsimplify patch fully covers neither of these instcombine patterns.

This is 2 independent patches in 1 review (1 transform for 'lshr'; 1 transform for 'and'). It's best if we split them into separate patches.

I will commit the transformations for lshr and for and separately.

Commit - yes. But it wouldn't be bad to review (concurrently) these things as two differentials, too.

At least some of the simplify tests do not appear to be minimized. I would expect that all tests end with 'lshr' or 'and' since that is where the pattern matching starts.

I fixed the test. I intend to make the generated code simpler.

Why does the code use isa<ConstantInt> rather than match(op, m_APInt(C))? Using the matcher would give us vector splat functionality for free IIUC.

Fixed. Thank you for the suggestion.

I'm not sure of the exact problem (vector tests needed?), but added one nit.

lib/Analysis/InstructionSimplify.cpp
1323–1325	I'm not sure this is better, or the full fix (tests needed.) I would think you'd need
	const APInt *ShAmt0, *ShAmt1;
	if (match(Op1, m_APInt(ShAmt0)) &&
	    match(Op0, m_c_Or(m_NUWShl(m_Value(X), m_APInt(ShAmt1)), m_Value(Y))) &&
	    *ShAmt0 == *ShAmt1) {
	  const APInt *ShAmt = ShAmt1;
1883–1886	`intersects()` is commutative, so this shouldn't matter.

In D48828#1179348, @lebedev.ri wrote:

In D48828#1179343, @inouehrs wrote:

Why does the code use isa<ConstantInt> rather than match(op, m_APInt(C))? Using the matcher would give us vector splat functionality for free IIUC.

Fixed. Thank you for the suggestion.

I'm not sure of the exact problem (vector tests needed?), but added one nit.

Yes - I think we should have at least 1 minimal test with vector types for each transform, so we know if that's working as expected. I think it will work, but I haven't stepped through to confirm that.

Separate the patch into two; this one is the first of the two.
Add test cases with vector data type.

inouehrs added inline comments. Jul 30 2018, 4:59 AM

lib/Analysis/InstructionSimplify.cpp
1323–1325	Sorry, but I do not understand why you use `m_APInt` in the matcher and then compare the values instead of using `m_Specific`. What kind of code sequences do you want to cover with this?

inouehrs mentioned this in D49981: [InstSimplify] fold extracting from std::pair (2/2). Jul 30 2018, 5:18 AM

lebedev.ri added inline comments. Jul 30 2018, 5:34 AM

lib/Analysis/InstructionSimplify.cpp
1323–1325	I'm not sure it matters right now. `m_APInt()` could potentially (not right now) match splat constant with undef's - `<i32 42, i32 undef, i32 42>` But `m_Specific()` compares the pointers, not the underlying data. So if `ShAmt0` and `ShAmt1` are both splat, but have different `undef`s (e.g. only one of them has `undef` elements), they would not have the same constant. Theoretically, that code in my comment would still match this case. But this does not matter right now since `m_APInt()` does not accept constants with undef elements.
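
The distinction drawn in this comment, comparing pointers versus comparing the underlying data, can be shown with a toy model. This is only an illustration: `Const`, `sameObject`, and `sameValue` are hypothetical names and do not correspond to LLVM classes or matchers. The sketch shows the general C++ point that two distinct objects may carry equal values, so an identity check can fail where a value check succeeds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a constant operand.
struct Const {
  uint64_t Bits;
};

// Identity comparison: true only when both pointers name the same object.
// This is the kind of check the comment attributes to m_Specific().
inline bool sameObject(const Const *A, const Const *B) { return A == B; }

// Value comparison: true when the underlying data is equal, analogous to
// comparing the APInt values bound by two m_APInt() matchers.
inline bool sameValue(const Const *A, const Const *B) {
  return A->Bits == B->Bits;
}
```

As the comment notes, the difference did not matter for the patch at the time of the review; the value comparison is simply the more conservative choice.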

Thanks for splitting it up. This is close to good IMO - just a few minor points:

Please add/adjust the tests with baseline checks as a preliminary step; we don't want to lose those in case the code change gets reverted.
Seeing this diff on its own makes it clear that we're overspecifying the more general SimplifyDemandedBits transform (what about the case where the shifts are in the opposite order?). That should be noted as a code comment and in the commit message.
I don't expect any compile-time problems given that the computeKnownBits is buried under the other pattern checks, but be aware of that concern and watch for regressions.

It has been discussed before that SimplifyDemandedBits really shouldn't be included in InstCombine; it should be its own pass. If that structural change was made, would it make adjusting the optimization pipeline a more appealing solution than this?

test/Transforms/InstSimplify/shift.ll
218	Swap the 'or' operands here so we have coverage for the commuted case? In general, I like to see a test comment that points that out too, so we know what's changing between the tests. Also, it's a matter of taste, but the more common test format would put the test comment above the test definition, so it's clearly separated from the auto-generated CHECK lines.

inouehrs mentioned this in rL338350: [InstSimplify] tests for D48828, D49981: fold extraction from std::pair. Jul 30 2018, 10:11 PM

inouehrs mentioned this in rL338351: [InstSimplify] tests for D48828, D49981: fold extraction from std::pair. Jul 30 2018, 10:29 PM

inouehrs updated this revision to Diff 158182. Jul 31 2018, 12:44 AM

Please add/adjust the tests with baseline checks as a preliminary step; we don't want to lose those in case the code change gets reverted.

I have updated baseline checks.

Seeing this diff on its own makes it clear that we're overspecifying the more general SimplifyDemandedBits transform (what about the case where the shifts are in the opposite order?). That should be noted as a code comment and in the commit message.

I added comments. I will also mention in the commit message.

It has been discussed before that SimplifyDemandedBits really shouldn't be included in InstCombine; it should be its own pass. If that structural change was made, would it make adjusting the optimization pipeline a more appealing solution than this?

For the original motivating examples on jump threading, inter-BB optimization is needed (e.g. the code below). So if a SimplifyDemandedBits pass supported inter-BB as well as intra-BB optimization and were executed before jump threading, it would be more general than this patch.

BB1:
  %shl = shl nuw i64 1, 32
  %or = or i64 %shl, %v
  br label %BB2
BB2:
  %phi = phi i64 [ %or, %BB1 ], ...
  %shr = lshr i64 %phi, 32

lib/Analysis/InstructionSimplify.cpp
1323–1325	I see. I have rewritten the code as you suggested, for safety.

LGTM

Closed by commit rL338485: [InstSimplify] fold extracting from std::pair (1/2) (authored by inouehrs). Jul 31 2018, 9:41 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
lib/Analysis/InstructionSimplify.cpp	17 lines
test/Transforms/InstSimplify/shift.ll	12 lines

Diff 158182

lib/Analysis/InstructionSimplify.cpp

(first 1,311 lines not shown)
   if (Value *V = SimplifyRightShift(Instruction::LShr, Op0, Op1, isExact, Q,
                                     MaxRecurse))
     return V;

   // (X << A) >> A -> X
   Value *X;
   if (match(Op0, m_NUWShl(m_Value(X), m_Specific(Op1))))
     return X;

+  // ((X << A) | Y) >> A -> X if effective width of Y is not larger than A.
+  // We can return X as we do in the above case since OR alters no bits in X.
+  // SimplifyDemandedBits in InstCombine can do more general optimization for
+  // bit manipulation. This pattern aims to provide opportunities for other
+  // optimizers by supporting a simple but common case in InstSimplify.
+  Value *Y;
+  const APInt *ShRAmt, *ShLAmt;
+  if (match(Op1, m_APInt(ShRAmt)) &&
+      match(Op0, m_c_Or(m_NUWShl(m_Value(X), m_APInt(ShLAmt)), m_Value(Y))) &&
+      *ShRAmt == *ShLAmt) {
+    const KnownBits YKnown = computeKnownBits(Y, Q.DL, 0, Q.AC, Q.CxtI, Q.DT);
+    const unsigned Width = Op0->getType()->getScalarSizeInBits();
+    const unsigned EffWidthY = Width - YKnown.countMinLeadingZeros();
+    if (EffWidthY <= ShRAmt->getZExtValue())
+      return X;
+  }
+
   return nullptr;
 }

 Value *llvm::SimplifyLShrInst(Value *Op0, Value *Op1, bool isExact,
                               const SimplifyQuery &Q) {
   return ::SimplifyLShrInst(Op0, Op1, isExact, Q, RecursionLimit);
 }
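
The legality check in the new `lshr` fold can be modelled on concrete 64-bit values. The sketch below is illustrative only and rests on a simplifying assumption: where the patch asks `computeKnownBits` for a lower bound on the leading zeros of Y, this model counts the leading zeros of a known value exactly. `countLeadingZeros64` and `canFoldLShrOfOr` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for KnownBits::countMinLeadingZeros(). Here the
// value is fully known, so the count is exact rather than a lower bound.
inline unsigned countLeadingZeros64(uint64_t V) {
  unsigned N = 0;
  for (uint64_t Bit = 1ULL << 63; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Returns true when `((X << ShAmt) | Y) >> ShAmt` may be replaced by X,
// i.e. when the effective width of Y is not larger than ShAmt.
inline bool canFoldLShrOfOr(uint64_t Y, unsigned ShAmt) {
  assert(ShAmt < 64);
  const unsigned Width = 64;
  const unsigned EffWidthY = Width - countLeadingZeros64(Y);
  return EffWidthY <= ShAmt;
}
```

A Y of `0xFFFFFFFF` has an effective width of 32 and passes for a shift amount of 32, whereas a Y with bit 32 set has an effective width of 33 and is rejected.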

▲ Show 20 Lines • Show All 507 Lines • ▼ Show 20 Lines	static Value SimplifyAndInst(Value Op0, Value *Op1, const SimplifyQuery &Q,
// operating on all incoming values of the phi always yields the same value.		// operating on all incoming values of the phi always yields the same value.
if (isa<PHINode>(Op0) \|\| isa<PHINode>(Op1))		if (isa<PHINode>(Op0) \|\| isa<PHINode>(Op1))
if (Value *V = ThreadBinOpOverPHI(Instruction::And, Op0, Op1, Q,		if (Value *V = ThreadBinOpOverPHI(Instruction::And, Op0, Op1, Q,
MaxRecurse))		MaxRecurse))
return V;		return V;

return nullptr;		return nullptr;
}		}

		lebedev.riUnsubmitted Done Reply Inline Actions I do not understand. Why is this only handling the case where `Y` is `bool`? lebedev.ri: I do not understand. Why is this only handling the case where `Y` is `bool`?
		inouehrsAuthorUnsubmitted Done Reply Inline Actions I made the code more generic. What we really need to check is that this AND op selects all bits of X or Y, and no bit from the another. inouehrs: I made the code more generic. What we really need to check is that this AND op selects all bits…
		lebedev.riUnsubmitted Done Reply Inline Actions What we really need to check is that this AND op selects all bits of X or Y, and no bit from the another. That was exactly my point :) lebedev.ri: > What we really need to check is that this AND op selects all bits of X or Y, and no bit from…
Value llvm::SimplifyAndInst(Value Op0, Value *Op1, const SimplifyQuery &Q) {		Value llvm::SimplifyAndInst(Value Op0, Value *Op1, const SimplifyQuery &Q) {
return ::SimplifyAndInst(Op0, Op1, Q, RecursionLimit);		return ::SimplifyAndInst(Op0, Op1, Q, RecursionLimit);
}		}

/// Given operands for an Or, see if we can fold the result.		/// Given operands for an Or, see if we can fold the result.
/// If not, this returns null.		/// If not, this returns null.
static Value SimplifyOrInst(Value Op0, Value *Op1, const SimplifyQuery &Q,		static Value SimplifyOrInst(Value Op0, Value *Op1, const SimplifyQuery &Q,
unsigned MaxRecurse) {		unsigned MaxRecurse) {
		lebedev.riUnsubmitted Not Done Reply Inline Actions Value Y, Shift; if (isa<ConstantInt>(Op1) && match(Op0, m_c_Or(m_CombineAnd(m_NUWShl(m_Value(X), m_APInt(ShAmt)), m_Value(Shift)), m_Value(Y)))) { } lebedev.ri: ``` Value Y, Shift; if (isa<ConstantInt>(Op1) && match(Op0, m_c_Or(m_CombineAnd(m_NUWShl…
if (Constant *C = foldOrCommuteConstant(Instruction::Or, Op0, Op1, Q))		if (Constant *C = foldOrCommuteConstant(Instruction::Or, Op0, Op1, Q))
		lebedev.riUnsubmitted Done Reply Inline Actions I think you can use `const APInt& Mask` here. lebedev.ri: I think you can use `const APInt& Mask` here.
		lebedev.riUnsubmitted Not Done Reply Inline Actions I was indeed specifically talking about `const APInt&`, not `const APInt`, note the `&`. lebedev.ri: I was indeed specifically talking about `const APInt&`, not `const APInt`, note the `&`.
		inouehrsAuthorUnsubmitted Not Done Reply Inline Actions Fixed. Thank you for the repeated comments. inouehrs: Fixed. Thank you for the repeated comments.
return C;		return C;

		lebedev.riUnsubmitted Done Reply Inline Actions But it seems like this only supports extraction of a bit-wide values? Please at least add a `FIXME` comment then. lebedev.ri: But it seems like this only supports extraction of a bit-wide values? Please at least add a…
		inouehrsAuthorUnsubmitted Not Done Reply Inline Actions Actually, this is an unnecessary check. I removed it to increase the optimization opportunities. inouehrs: Actually, this is an unnecessary check. I removed it to increase the optimization opportunities.
// X \| undef -> -1		// X \| undef -> -1
// X \| -1 = -1		// X \| -1 = -1
// Do not return Op1 because it may contain undef elements if it's a vector.		// Do not return Op1 because it may contain undef elements if it's a vector.
if (match(Op1, m_Undef()) \|\| match(Op1, m_AllOnes()))		if (match(Op1, m_Undef()) \|\| match(Op1, m_AllOnes()))
return Constant::getAllOnesValue(Op0->getType());		return Constant::getAllOnesValue(Op0->getType());

  // X | X = X
lebedev.ri: Pedantically, i still somehow don't like these calculations :/ I think at least this should be: `const APInt EffBitsY = APInt::getLowBitsSet(Width, EffWidthY);` Not sure how to more nicely express the `EffBitsX`.
  // X | 0 = X
lebedev.ri: Aha. This will miscompile if you are operating on types wider than `i64`. Please add tests with wider types (`i128`, with elements of `i64`, e.g.), and use `APInt` in these calculations.
inouehrs: Right. Fixed using APInt.
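To illustrate the concern about types wider than `i64`, here is a minimal sketch of why a mask held in a 64-bit integer cannot describe an `i128` value. It does not reproduce the code under review (which is not shown here); `U128`, `lowBitsSet128`, and `truncTo64` are hypothetical names, and the two-word struct is only a stand-in for what `APInt` handles at arbitrary width.

```cpp
#include <cstdint>

// Hypothetical two-word model of an i128 bit pattern (illustration only).
struct U128 {
  std::uint64_t Hi, Lo;
};

// Mask with the low N bits set, for N in [0, 128]. This plays the role that
// APInt::getLowBitsSet(128, N) would play in real LLVM code.
constexpr U128 lowBitsSet128(unsigned N) {
  return N == 0   ? U128{0, 0}
         : N < 64 ? U128{0, (std::uint64_t{1} << N) - 1}
         : N == 64 ? U128{0, ~std::uint64_t{0}}
         : N < 128 ? U128{(std::uint64_t{1} << (N - 64)) - 1, ~std::uint64_t{0}}
                   : U128{~std::uint64_t{0}, ~std::uint64_t{0}};
}

// What a uint64_t-based calculation would effectively see: only the low word.
constexpr std::uint64_t truncTo64(U128 V) { return V.Lo; }

// "Low 64 bits" and "all 128 bits" are different masks ...
static_assert(lowBitsSet128(64).Hi != lowBitsSet128(128).Hi, "masks differ");
// ... but they become indistinguishable once squeezed into 64 bits.
static_assert(truncTo64(lowBitsSet128(64)) == truncTo64(lowBitsSet128(128)),
              "truncation hides the difference");
```

A fold that decides legality from the truncated value could therefore treat the low half of an `i128` as if it were the whole value, which is the kind of miscompile the requested `i128` tests are meant to catch.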
  if (Op0 == Op1 || match(Op1, m_Zero()))
    return Op0;
lebedev.ri: Hmm, right, nice! Even fits within 80-char width :)

  // A | ~A = ~A | A = -1
  if (match(Op0, m_Not(m_Specific(Op1))) ||
lebedev.ri: Please see if `APInt::intersects()` and `APInt::isSubsetOf()` could be used here.
      match(Op1, m_Not(m_Specific(Op0))))
    return Constant::getAllOnesValue(Op0->getType());
lebedev.ri: `intersects()` is commutative, so this shouldn't matter.

lebedev.ri:
> `isSubsetOf()` - This operation checks that all bits set in this APInt are also set in RHS.
So we check that the mask covers all the possibly-set bits of `X` / `Y`.
> `intersects()` - This operation tests if there are any pairs of corresponding bits between this APInt and RHS that are both set.
And we are checking that the mask does not cover any possibly-set bits of the `Y` / `X`.
Looks about right.
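The two checks described in this comment can be sketched on plain 64-bit integers. This is only a model: the free functions `isSubsetOf` and `intersects` below imitate the documented behaviour of the `APInt` methods of the same name, and `maskKeepsOnlyX` is a hypothetical helper, not a function from the patch. `EffBitsX` and `EffBitsY` stand for the possibly-set bits of the two operands, following the naming used in the review.

```cpp
#include <cstdint>

// Model of APInt::isSubsetOf: every bit set in Bits is also set in RHS.
constexpr bool isSubsetOf(std::uint64_t Bits, std::uint64_t RHS) {
  return (Bits & ~RHS) == 0;
}

// Model of APInt::intersects: at least one bit is set in both operands.
// The operation is symmetric, so argument order does not matter.
constexpr bool intersects(std::uint64_t Bits, std::uint64_t RHS) {
  return (Bits & RHS) != 0;
}

// Hypothetical helper: Mask extracts exactly X from (X | Y) when it covers
// all possibly-set bits of X and none of the possibly-set bits of Y.
constexpr bool maskKeepsOnlyX(std::uint64_t EffBitsX, std::uint64_t EffBitsY,
                              std::uint64_t Mask) {
  return isSubsetOf(EffBitsX, Mask) && !intersects(EffBitsY, Mask);
}

// X lives in the high 32 bits, Y in the low 32 bits.
static_assert(maskKeepsOnlyX(0xFFFFFFFF00000000ULL, 0x00000000FFFFFFFFULL,
                             0xFFFFFFFF00000000ULL),
              "mask covers X and misses Y");
// A mask that also touches bit 31 of Y must be rejected.
static_assert(!maskKeepsOnlyX(0xFFFFFFFF00000000ULL, 0x00000000FFFFFFFFULL,
                              0xFFFFFFFF80000000ULL),
              "mask leaks one bit of Y");
```

Using the two predicates keeps each condition readable on its own line, which is presumably why they were suggested over hand-written bit arithmetic.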
  // (A & ?) | A = A
lebedev.ri: `return Shift;`
  if (match(Op0, m_c_And(m_Specific(Op1), m_Value())))
lebedev.ri: This also needs some negative tests.
    return Op1;

  // A | (A & ?) = A
  if (match(Op1, m_c_And(m_Specific(Op0), m_Value())))
    return Op0;

  // ~(A & ?) | A = -1
  if (match(Op0, m_Not(m_c_And(m_Specific(Op1), m_Value()))))
... (3,242 lines not shown) ...

test/Transforms/InstSimplify/shift.ll

... (172 lines not shown) ...
;
  %s = sext <2 x i1> %x to <2 x i8>
  %r = shl <2 x i8> %y, %s
  ret <2 x i8> %r
}

define i64 @shl_or_shr(i32 %a, i32 %b) {
; CHECK-LABEL: @shl_or_shr(
; CHECK-NEXT: [[TMP1:%.*]] = zext i32 [[A:%.*]] to i64
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[B:%.*]] to i64
-; CHECK-NEXT: [[TMP3:%.*]] = shl nuw i64 [[TMP1]], 32
-; CHECK-NEXT: [[TMP4:%.*]] = or i64 [[TMP2]], [[TMP3]]
-; CHECK-NEXT: [[TMP5:%.*]] = lshr i64 [[TMP4]], 32
-; CHECK-NEXT: ret i64 [[TMP5]]
+; CHECK-NEXT: ret i64 [[TMP1]]
;
  %tmp1 = zext i32 %a to i64
  %tmp2 = zext i32 %b to i64
  %tmp3 = shl nuw i64 %tmp1, 32
  %tmp4 = or i64 %tmp2, %tmp3
  %tmp5 = lshr i64 %tmp4, 32
  ret i64 %tmp5
}
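The arithmetic behind the updated CHECK lines can be checked with a small C++ model of the IR above. This is a sketch only: `shlOrShrBy` is a hypothetical name, the shift amount is made a parameter for illustration, and the function needs C++14 or later because of the local variables in a `constexpr` body.

```cpp
#include <cstdint>

// Mirrors @shl_or_shr step by step, with the final shift amount as a parameter.
constexpr std::uint64_t shlOrShrBy(std::uint32_t A, std::uint32_t B,
                                   unsigned Sh) {
  std::uint64_t Tmp1 = A;           // %tmp1 = zext i32 %a to i64
  std::uint64_t Tmp2 = B;           // %tmp2 = zext i32 %b to i64
  std::uint64_t Tmp3 = Tmp1 << 32;  // %tmp3 = shl nuw i64 %tmp1, 32
  std::uint64_t Tmp4 = Tmp2 | Tmp3; // %tmp4 = or i64 %tmp2, %tmp3
  return Tmp4 >> Sh;                // %tmp5 = lshr i64 %tmp4, Sh
}

// Shifting right by 32 discards every bit contributed by B, leaving zext(A).
static_assert(shlOrShrBy(0xDEADBEEFu, 0xFFFFFFFFu, 32) == 0xDEADBEEFull,
              "result equals zext(A)");
// Shifting by 31 instead lets the top bit of B leak into the result,
// so the same simplification would be wrong.
static_assert(shlOrShrBy(1u, 0x80000000u, 31) == 3,
              "bit 31 of B survives the shift");
```

The second assertion corresponds to the kind of case a negative test has to cover: once the shift amount no longer clears all of the low operand, the expression cannot be replaced by the zero-extended `%a`.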
... (15 lines not shown) ...
;
  %tmp5 = lshr i64 %tmp4, 31
  ret i64 %tmp5
}

; Unit test for vector integer
define <2 x i64> @shl_or_shr1v(<2 x i32> %a, <2 x i32> %b) {
; CHECK-LABEL: @shl_or_shr1v(
; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i32> [[A:%.*]] to <2 x i64>
-; CHECK-NEXT: [[TMP2:%.*]] = zext <2 x i32> [[B:%.*]] to <2 x i64>
-; CHECK-NEXT: [[TMP3:%.*]] = shl nuw <2 x i64> [[TMP1]], <i64 32, i64 32>
-; CHECK-NEXT: [[TMP4:%.*]] = or <2 x i64> [[TMP3]], [[TMP2]]
-; CHECK-NEXT: [[TMP5:%.*]] = lshr <2 x i64> [[TMP4]], <i64 32, i64 32>
-; CHECK-NEXT: ret <2 x i64> [[TMP5]]
+; CHECK-NEXT: ret <2 x i64> [[TMP1]]
;
  %tmp1 = zext <2 x i32> %a to <2 x i64>
  %tmp2 = zext <2 x i32> %b to <2 x i64>
  %tmp3 = shl nuw <2 x i64> %tmp1, <i64 32, i64 32>
  %tmp4 = or <2 x i64> %tmp3, %tmp2
spatel: Swap the 'or' operands here so we have coverage for the commuted case? In general, I like to see a test comment that points that out too, so we know what's changing between the tests. Also, it's a matter of taste, but the more common test format would put the test comment above the test definition, so it's clearly separated from the auto-generated CHECK lines.
  %tmp5 = lshr <2 x i64> %tmp4, <i64 32, i64 32>
  ret <2 x i64> %tmp5
}

; Negative unit test for vector integer
define <2 x i64> @shl_or_shr2v(<2 x i32> %a, <2 x i32> %b) {
; CHECK-LABEL: @shl_or_shr2v(
; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i32> [[A:%.*]] to <2 x i64>
... (13 lines not shown) ...

Diff 158182
