This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp
ClosedPublic

Authored by amehsan on Oct 3 2016, 11:03 AM.

Details

Summary

This is motivated by PR-30483. New testcases and comments in the code explain the opportunities. Still more work can be done for this. I will open new PR for remaining cases (including those that Sanjay mentioned in the PR 30483)

Diff Detail

Event Timeline

amehsan updated this revision to Diff 73303.Oct 3 2016, 11:03 AM
amehsan retitled this revision from to [InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp.
amehsan updated this object.
amehsan added reviewers: majnemer, spatel.
amehsan added a subscriber: llvm-commits.
amehsan added a subscriber: kbarton.Oct 4 2016, 3:30 PM

I know we want to expand the patterns that are matched beyond what's included in this patch, but it's not clear to me why we need IsAnyBitSet() for these cases. Eg, there are no tests for the 'icmp ugt' case.

Is it simpler to just use more direct pattern matching for now and then see how it can be recoded as we add cases?

I'm also wondering if there's some way to reuse the existing decomposeBitTestICmp() or the code in foldSelectInstWithICmp() that is marked with a FIXME. We also have isSignBitCheck() and isSignTest() in InstCombineCompares.cpp. There must be some way to refactor all of this logic because we're looking for similar patterns in several different places?

test/Transforms/InstCombine/and-or-icmps.ll
54–60

You can use utils/update_test_checks.py to auto-generate better checks for all of these tests. That's how the existing checks in this file were produced.

majnemer added inline comments.Oct 5 2016, 11:26 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
744–750

A tuple with multiple bool types can be a little confusing. I'd prefer a little struct. At the very least, this needs to be documented.

751

Please follow the LLVM naming convention.

752–754

Prefer early return to reduce indentation.

752–755

Prefer auto *

756–769

Prefer std::initialzier_list syntax here to reduce visual noise.

762–765

These parenthesis are not necessary.

831–832

auto *

833

Please stick the return on its own line.

I know we want to expand the patterns that are matched beyond what's included in this patch, but it's not clear to me why we need IsAnyBitSet() for these cases. Eg, there are no tests for the 'icmp ugt' case.

Is it simpler to just use more direct pattern matching for now and then see how it can be recoded as we add cases?

I'm also wondering if there's some way to reuse the existing decomposeBitTestICmp() or the code in foldSelectInstWithICmp() that is marked with a FIXME. We also have isSignBitCheck() and isSignTest() in InstCombineCompares.cpp. There must be some way to refactor all of this logic because we're looking for similar patterns in several different places?

IsAnyBitSet factors out common pattern matching logic in FoldAndOFICmp and FoldXorOfICmp. I can add testcase for UGT case. DecomposeBitTestICmp will be helpful to be called for extending the current functionality. Currently IsAnyBitSet expects patterns similar to those generated by decomposeBitTestICmp so I believe we can first call decomposeBitTestICmp and then IsAnyBitSet to extend the functionality. I prefer to leave that as an extension of the current functionality. isSignBitCheck is a similar but different functionality because it looks at a single special bit. IsAnyBitSet looks at masks that might have more than one bit. I also looked at the FIXME in simplifySelectWithICmpCond if that is what you mean. It says the logic is similar to decomposeBitTestICmp. So given what I said about decomposeBitTestICmp this is a separate issue.

amehsan marked 9 inline comments as done.Oct 24 2016, 1:05 PM
amehsan added inline comments.
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
744–750

I figured that I will need to add some comments even if I have a struct, because the field names will not be clear enough. So I concluded creating a struct will not be a significant help to clarify things and so I added explanation to the comments before the function to explain each component of the return value. Will update the patch shortly.

amehsan updated this revision to Diff 75639.Oct 24 2016, 1:23 PM
amehsan edited edge metadata.
amehsan marked an inline comment as done.

Applied the comments marked as DONE.

amehsan added inline comments.Oct 24 2016, 3:07 PM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
744–750

Actually, I changed my mind about this. Will add an struct and update the patch.

amehsan updated this revision to Diff 75762.Oct 25 2016, 12:24 PM

returning a struct from the helper function instead of std::tuple

majnemer added inline comments.Oct 28 2016, 1:09 PM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
760

auto *

762

Er, shouldn't this be !Inst || Inst->getOpcode() != Instruction::And

amehsan updated this revision to Diff 76737.EditedNov 2 2016, 10:35 AM

Thanks David for the review. I have tested this with test-suite. I forgot to do so, after I made first set of changes.

amehsan marked 2 inline comments as done.Nov 2 2016, 10:36 AM

ping :-)

Could someone take a look at this?

@spatel @eli.friedman Could one of you please look at this patch? @majnemer said he is busy and will be even busier next week.

@spatel @eli.friedman Could one of you please look at this patch? @majnemer said he is busy and will be even busier next week.

To be clear, @majnemer said that one of you guys can look into this instead of him.

spatel added inline comments.Nov 18 2016, 4:27 PM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
747–748

Is the Valid field necessary? The callers can always just check if (!Mask) ?

755–758

I've scanned over this a few times, and it just never reads clearly to me. Can you add comments and example patterns, so we'll know how this will be used?

762–763

This was not updated to address the earlier review comment.

765

Please use m_APInt here or add a TODO comment because this won't work for vectors.

776–778

The UGT case still has no test coverage?

831–832

Same as above - use m_APInt or add a TODO about the lack of vector support.

837

Can we check that RHSCst == LHSCst and remove the getBitWidth checks below?

841–843

This needs a comment that shows the pattern we are matching and what it folds to.

936

Can we check that RHSCst == LHSCst and remove the getBitWidth checks below?

942–948

This needs an explanatory comment with an example.

test/Transforms/InstCombine/and-or-icmps.ll
54–56

Each test should have a comment to explain the motivation. Ie, it's good to have negative tests, but I can't tell from glancing at these why we would do the transform in some cases and skip it in the others.

amehsan marked 7 inline comments as done.Nov 28 2016, 3:53 PM

Will shortly update the patch.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
755–758

added a comment. I hope that make is it more readable.

837

See the respond to a similar comment below.

936

getBitWidth is used for the mask which is operand of the and operator. RHSCst and LHSCst are operands of icmp instructions. I double checked the set of conditions that I have used below and they are all required.

942–948

Example is provided in lines 945-946. (line numbers may change after updating the review). Will revise the comment to be more general and contains more details.

amehsan updated this revision to Diff 79473.Nov 28 2016, 4:42 PM
amehsan marked an inline comment as done.

Thanks for adding the code and test comments - this is much easier to follow now. See inline comments for a couple of small items.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
758–766

Simplify?

ConstantInt *Mask;
if (!match(LHS, m_And(m_Value(), m_ConstantInt(Mask))))
  return BGC;
837

Sorry for not seeing it, but I don't know how it's possible that:

(*BGC1.Mask == *BGC2.Mask) && (BGC1.Mask->getBitWidth() != BGC2.Mask->getBitWidth())

Can you add a test case to show where that happens?

To be clear, my proposal is to change the initial check:

if (RHSCst->isZero() && LHSCst->isZero())

to:

if (RHSCst->isZero() && LHSCst == RHSCst)

And then remove the bit width check from the later 'if' statement. The operand types of icmp and xor/and must match, so I thought that would be sufficient?

2795

Period at end of sentence.

amehsan marked an inline comment as not done.Nov 29 2016, 1:08 PM
amehsan added inline comments.
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
758–766

Sure.

837

operator== that is called with (*BGC1.Mask == *BGC2.Mask) has an assertion that the two masks has the same bitwidth. IIUC, if we want to do LHSCst == RHSCst the same operator== is called, and so we would need to check the bit widths of RHSCst and LHSCst. So overall it does not make a difference.

2795

Sure.

spatel added inline comments.Nov 29 2016, 1:38 PM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
837

*BGC1.Mask == *BGC2.Mask is the APInt operator== and yes, it asserts on bitwidth.
LHSCst == RHSCst is just an equality check of 2 pointers. Ie, I think there can only be one '0' value of a given bitwidth.

amehsan added inline comments.Nov 29 2016, 1:42 PM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
837

Sorry, I misinterpreted the code. Will change.

amehsan updated this revision to Diff 79897.Dec 1 2016, 5:22 AM
spatel added a comment.Dec 1 2016, 9:00 AM

Sorry for not noticing this earlier, but InstCombine always canonicalizes the 'icmp ugt X, 0' cases to 'icmp ne X, 0' (see foldICmpUsingKnownBits()).

Can you just assert that the predicate is not 'ugt' in isAnyBitSet() and simplify that a bit more? Ie, we don't need a switch if we know that eq/ne are the only predicate possibilities for this transform.

spatel added a comment.Dec 1 2016, 9:12 AM

Can you just assert that the predicate is not 'ugt' in isAnyBitSet() and simplify that a bit more? Ie, we don't need a switch if we know that eq/ne are the only predicate possibilities for this transform.

"assert" is probably too strong...I've made that mistake before. :)
But I think it's still safe to ignore the 'ugt' case and assume that InstCombine will churn that away sooner or later.

amehsan added a comment.EditedDec 1 2016, 11:10 AM

Can you just assert that the predicate is not 'ugt' in isAnyBitSet() and simplify that a bit more? Ie, we don't need a switch if we know that eq/ne are the only predicate possibilities for this transform.

"assert" is probably too strong...I've made that mistake before. :)
But I think it's still safe to ignore the 'ugt' case and assume that InstCombine will churn that away sooner or later.

You are right (I just tried it :-) If that is the only concern is it possible to approve this and I will remove that line from the switch stmt. (I think it does not hurt to leave the testcases though) and commit.

spatel accepted this revision.Dec 1 2016, 11:19 AM
spatel edited edge metadata.

You are right (I just tried it :-) If that is the only concern is it possible to approve this and I will remove that line from the switch stmt. (I think it does not hurt to leave the testcases though) and commit.

Yes, it's good to keep the tests to make sure the underlying assumption doesn't break.
LGTM - although I don't think we need a switch for just 2 cases. :)

This revision is now accepted and ready to land.Dec 1 2016, 11:19 AM

You are right (I just tried it :-) If that is the only concern is it possible to approve this and I will remove that line from the switch stmt. (I think it does not hurt to leave the testcases though) and commit.

Yes, it's good to keep the tests to make sure the underlying assumption doesn't break.
LGTM - although I don't think we need a switch for just 2 cases. :)

Sure :-) Will change it to if statement after removing UGT