This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold ashr-exact into a icmp-ugt.
ClosedPublic

Authored by nadav on Jan 13 2022, 2:28 PM.

Details

Summary

This commit optimizes the code sequence icmp-XXX (ashr-exact (X, C_1), C_2). Instcombine already implements this optimization for sgt, and this patch adds support to additional predicates. The transformation is legal for all predicates if the 'exact' flag is set, and to SGE, UGE, SLT, ULT when the exact flag is not present.

This pattern is found in the std::vector bounds checks code of the at() method.

Alive2 proof:
https://alive2.llvm.org/ce/z/JT_WL8

Diff Detail

Event Timeline

nadav created this revision.Jan 13 2022, 2:28 PM
nadav requested review of this revision.Jan 13 2022, 2:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 13 2022, 2:28 PM
nadav edited the summary of this revision. (Show Details)Jan 13 2022, 2:28 PM

Upload with full context please.

lebedev.ri edited the summary of this revision. (Show Details)Jan 13 2022, 2:37 PM
lebedev.ri added a reviewer: lebedev.ri.

Given exact ashr, this is a valid fold regardless of the predicate.
Non-exact is valid for ult/uge / slt/sge predicates.

nadav updated this revision to Diff 399842.EditedJan 13 2022, 5:03 PM
nadav edited the summary of this revision. (Show Details)

@lebedev.ri Good catch. Thank you. I verified that when 'exact' is present the transformation is valid for all predicates.

@craig.topper I added context to the patch. Thank you.

craig.topper added inline comments.Jan 13 2022, 5:10 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
2242 ↗(On Diff #399842)

Can ICMP_SGE and ICMP_UGE get here? We're starting from a shift by constant that should be canonicalized to SGT and UGT right?

2242 ↗(On Diff #399842)

Oops we're starting with a compare with constant that should be canonicalized to SGT to UGT if it was SGE or UGE.

craig.topper added inline comments.Jan 13 2022, 5:19 PM
lib/Transforms/InstCombine/InstCombineCompares.cpp
2250 ↗(On Diff #399842)

Does this transform work for ICMP_UGT?

craig.topper added inline comments.Jan 13 2022, 5:29 PM
test/Transforms/InstCombine/icmp-shr-lt-gt.ll
2219 ↗(On Diff #399842)

Top of the test context is missing. Should this test be generated by update_test_checks.py?

nadav updated this revision to Diff 399871.Jan 13 2022, 7:44 PM

@craig.topper thank you for the review. I removed the two predicated that are canonicalized before this method is called (ICMP_SGE and ICMP_UGE). I also used the script update_test_checks.py to update the check lines. I also verified with alive2 that the transformation is correct for ICMP_UGT, but only when the exact flag is present (link: https://alive2.llvm.org/ce/z/qBhxAM).

@craig.topper thank you for the review. I removed the two predicated that are canonicalized before this method is called (ICMP_SGE and ICMP_UGE). I also used the script update_test_checks.py to update the check lines. I also verified with alive2 that the transformation is correct for ICMP_UGT, but only when the exact flag is present (link: https://alive2.llvm.org/ce/z/qBhxAM).

Thanks Nadav. My UGT question was directed at the transform immediately below. Where it’s looking for SGT and a constant that is 1 less. That would be the form that UGE would canonicalize to I think. Something like this https://alive2.llvm.org/ce/z/6EXVFs

nadav marked an inline comment as done.Jan 13 2022, 9:42 PM

@craig.topper Ah. I think that the ICMP_UGT transformation is correct, but if you don't mind I prefer to land the two changes separately. I can start a second review for the ICMP_UGT change. Is that okay with you?

@craig.topper Ah. I think that the ICMP_UGT transformation is correct, but if you don't mind I prefer to land the two changes separately. I can start a second review for the ICMP_UGT change. Is that okay with you?

That's ok with me.

Logic looks good.
Please pre-commit the new tests with baseline CHECK lines.
It would be nice to have at least one test with a vector type and splat constants, so we have coverage for that possibility. I'd also add at least one test with an extra use of the ashr, so we know that doesn't impede the transform.

lib/Transforms/InstCombine/InstCombineCompares.cpp
2243–2244 ↗(On Diff #399871)

This comment should be updated to be more general. Author's pref for notation/formatting, but it's something like this:

// When ShAmtC can be shifted losslessly:
// (X s>> ShAmtC) < C --> X < (C << ShAmtC)  (either signed/unsigned)
// (X s_exact>> ShAmtC) [AnyPred] C --> X [AnyPred] (C << ShAmtC)
test/Transforms/InstCombine/icmp-shr.ll
716 ↗(On Diff #399871)

Here and below: remove stale test comments to avoid confusion.

@spatel Thank you for the review. I added the pre-commit tests in a new diff (https://reviews.llvm.org/D117338). I've also made the changes to the comment as you suggested. Are you willing to approve the diff or do you want to see the comment and do another round of review?

spatel accepted this revision.Jan 14 2022, 10:49 AM

@spatel Thank you for the review. I added the pre-commit tests in a new diff (https://reviews.llvm.org/D117338). I've also made the changes to the comment as you suggested. Are you willing to approve the diff or do you want to see the comment and do another round of review?

I don't have anything other than the minor feedback, so no, I don't need another round of review. LGTM.

This revision is now accepted and ready to land.Jan 14 2022, 10:49 AM
lebedev.ri added inline comments.Jan 14 2022, 11:01 AM
lib/Transforms/InstCombine/InstCombineCompares.cpp
2242 ↗(On Diff #399871)

This should be

if (IsExact || Pred == CmpInst::ICMP_SLT || Pred == CmpInst::ICMP_ULT) {
test/Transforms/InstCombine/icmp-shr-lt-gt.ll
2219 ↗(On Diff #399871)

Could you please ensure that there is a test for *every* comparison prediocate? (x2 because with/without exact)

@lebedev.ri Okay. I'll add the extra tests, but I'll do it in a separate diff because I've already landed the pre-commit diff that added the extra tests. I also need to look at the optimization that craig pointed out.

This revision was landed with ongoing or failed builds.Jan 14 2022, 12:59 PM
This revision was automatically updated to reflect the committed changes.

FYI we see internal test became flaky after this change under MSAN. It was resolved later by https://reviews.llvm.org/D117110 (@spatel). Unfortunately I don't have any additional information.