This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold x + (x | -x) to x & (x - 1)
ClosedPublic

Authored by marcauberer on Sep 6 2022, 8:34 AM.

Details

Summary

Fixes #57531

This transformation may be particularly useful on x86-64, because x & (x - 1) can be performed by a single blsr instruction.

Diff Detail

Event Timeline

marcauberer created this revision.Sep 6 2022, 8:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 6 2022, 8:34 AM
marcauberer requested review of this revision.Sep 6 2022, 8:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 6 2022, 8:34 AM

Sorry for the short/blunt comments - running out of time today!

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1403

Use m_Neg()? Also, you need to use m_Deferred() for those later m_Value()

llvm/test/Transforms/InstCombine/add_or_sub.ll
26

vector cases? multiuse cases? negative cases?

Improve, format and add tests

marcauberer marked an inline comment as done.Sep 6 2022, 11:36 AM

@RKSimon thanks for the review. I've uploaded improvements. Just one question left ...

llvm/test/Transforms/InstCombine/add_or_sub.ll
26

I just added vector and multi-use cases. How do I do negative cases?

spatel added inline comments.Sep 6 2022, 12:06 PM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1399–1402

This shows the expected set of commuted variants of the general pattern, but I don't think the match is handling all of them.

You'll want to add tests for each of these variations, and that's going to require some bonus instructions to make the patterns stick until we reach this point in visitAdd().

Here are some tests that I recently added that can serve as templates - note especially the lines with "thwart complexity-based canoinicalization":
d4a4004c0f9d3070

spatel added inline comments.Sep 6 2022, 12:12 PM
llvm/test/Transforms/InstCombine/add_or_sub.ll
26

For each positive condition in the match statement(s), create a test that does not satisfy the requirement.

So for example, we're looking for a negate (sub) of a specific operand in the 'or', then create a test with a negate of a different value:

%sub = sub i8 0, %y  <-- mismatch, so this should not transform
%or = or i8 %sub, %x
%add = add i8 %or, %x

Or maybe the 'sub' isn't a negate; it might be: (1 - x). Change the 'or' to an 'xor', etc.

Test all combinations and add negative tests

marcauberer marked an inline comment as done.Sep 6 2022, 1:46 PM

@spatel @RKSimon I now added some more extensive tests, which should cover all relevant cases.

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1399–1402

As a matter of fact, it seems like the match is covering all four cases. The first four test cases in the test ref file should cover those. I didn't even need the "thwart complexity based canonicalization instructions"

llvm/test/Transforms/InstCombine/add_or_sub.ll
26

Thanks for the explanation! I've added a few negative tests. Please let me know if something is missing.

marcauberer marked 2 inline comments as done.Sep 7 2022, 3:42 AM
spatel added inline comments.Sep 7 2022, 5:31 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1399–1402

The patch doesn't transform this example:

define i32 @add_or_sub_comb_i32_variant1(i32 %p) {
  %x = mul i32 %p, %p ; thwart complexity-based canonicalization
  %sub = sub i32 0, %x
  %or = or i32 %x, %sub
  %add = add i32 %or, %x
  ret i32 %add
}

It will be easier to see what's happening if we commit the baseline tests first (and we'll do that anyway), but you should probably create the expected tests for commuted patterns first.

Fix bug and fine-tune tests

marcauberer marked an inline comment as done.Sep 7 2022, 1:15 PM
marcauberer added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1399–1402

I think I got now what you mean. I added the "thwart complexity-based canonicalization" to every positive test and fixed the bug in the code. Could you take another look?

The baseline tests are now live here: https://reviews.llvm.org/D133449

marcauberer marked an inline comment as done.Sep 7 2022, 1:16 PM

Rebase and update test file

spatel added inline comments.Sep 8 2022, 5:38 AM
llvm/test/Transforms/InstCombine/add_or_sub.ll
80

Oops - I missed this detail in the pre-commit. Shouldn't the vector constant have zeros rather than -1 to exercise this patch?

102–103

We need to bail out on this pattern, so we don't have extra instructions (ie, this will be a negative test). See matcher for "m_OneUse()".

102–103

We need to bail out on this pattern, so we don't have extra instructions (ie, this will be a negative test). See matcher for "m_OneUse()".

102–103

Can we propagate the no-wrap flags to the new add (ie, is it correct based on Alive2 proofs)?

marcauberer added inline comments.Sep 8 2022, 8:19 AM
llvm/test/Transforms/InstCombine/add_or_sub.ll
80

Oh yes, you are right! Is a separate change required for this?

spatel added inline comments.Sep 8 2022, 8:21 AM
llvm/test/Transforms/InstCombine/add_or_sub.ll
80

No - it's minor and obvious in hindsight. Just fix it up in this patch.

Add oneUse to or inst and fix test mistake

marcauberer marked 3 inline comments as done.Sep 8 2022, 8:39 AM
marcauberer added inline comments.Sep 8 2022, 2:22 PM
llvm/test/Transforms/InstCombine/add_or_sub.ll
102–103

Do we need to propagate the no-wrap flags?

Without propagating the flags: https://alive2.llvm.org/ce/z/3rM4KF
With propagating the flags: https://alive2.llvm.org/ce/z/tWyFm7

marcauberer marked an inline comment as done.Sep 8 2022, 2:22 PM
spatel added inline comments.Sep 9 2022, 5:06 AM
llvm/test/Transforms/InstCombine/add_or_sub.ll
102–103

If we can propagate/generate flags during the transform, that is best. It saves some potentially expensive (in compile-time) analysis that tries to regenerate the flags or prevents losing information completely.

Note that "nuw" on these patterns is a special-case:
https://alive2.llvm.org/ce/z/h1763a

I have no idea if that pattern occurs in the real world, but we don't seem to get the optimal result with or without this patch.

It's a small change to the Builder.CreateAdd() call to propagate flags from "I", so we should do that. Please add tests with "nsw" alone and "nuw" alone to verify that we do the right thing in all cases.

spatel added inline comments.Sep 9 2022, 5:14 AM
llvm/test/Transforms/InstCombine/add_or_sub.ll
102–103

On 2nd thought, just add no-wrap flags to some of the adds in the existing tests, and that will more efficiently verify that we are behaving correctly. We don't really need explicit tests just to show flag propagation. You can add a code comment above the test or modify the test function name to explain the intent.

Better handling for nuw and nsw flags in tests

Code cleanup

marcauberer marked 3 inline comments as done.Sep 9 2022, 11:31 AM
marcauberer added inline comments.
llvm/test/Transforms/InstCombine/add_or_sub.ll
102–103

Done, but the positive tests including nuw behave differently, like you mentioned. Do we leave it like this?

marcauberer marked an inline comment as done.Sep 9 2022, 11:32 AM
spatel added inline comments.Sep 9 2022, 11:56 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1406

Nit: use CamelCase for variable names, so HasNuw or HasNUW (similar for the others)...or just roll those values anonymously into the CreateAdd() line and let clang-format wrap the long line as it likes.

llvm/test/Transforms/InstCombine/add_or_sub.ll
102–103

I think it's ok since we have plenty of tests covering all of the code paths. The tests with 'nuw' demonstrate that we can chain together 2 or more transforms. If we ever implement the ideal nuw-specific transform after this patch, those tests can confirm if the folds are in the right order to get the best result.

Code cleanup

marcauberer marked 2 inline comments as done.Sep 9 2022, 12:49 PM
marcauberer added inline comments.
llvm/test/Transforms/InstCombine/add_or_sub.ll
102–103

Okay, perfect.

marcauberer marked an inline comment as done.Sep 9 2022, 12:49 PM
spatel accepted this revision.Sep 9 2022, 12:52 PM

Thanks for iterating on this! LGTM

This revision is now accepted and ready to land.Sep 9 2022, 12:52 PM

Update test ref

@spatel sorry, forgot to refresh the test ref. Now the change is ready for the final review ...

This revision was automatically updated to reflect the committed changes.