This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fold `(a & ~b) & ~c` to `a & ~(b | c)`
ClosedPublic

Authored by rampitec on Oct 19 2021, 4:22 PM.

Details

Summary
%not1 = xor i32 %b, -1
%not2 = xor i32 %c, -1
%and1 = and i32 %a, %not1
%and2 = and i32 %and1, %not2

>

%i1 = or i32 %b, %c
%i2 = xor i32 %1, -1
%and2 = and i32 %i2, %a

Diff Detail

Event Timeline

rampitec created this revision.Oct 19 2021, 4:22 PM
rampitec requested review of this revision.Oct 19 2021, 4:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 19 2021, 4:22 PM
Herald added a subscriber: wdng. · View Herald Transcript
xbolva00 added inline comments.
llvm/test/Transforms/InstCombine/and-xor-or.ll
566

Atleast one vector test please

rampitec updated this revision to Diff 380823.Oct 19 2021, 4:46 PM
rampitec marked an inline comment as done.

Added vector test.

Please pre-commit the tests with current output, so we can confirm that we are testing all of the commuted patterns correctly.

This seems like another short-coming of the reassociation pass, but I think it's ok to deal with the minimal case here.
For example (if I'm seeing it correctly), this test still won't change:

define i4 @src(i4 %a, i4 %b, i4 %c, i4 %d) {
  %notb = xor i4 %b, -1
  %notc = xor i4 %c, -1
  %and1 = and i4 %a, %notb
  %and2 = and i4 %and1, %d
  %and3 = and i4 %and2, %notc
  ret i4 %and3
}

https://alive2.llvm.org/ce/z/_T8ZhP

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2024–2025

I suspect this is dead code (so it can be removed) - complexity-based canonicalization should guarantee that an and is always op0 in this pattern.

llvm/test/Transforms/InstCombine/and-xor-or.ll
531

'not' is considered more complex than an argument, so this is not testing the pattern that you wanted to test.
See InstCombiner::getComplexity() for the details. Search for "thwart complexity-based canonicalization" in this test directory for test coverage that works around it.

578

Similar to above test comment - this is probably getting altered before we get to the new code in this patch, so it doesn't test what you expected.

rampitec updated this revision to Diff 381034.Oct 20 2021, 11:17 AM
rampitec marked 3 inline comments as done.

Addressed review comments.
Pre-commited test and rebased.

This seems like another short-coming of the reassociation pass, but I think it's ok to deal with the minimal case here.
For example (if I'm seeing it correctly), this test still won't change:

define i4 @src(i4 %a, i4 %b, i4 %c, i4 %d) {
  %notb = xor i4 %b, -1
  %notc = xor i4 %c, -1
  %and1 = and i4 %a, %notb
  %and2 = and i4 %and1, %d
  %and3 = and i4 %and2, %notc
  ret i4 %and3
}

https://alive2.llvm.org/ce/z/_T8ZhP

Yes, this test does not change.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2024–2025

That seems to be right. Removed.

spatel added inline comments.Oct 20 2021, 11:53 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2018–2020

You can fold the one-use check into the 1st match with m_OneUse(). Could also use && for the match clauses instead of nested ifs to make the structure more like the code above here.

llvm/test/Transforms/InstCombine/and-xor-or.ll
556

In this test, we do not want to thwart complexity-based canonicalization - we want %a to remain as op1, so you don't need the extra sdiv.

571–572

This one is going to be commuted no matter what we do. It's fine to include this for completeness, but let's have it show another variation - add an extra use for %not2 with something like:

call void @use(i32 %not2)

Similarly, we should have a negative test for the transform when %and1 has another use.

rampitec updated this revision to Diff 381060.Oct 20 2021, 12:31 PM
rampitec marked 3 inline comments as done.

Addressed review comments.

spatel accepted this revision.EditedOct 20 2021, 12:47 PM

LGTM - I'd prefer that the NFC test changes get committed first followed by the code patch and its test changes. That way, we'll still have the expected tests/results in place even if the patch gets reverted for some reason.

This revision is now accepted and ready to land.Oct 20 2021, 12:47 PM
rampitec updated this revision to Diff 381073.Oct 20 2021, 12:53 PM

Rebased on pre-comitted test update.

This revision was landed with ongoing or failed builds.Oct 20 2021, 1:06 PM
This revision was automatically updated to reflect the committed changes.

This seems like another short-coming of the reassociation pass, but I think it's ok to deal with the minimal case here.
For example (if I'm seeing it correctly), this test still won't change:

define i4 @src(i4 %a, i4 %b, i4 %c, i4 %d) {
  %notb = xor i4 %b, -1
  %notc = xor i4 %c, -1
  %and1 = and i4 %a, %notb
  %and2 = and i4 %and1, %d
  %and3 = and i4 %and2, %notc
  ret i4 %and3
}

https://alive2.llvm.org/ce/z/_T8ZhP

Yes, this test does not change.

Actually it is transformed too. It does not with opt -instcombine but does with opt -reassociate -instcombine or just opt -O3.

This seems like another short-coming of the reassociation pass, but I think it's ok to deal with the minimal case here.
For example (if I'm seeing it correctly), this test still won't change:

define i4 @src(i4 %a, i4 %b, i4 %c, i4 %d) {
  %notb = xor i4 %b, -1
  %notc = xor i4 %c, -1
  %and1 = and i4 %a, %notb
  %and2 = and i4 %and1, %d
  %and3 = and i4 %and2, %notc
  ret i4 %and3
}

https://alive2.llvm.org/ce/z/_T8ZhP

Yes, this test does not change.

Actually it is transformed too. It does not with opt -instcombine but does with opt -reassociate -instcombine or just opt -O3.

Thanks, that's good news. If you want to make sure that combination of transforms doesn't break invisibly, you could add a test like that for -O{1,2,3} runs to /test/Transforms/PhaseOrdering..

I forgot to mention it during the review, but we should always have these kinds of bitwise logic folds apply to the Demorgan'ized sibling form too (and this fold itself is just reassociation + Demorgan):
3888de9507c7

I forgot to mention it during the review, but we should always have these kinds of bitwise logic folds apply to the Demorgan'ized sibling form too (and this fold itself is just reassociation + Demorgan):
3888de9507c7

Thanks Sanjay! I now wander if a next testcase I am looking at needs a pattern as I wanted to add to visitOr, because it also looks like a more complex case of reassociation:

(c & ~(a | b)) | (b & ~(a | c)) --> ~a & (b ^ c)

We currently cannot simplify it.

define i32 @or_not_and(i32 %a, i32 %b, i32 %c) {
  %or1 = or i32 %a, %b
  %not1 = xor i32 %or1, -1
  %and1 = and i32 %not1, %c
  %or2 = or i32 %a, %c
  %not2 = xor i32 %or2, -1
  %and2 = and i32 %not2, %b
  %or3 = or i32 %and1, %and2
  ret i32 %or3
}

Actually it is transformed too. It does not with opt -instcombine but does with opt -reassociate -instcombine or just opt -O3.

Thanks, that's good news. If you want to make sure that combination of transforms doesn't break invisibly, you could add a test like that for -O{1,2,3} runs to /test/Transforms/PhaseOrdering..

D112258

(c & ~(a | b)) | (b & ~(a | c)) --> ~a & (b ^ c)

We currently cannot simplify it.

define i32 @or_not_and(i32 %a, i32 %b, i32 %c) {
  %or1 = or i32 %a, %b
  %not1 = xor i32 %or1, -1
  %and1 = and i32 %not1, %c
  %or2 = or i32 %a, %c
  %not2 = xor i32 %or2, -1
  %and2 = and i32 %not2, %b
  %or3 = or i32 %and1, %and2
  ret i32 %or3
}

Hmm...so that's 2 of the patterns in this patch glued together. I don't see an intermediate fold that we can use to reduce it. Either you have to hard-code a big pattern match or add a generalized boolean logic solver (as its own pass).

(c & ~(a | b)) | (b & ~(a | c)) --> ~a & (b ^ c)

We currently cannot simplify it.

define i32 @or_not_and(i32 %a, i32 %b, i32 %c) {
  %or1 = or i32 %a, %b
  %not1 = xor i32 %or1, -1
  %and1 = and i32 %not1, %c
  %or2 = or i32 %a, %c
  %not2 = xor i32 %or2, -1
  %and2 = and i32 %not2, %b
  %or3 = or i32 %and1, %and2
  ret i32 %or3
}

Hmm...so that's 2 of the patterns in this patch glued together. I don't see an intermediate fold that we can use to reduce it. Either you have to hard-code a big pattern match or add a generalized boolean logic solver (as its own pass).

Having a separate boolean logic solver might be a good idea in a long run. I.e. collect operands unused ouside of a dag solved, build truth table, generate a minimal expression. But that probably needs much more thought than this initial idea.