This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Try harder to simplify ~(X & Y) -> ~X | ~Y and ~(X | Y) -> ~X & ~Y when X and Y have more than one uses.
AbandonedPublic

Authored by bmakam on Feb 8 2016, 4:39 AM.

Details

Summary

This is a fix for PR26083:
My first attempt to fix this was to skip CSE of compares that were input to idempotent instructions. But, this could leave redundant code and might inhibit inliner from inlining if the IR is bloated. This version enhances the instcombine to look for users where we can elide xor and clone those users if profitable.

Diff Detail

Event Timeline

bmakam updated this revision to Diff 47173.Feb 8 2016, 4:39 AM
bmakam retitled this revision from to [EalryCSE] Ignore conditional idempotent instructions.
bmakam updated this object.
bmakam added a reviewer: mcrosier.
bmakam changed the visibility from "Public (No Login Required)" to "bmakam (Balaram Makam)".
bmakam changed the visibility from "bmakam (Balaram Makam)" to "Public (No Login Required)".
mssimpso edited edge metadata.Feb 8 2016, 7:03 AM

This is a very basic question. But have you thought about enhancing instcombine to catch this case rather than preventing early-cse from optimizing it? It seems like the case you're aiming to prevent could exist without early-cse having created it.

bmakam added a comment.Feb 8 2016, 8:32 AM

This is a very basic question. But have you thought about enhancing instcombine to catch this case rather than preventing early-cse from optimizing it? It seems like the case you're aiming to prevent could exist without early-cse having created it.

Good question. I initially started off implementing an enhancement to instcombine. Specifically, whenever we see a candidate that can be simplified using DeMorgan's Law we currently check if the predicate has more than one use because we would want to avoid inverting the predicate for other users and cause a miscompile. I began implementing an enhancement which checked if all the uses of this predicate can be inverted but this requires walking the use-def chain and seemed very inefficient to catch this corner case. I also considered if we can move the Demorgan Laws simplification to instsimplify but that requires mutation of CFG so that is not possible. Another solution that I thought of in parallel to this solution is to clone the compare instruction during instcombine while applying DeMorgan's Law. However, this will be a performance neutral simplification in most cases as we will be trading an icmp for xor. So, I came up with this solution first to see if it fires for any of Spec2k6.

mcrosier retitled this revision from [EalryCSE] Ignore conditional idempotent instructions to [EarlyCSE] Ignore conditional idempotent instructions.Feb 10 2016, 11:25 AM
mcrosier edited edge metadata.
bmakam updated this revision to Diff 48089.Feb 16 2016, 12:00 PM
bmakam retitled this revision from [EarlyCSE] Ignore conditional idempotent instructions to [InstCombine] Try harder to simplify ~(X & Y) -> ~X | ~Y and ~(X | Y) -> ~X & ~Y when X and Y have more than one uses..
bmakam updated this object.
bmakam added a subscriber: llvm-commits.
majnemer added inline comments.Feb 17 2016, 8:39 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1246

Functions should start with a lower case letter.

1247

Sentances start with an upper case letter.

1251–1253

Why?

1277–1285

Seems like we might do an awfully large amount of work. How large does Users typically get when this xform is profitable?

1289

What stops Users.size() from equaling zero?

1291

You are using cast_or_null but indiscriminately dereference the result, this seems wrong.

test/Transforms/InstCombine/not2.ll
2

This is too broad, please CHECK your tests more thoroughly.

bmakam added inline comments.Mar 9 2016, 7:28 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1277–1285

Sorry for late response. This was added to avoid unnecessarily commutating the operands of AND instruction when we could not elide a xor because it resulted in suboptimal codegen due to another issue that I am addressing in D17942. This is still on my plate and I plan to address the comments here once D17942 is landed.

bmakam updated this revision to Diff 50476.Mar 11 2016, 2:19 PM

Refactored code addressing some review comments now that D17942 is landed.

bmakam marked 3 inline comments as done.Mar 11 2016, 2:25 PM
bmakam added inline comments.
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1245–1301

Only supporting for non vector types for now as the codegen for vector types is suboptimal.

1245–1301

I have refactored the code after landing D17942. To reduce the amount of work I am now checking only local users.

1245–1301

Refactored code so that Users will always contain instructions.

test/Transforms/InstCombine/not2.ll
4

There is nothing more to test in these testcases other than checking if xor has been elided similar to the tests in test/Transforms/InstCombine/not.ll

mcrosier added inline comments.Mar 14 2016, 9:57 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1257

Perhaps you could move the isa<BinaryOperator> check into a separate if statement. IMO, it doesn't go with the comment above.

1262

This can be a cast<> rather than a dyn_cast<> because you've already checked we have a BO.

Alternatively, you could do the following to address the previous comment:

BinaryOperator *BO = dyn_cast<BinaryOperator>(V);
if (!BO)
  return false;

This will also decrease the indention.

1277

Is VI loop invariant? If so, please hoist this out of the loop. Also, I believe we know V is either a cmp or BO, so we can use a cast<> rather than a dyn_cast<>?

mcrosier added inline comments.Mar 14 2016, 9:58 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1257

Ignore this comment.. :)

1262

Ignore this one too..

bmakam updated this revision to Diff 50616.Mar 14 2016, 11:02 AM

Address Chad's comments.

bmakam marked an inline comment as done.Mar 14 2016, 11:03 AM
majnemer added inline comments.Mar 14 2016, 11:15 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1287–1288

Can't this be simplified to a range-based for loop?

bmakam updated this revision to Diff 50629.Mar 14 2016, 12:43 PM

Address David's comments.

bmakam marked an inline comment as done.Mar 14 2016, 12:43 PM
bmakam added inline comments.
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1287–1288

Thanks David, I have simplified it to a range-based for loop.

bmakam marked an inline comment as done.Mar 14 2016, 12:45 PM
bmakam updated this revision to Diff 50636.Mar 14 2016, 1:32 PM

Minor fix to limit patch such that BinaryOp must be Add/Sub.

bmakam updated this revision to Diff 50638.Mar 14 2016, 1:36 PM

another attempt to minor fix, sorry.

mcrosier added inline comments.Mar 14 2016, 1:46 PM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1263

This might read a bit better with something like:

if (BinaryOperator *BO = dyn_cast<BinaryOperator>(V)) {
  // Must be an 'add' or a 'sub'.
  unsigned Opc = BO->getOpcode();
  if (Opc != Instruction::Add && Opc != Instruction::Sub)
    return false;
  // At least one operand must be a constant.
  if (!isa<Constant>(BO->getOperand(0)) &&
      !isa<Constant>(BO->getOperand(1)))
      return false;
}

Also, is it not the case that constants are always canonicalized to the RHS, which would mean the check on operands 0 is unnecessary?

bmakam updated this revision to Diff 50648.Mar 14 2016, 2:26 PM

Update with Chad's suggestion.

bmakam marked an inline comment as done.Mar 14 2016, 2:28 PM
bmakam added inline comments.
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1263

IIRC, instcombine is still early to assume that that constants are always canonicalized to the RHS.

mcrosier added inline comments.Mar 15 2016, 6:22 AM
lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1263

Okay, I just wanted to check. FWIW, the reassocation pass is responsible for the canonicalization.

gberry resigned from this revision.May 23 2016, 1:46 PM
gberry removed a reviewer: gberry.

Removing myself as others seem to be handling review

mcrosier resigned from this revision.Jul 5 2016, 10:40 AM
mcrosier removed a reviewer: mcrosier.

Just getting this off my queue. If you decide to pursue this further, please add me back as a reviewer, Balaram.

mcrosier removed a subscriber: mcrosier.Jul 5 2016, 10:40 AM
bmakam abandoned this revision.Jul 5 2016, 11:02 AM
bmakam marked an inline comment as done.

Performance data was not promising, so abandoning this change.