This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Fold (x & ~y) | y patterns
ClosedPublic

Authored by nikic on Mar 9 2019, 2:28 AM.

Details

Summary

This is an alternative to D59066. We introduce DAGCombine patterns for (x & ~y) | y to x | y.

This allows us to lower the expansion of vselect c, x, -1, which is (x & ~c) | (-1 & c) to just x | y, recovering the uaddsat results from D59066.

Diff Detail

Repository
rL LLVM

Event Timeline

nikic created this revision.Mar 9 2019, 2:28 AM
nikic marked 2 inline comments as done.Mar 9 2019, 3:32 AM
nikic added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6927 ↗(On Diff #189984)

An alternative and probably better solution would be to delay BIC formation until after legalization (at least vec op legalization), so there is more opportunity for generic combines to apply first. Is there a way to know in here which legalization stage we are currently in and would it be okay to check it?

llvm/test/CodeGen/AArch64/sat-add.ll
607 ↗(On Diff #189984)

There's still some redundant bic's left here. The reason is that isBitwiseNot (and the whole isConstOrConstSplat functionality) does not support build vector constants. Handling this in isBitwiseNot is possible (and apparently only affects this test across all targets), but I think the more robust solution would be to migrate the whole isConstOrConstSplat functionality to return APInt and support implicit build vector truncation.

nikic marked 2 inline comments as done.Mar 10 2019, 3:07 AM
nikic added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
6927 ↗(On Diff #189984)

I've opened D59187 for this variant.

llvm/test/CodeGen/AArch64/sat-add.ll
607 ↗(On Diff #189984)

I've opened https://bugs.llvm.org/show_bug.cgi?id=41020 to keep track of this.

These seem to be managing special cases of the bit select pattern OR(AND(X,Y),AND(Z,~Y)) - would we benefit from adding a helper function to detect these, similar to what we have for isBitwiseNot ? Although in this case we'd have to generate the -1 constant.

nikic added a comment.Mar 10 2019, 6:50 AM

These seem to be managing special cases of the bit select pattern OR(AND(X,Y),AND(Z,~Y)) - would we benefit from adding a helper function to detect these, similar to what we have for isBitwiseNot ? Although in this case we'd have to generate the -1 constant.

Don't think a helper would be useful here, as this is specifically looking for degenerate forms of the bit select pattern where one of the ands has been folded away.

RKSimon edited reviewers, added: efriedma; removed: eli.friedman.Mar 13 2019, 2:01 PM
nikic updated this revision to Diff 190898.Mar 15 2019, 2:22 PM
nikic retitled this revision from [DAGCombine][AArch64] Fold (x & ~y) | y patterns to [DAGCombine] Fold (x & ~y) | y patterns.
nikic edited the summary of this revision. (Show Details)

Remove AArch64 specific portion, which has been handled by rL356299.

These seem to be managing special cases of the bit select pattern OR(AND(X,Y),AND(Z,~Y)) - would we benefit from adding a helper function to detect these, similar to what we have for isBitwiseNot ? Although in this case we'd have to generate the -1 constant.

I'm not sure why/how/for what it would be useful?

This looks good, but is partial, nit added.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5429 ↗(On Diff #190898)

There are 4 variations of this pattern, both or and and are commutative.

nikic updated this revision to Diff 191012.Mar 17 2019, 3:58 AM
nikic marked an inline comment as done.

Handle more commuted variants.

nikic marked 2 inline comments as done.Mar 17 2019, 4:08 AM
nikic added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5429 ↗(On Diff #190898)

Right you are. I've added these now, which resulted in a few more changes to the variablemask tests. I was under the mistaken impression that (and X, (xor Y, -1)) is a canonicalized and-not pattern.

llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask-const.ll
344 ↗(On Diff #191012)

Looks like this doesn't fold for the SSE1 case, because FXOR and friends are created via custom DAG combine very early on.

lebedev.ri accepted this revision.Mar 17 2019, 5:39 AM

Thanks, looks ok, other than repetition..

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5429 ↗(On Diff #190898)

That is indeed a very subtle (and costly!) thing.
(and X, (xor Y, -1)) will be in that order (well, some deterministic order).
But what if X and Y aren't arguments, but some values (instructions)?
Then they will also affect the weights and thus ordering.

5429 ↗(On Diff #190898)

Hmm, nobody likes that there is no pattern matching in back end :/
But this duplication isn't too nice too.
Can you try this instead:

// Fold all 4 commutative variations of    or (and (xor Y, -1), X), Y
// to                                      or X, Y
static SDValue foldDegradedBitselect(SDNode *N, SelectionDAG &DAG) {
  assert(N->getOpcode() == ISD::OR && "outermost node should be 'or'.");

  SDValue X, Y;
  auto outerMatcher = [N, &X, &Y](bool swap) {
    SDValue And = N->getOperand(0);
    Y = N->getOperand(1);

    if (And.getOpcode() != ISD::AND || swap)
      std::swap(And, Y);
    if (And.getOpcode() != ISD::AND)
      return false;

    auto innerMatcher = [And, Y, &X](bool swap) {
      assert(And.getOpcode() == ISD::AND);

      SDValue NotY = And.getOperand(0);
      X = And.getOperand(1);

      if (!isBitwiseNot(NotY) || swap)
        std::swap(NotY, X);
      if (!isBitwiseNot(NotY))
        return false;

      SDValue Y2 = NotY.getOperand(0);

      return Y == Y2;
    };

    return innerMatcher(/*swap=*/false) || innerMatcher(/*swap=*/true);
  };

  if (!(outerMatcher(/*swap=*/false) || outerMatcher(/*swap=*/true)))
    return SDValue();

  return DAG.getNode(ISD::OR, SDLoc(Y), Y.getValueType(), X, Y);
}

?

If that doesn't work, at least factor out these 4 duplicate if's into a function.

llvm/test/CodeGen/AArch64/unfold-masked-merge-scalar-variablemask.ll
358 ↗(On Diff #191012)

(good thing i wrote all these then-pointless redundant tests i guess)

This revision is now accepted and ready to land.Mar 17 2019, 5:39 AM
nikic updated this revision to Diff 191017.Mar 17 2019, 6:30 AM

Add visitORCommutative() function to reduce code duplication when matching commuting patterns.

nikic marked an inline comment as done.Mar 17 2019, 6:37 AM
nikic added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5429 ↗(On Diff #190898)

I've extracted a function that will be called with N0, N1 and N1, N0 to avoid repeating the outer pattern, and because this seems useful for potential future transforms as well.

This still repeats the and commutation, but I think the explicit repetition here is more obvious than the operand swapping variant.

lebedev.ri accepted this revision.Mar 17 2019, 6:44 AM
lebedev.ri added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5429 ↗(On Diff #190898)

Ok, in this case i guess that works too.

lebedev.ri added inline comments.Mar 17 2019, 6:49 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5418–5419 ↗(On Diff #191017)

Actually, there already is this function, but it is only called in one order.
I suppose a follow-up to move contents of visitORCommutative() into visitORLike(),
and calling visitORLike() second time may be a good idea.

This revision was automatically updated to reflect the committed changes.
nikic marked an inline comment as done.Mar 17 2019, 8:54 AM
nikic added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5418–5419 ↗(On Diff #191017)

The existing transforms inside visitORLike wouldn't benefit from being called with commuted arguments (apart from the very first one maybe), as they check for symmetric patterns. Looking at other folds in visitOR most everything either has one constant operand (which is always on the RHS) or deals with symmetric patterns, or is a very simple fold that probably should stay at the start. I think that MatchRotate could in principle be refactored to be called twice with commuted arguments, but it would be larger change to the logic.