This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Narrow type of logical operation chains in certain cases
Changes PlannedPublic

Authored by mnadeem on Sep 14 2021, 8:33 PM.

Details

Summary

Reassociate chains of
extend(X) | (extend(Y) | Z) --> (extend(X) | extend(Y)) | Z
which will then get optimized to --> extend(X | Y) | Z

Allows performing operations with a higher vector factor when vectorized.
https://alive2.llvm.org/ce/z/ha4JJC

Diff Detail

Event Timeline

mnadeem created this revision.Sep 14 2021, 8:33 PM
mnadeem requested review of this revision.Sep 14 2021, 8:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2021, 8:33 PM
mnadeem edited the summary of this revision. (Show Details)Sep 14 2021, 8:39 PM
mnadeem added reviewers: spatel, lebedev.ri, nikic.

Please

  1. Reduce the tests - they should only contain the minimal needed pattern - two extends and two logical ops.
  2. Add tests with extra uses - this fold can not be performed when the existing logical ops have other uses.
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1650–1653
1654–1656

You should create the inner op with Builder,
and outer BinaryOperator::Create, and just return latter.

mnadeem updated this revision to Diff 373368.Sep 17 2021, 4:43 PM
mnadeem retitled this revision from [InstCombine] Narrow type of logical operation chains where one side is an extend to [InstCombine] Narrow type of logical operation chains when possible.

Address comments.

mnadeem marked 2 inline comments as done.Sep 17 2021, 4:44 PM

Please precommit the tests

mnadeem updated this revision to Diff 373429.Sep 18 2021, 11:41 AM

rebase on precommited tests

lebedev.ri added inline comments.Sep 19 2021, 8:08 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1653

Wait, why are you checking that I, the root instruction, is a single-use?

I think I looked at this as a problem for -reassociate a long time ago. It makes changes on larger patterns like:
https://alive2.llvm.org/ce/z/UD56tJ
...but misses the shorter sequences.

(and this patch would not help match that example in -instcombine alone IIUC)

I think this patch also misses cases where we have a logic op with a wide constant:
https://alive2.llvm.org/ce/z/Ls8hXZ

Do you plan to extend it to deal with those patterns too?

mnadeem updated this revision to Diff 373769.Sep 20 2021, 7:24 PM
mnadeem marked an inline comment as done.
mnadeem retitled this revision from [InstCombine] Narrow type of logical operation chains when possible to [InstCombine] Narrow type of logical operation chains in certain cases.

remove base instruction's check for one use

mnadeem added a comment.EditedSep 20 2021, 7:25 PM

I think I looked at this as a problem for -reassociate a long time ago. It makes changes on larger patterns like:
https://alive2.llvm.org/ce/z/UD56tJ
...but misses the shorter sequences.

(and this patch would not help match that example in -instcombine alone IIUC)

I think this patch also misses cases where we have a logic op with a wide constant:
https://alive2.llvm.org/ce/z/Ls8hXZ

Do you plan to extend it to deal with those patterns too?

I shouldn't have given this revision a general title, I was only targeting a specific case i.e. extend(X) | (extend(Y) | Z)

But your first case can probably be handled in instcombine like this (I havent tested this though):

// %conv1 = sext i4 %c to i6
// %conv2 = sext i4 %d to i6
// %or1 = or i6 %conv1, %a
// %or2 = or i6 %conv2, %b
// %or3 = or i6 %or1, %or2
// to
// %conv1 = sext i4 %c to i6
// %conv2 = sext i4 %d to i6
// %or1 = or i6 %conv1, %conv2 --> will later be converted to extend (or i4 ...)
// %or2 = or i6 %a, %b
// %or3 = or i6 %or1, %or2
Instruction *InstCombinerImpl::reassosNestedCastedBitwiseLogic(BinaryOperator &I) {
  auto LogicOpc = I.getOpcode();
  assert(I.isBitwiseLogicOp() && "Unexpected opcode for bitwise logic instcombine");

  auto *Op0 = dyn_cast<BinaryOperator>(I.getOperand(0));
  auto *Op1 = dyn_cast<BinaryOperator>(I.getOperand(1));
  if (!Op0 || !Op1)
    return nullptr;
  if (Op0->getOpcode() != Op1->getOpcode() || Op0->getOpcode() != LogicOpc)
    return nullptr;
  if (match(Op0, m_OneUse(m_c_BinOp(m_CombineAnd(m_ZExtOrSExt(m_Value()), m_Value(A)), m_Value(B)))) &&
      match(Op1, m_OneUse(m_c_BinOp(m_CombineAnd(m_ZExtOrSExt(m_Value()), m_Value(C)), m_Value(D))))) {
    Value *OrWithExtends = Builder.CreateBinOp(LogicOpc, A, Z);
    Value *OrWithOutExtends = Builder.CreateBinOp(LogicOpc, B, D);
    return BinaryOperator::Create(LogicOpc, NewOp, Z);
  }
  return nullptr;
}

I shouldn't have given this revision a general title, I was only targeting a specific case i.e. extend(X) | (extend(Y) | Z)

No problem - I was just wondering if you had a more general solution in mind. We handle some of the most basic reassociation patterns in instcombine already, so this patch seems fine to me.

if (match(Op0, m_OneUse(m_c_BinOp(m_CombineAnd(m_ZExtOrSExt(m_Value()), m_Value(A)), m_Value(B)))) &&
    match(Op1, m_OneUse(m_c_BinOp(m_CombineAnd(m_ZExtOrSExt(m_Value()), m_Value(C)), m_Value(D))))) {

On the other hand, this is probably going too far. We can keep adding logic ops to the chain to move the casts further and further apart, and there's no realistic way to bring them back together in instcombine. That's the job of the -reassociate pass.

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1628–1633

It's not clear to me why this swap is necessary. Do you have a test case where a logic binop has a cast as operand 0 and a binop as operand 1? Complexity-based canonicalization is supposed to prevent that. See InstCombinerImpl::SimplifyAssociativeOrCommutative().

llvm/test/Transforms/InstCombine/and-xor-or.ll
509–510

I think we're missing tests:

  1. Negative test with different logic opcodes.
  2. Negative test with different cast opcodes.
  3. Test with different cast source types.
  4. Test with multiple uses of cast instruction(s).
  5. Tests where the first cast is operand 1 of the logic op (notice in the original tests that the operands are commuted from where they started - search around the test directory for "thwart complexity-based canonicalization" for ways to prevent that).
mnadeem planned changes to this revision.Sep 21 2021, 10:07 AM
mnadeem updated this revision to Diff 378411.Oct 8 2021, 9:49 PM

Added more tests as per comments.

mnadeem marked an inline comment as done.Oct 8 2021, 9:59 PM
mnadeem added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1628–1633

I think you have it the other way around, the code currently only checks for the cast to be on the LHS.
This swap handles the case when the Op0 (higher complexity) is another binop.

spatel added inline comments.Oct 13 2021, 9:10 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1628–1633

Ah, I see.

I wonder if it would be easier to read if we just match this in its "natural" form then:
(extend(X) | Y) | extend(Z) --> (extend(X) | extend(Z)) | Y

You would hoist this code above the current bailout for !Cast0 if you did it that way.

llvm/test/Transforms/InstCombine/and-xor-or.ll
509–510

I don't see an example of "3" here yet. I'm imagining something like this:

define i64 @f(i64 %a, i8 %b, i16 %c) {
  %conv = zext i8 %b to i64
  %conv2 = sext i16 %c to i64
  %xor = xor i64 %conv, %a
  %xor2 = xor i64 %xor, %conv2
  ret i64 %xor2
}
533

This test doesn't add much value. In InstCombine, I don't think we ever care if the instruction that is being replaced has multiple uses.

551–552

Similar to test comment above: this does not add value vs. the previous test if it is just an extra use of the final value.

573–575

That seems like a reasonable canonicalization (and the 'and' test below shows a potential win)...
But we should note that this is intentional in a code comment.

But what happens if "%conv" is trunc or fptoui or some other non-ext cast?

Please add test(s) with those patterns.

mnadeem updated this revision to Diff 382881.Oct 27 2021, 6:54 PM
mnadeem updated this revision to Diff 382886.Oct 27 2021, 7:20 PM
mnadeem marked 2 inline comments as done.
mnadeem added inline comments.
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1628–1633

Relying on the exit below to keep things simple:

if (!SrcTy->isIntOrIntVectorTy())
  return nullptr;

I can hoist the code but then I would have to add tests for floating point conversions.

llvm/test/Transforms/InstCombine/and-xor-or.ll
509–510

added zext_xor_chain_diffSrcTy above it is similar to your snippet but with both zext casts.

533

Removed the test

551–552

Removed the test

573–575

Added the comments in one of the other tests above and also in the c code.

But what happens if "%conv" is trunc or fptoui or some other non-ext cast?

added zext_trunc_and_chain etc at the bottom. Is this what you meant?

spatel added inline comments.Oct 29 2021, 8:02 AM
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1646

This code comment is not accurate - we allow non-extend casts for one of the operands. See comment on the tests below.

llvm/test/Transforms/InstCombine/and-xor-or.ll
573–575

Sure, that is one example of different cast/type.

But there's an inconsistency here. I don't think you intended it, and I don't think we want it: why are we reassociating any cast as long as the source type is an int or int vector, but not allowing FP casts/types?

It's fine if this patch only deals with pairs of extends or it allows any cast ops, but we're in some in-between state currently if I'm seeing it correctly. We want to avoid an arbitrary line in canonicalization logic.

Please add these tests:

define i32 @zext_bitcast_int_vec_and_chain(i32 %a, i16 %b, <2 x i16> %c) {
  %conv = zext i16 %b to i32
  %conv2 = bitcast <2 x i16> %c to i32
  %and = and i32 %conv, %a
  %and2 = and i32 %and, %conv2
  ret i32 %and2
}

define i32 @zext_bitcast_fp_vec_and_chain(i32 %a, i16 %b, <2 x half> %c) {
  %conv = zext i16 %b to i32
  %conv2 = bitcast <2 x half> %c to i32
  %and = and i32 %conv, %a
  %and2 = and i32 %and, %conv2
  ret i32 %and2
}
lebedev.ri resigned from this revision.Jan 12 2023, 5:24 PM

This review seems to be stuck/dead, consider abandoning if no longer relevant.

Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 5:24 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
mnadeem planned changes to this revision.Feb 27 2023, 11:55 AM

Removing from reviewer's ready to review list for now. Will come back to this patch when/if time permits.