This is an archive of the discontinued LLVM Phabricator instance.

And for the record: It's easy enough to prove this correct with value tables and alive. Though I can't currently figure out a way to deduce this formula myself so there may be generalizations or underlying principles that I am missing which would produce more instcombine patterns...
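The value-table argument can be sketched in a few lines of C++. This is only an illustration, not part of the patch: the function names are made up here, and the check relies on both forms operating on each bit independently, so the 8 single-bit combinations cover every input.

```cpp
#include <cassert>
#include <cstdint>

// Masked merge, and/or form: bits of b where a is 1, bits of c where a is 0.
constexpr uint32_t merge_and_or(uint32_t a, uint32_t b, uint32_t c) {
  return (a & b) | (~a & c);
}

// Masked merge, xor form (the right-hand side of the proposed fold,
// ignoring the freeze, which has no counterpart in plain C++).
constexpr uint32_t merge_xor(uint32_t a, uint32_t b, uint32_t c) {
  return ((b ^ c) & a) ^ c;
}

// Walks the full single-bit value table and asserts that bit 0 of both
// forms agrees for every combination of a, b and c.
inline void check_all_bit_patterns() {
  for (uint32_t a = 0; a < 2; ++a)
    for (uint32_t b = 0; b < 2; ++b)
      for (uint32_t c = 0; c < 2; ++c)
        assert((merge_and_or(a, b, c) & 1u) == (merge_xor(a, b, c) & 1u));
}
```

Calling check_all_bit_patterns() exercises all 8 rows of the table; alive2 remains the tool to use for the IR-level proof, since it also models undef and poison.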

Harbormaster completed remote builds in B122393: Diff 370398. Sep 2 2021, 2:47 PM

Seems this is called masked merge and there is prior discussion/code:

Was this combine here missed or is it somehow not good? It does make things slightly faster in my code...

So @lebedev.ri if I did my history research correctly, then the pattern in my patch here has also been part of https://reviews.llvm.org/D46814. That diff however was reverted in the meantime (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180528/556578.html @echristo) because of undef behavior? My version here adds a freeze instruction (thanks to alive2.llvm.org for catching this or I would have missed that subtlety too!).
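To illustrate why the freeze matters, here is a rough C++ model, under the assumption that an undef value may be observed as a different value at each use. The xor form reads c twice, so the two reads are modeled as separate parameters; all names are hypothetical and this is not how LLVM represents undef internally.

```cpp
#include <cassert>
#include <cstdint>

// Original and/or form: c is read exactly once.
inline uint32_t merge_and_or(uint32_t a, uint32_t b, uint32_t c) {
  return (a & b) | (~a & c);
}

// Xor form without freeze. c_first and c_second model the two reads of an
// undef c, which are allowed to observe different values.
inline uint32_t merge_xor_unfrozen(uint32_t a, uint32_t b, uint32_t c_first,
                                   uint32_t c_second) {
  return ((b ^ c_first) & a) ^ c_second;
}

// Xor form with freeze: one arbitrary but fixed value feeds both reads.
inline uint32_t merge_xor_frozen(uint32_t a, uint32_t b, uint32_t c_frozen) {
  return merge_xor_unfrozen(a, b, c_frozen, c_frozen);
}
```

With an all-ones mask the original form returns b no matter what c is. The unfrozen xor form can return b ^ c_first ^ c_second instead, a value the original could never produce, while the frozen variant always returns b.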

So maybe this is ok now?

This is an alternative bit select pattern - do we know if many backends match this pattern as well as the more common (a & b) | (~a & c) to their bit select instructions (bsl/pcmov/etc.)?

The xor pattern is better on SSE for cases where b and c are both constants, plus it is entirely commutable.

• hafixo added a commit: rCRT373035: hwasan: Compatibility fixes for short granules. Sep 6 2021, 12:44 AM

• hafixo added a commit: rGc336557f0238: hwasan: Compatibility fixes for short granules. Sep 6 2021, 12:47 AM

thopre removed a commit: rGc336557f0238: hwasan: Compatibility fixes for short granules. Sep 7 2021, 2:47 AM

thopre removed a commit: rCRT373035: hwasan: Compatibility fixes for short granules. Sep 7 2021, 2:51 AM

do we know if many backends match this pattern as well as the more common (a & b) | (~a & c) to their bit select instructions (bsl/pcmov/etc.)?

I tried llc with the following test:

define i32 @variant0(i32, i32, i32) {
  %4 = and i32 %1, %0
  %5 = xor i32 %0, -1
  %6 = and i32 %5, %2
  %7 = or i32 %6, %4
  ret i32 %7
}

define i32 @variant1(i32, i32, i32) {
  %4 = xor i32 %2, %1
  %5 = and i32 %4, %0
  %6 = xor i32 %5, %2
  ret i32 %6
}

(generic) X86: variant0 produces "mov; and; not; and; or"; variant1 produces "mov; xor; and; xor"
X86 -mattr=+bmi: variant0 and variant1 produce "and, andn, orl"
AArch64: variant0 and variant1 produce "bic, and, orr"
ARM: variant0 produces "and; bic; orr"; variant1 produces "eor; and; eor"
ppc32: variant0/variant1 produce the same code: "and; andc; or"
ppc64: variant0 produces "and; xori; xoris; and; or"?!? variant1 produces "andc; and; or" which is clearly better
riscv32/riscv64: variant0 produces "and; not; and; or"; variant1 produces "xor; and; xor"
s390x: variant0 produces "nr; xilf; nr; or"; variant1 produces "xr; nr; xr"

-> For targets without an "and-not" instruction variant1 seems better and this patch will help produce better code there. Most targets with an "and-not" use it either way, except for ARM which fails to recognize the xor variant and PPC64 which fails to recognize the and-not variant...

Also a goal for LLVM-IR is normalizing the program and this patch will help with that.

It looks like we have a reverse fold to this in DAG: DAGCombiner::unfoldMaskedMerge

Also, InstCombine has a fold for: (X&~Z)|(Y&Z) -> select(X,Y) but it doesn't handle the XOR variant: https://simd.godbolt.org/z/sWM3KxbrE

It looks like we have a reverse fold to this in DAG: DAGCombiner::unfoldMaskedMerge

Noticed that too. It depends on TargetLowering::hasAndNot() returning true.

I tried implementing hasAndNot() in the ARM and SystemZ backends, which lacked this function. That does eliminate the problems with this patch, but unfortunately there are ripple effects in other areas and I am not sure how deep I want to dig (patterns like select(x < 0, 0, x) are DAGCombined into (x & ~(x >>s 31)), making the targets fail to select specialized saturation instructions...). Though I could probably refactor things so that the masked merge can be unfolded independently of the and-not zero saturation.
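For reference, the and-not zero saturation mentioned above can be written out in C++ as follows. This is a sketch with made-up function names; the sign mask is built with unsigned arithmetic so the code does not depend on how a signed right shift behaves.

```cpp
#include <cassert>
#include <cstdint>

// select(x < 0, 0, x): clamp negative values to zero.
inline int32_t clamp_select(int32_t x) { return x < 0 ? 0 : x; }

// x & ~(x >>s 31): the same clamp expressed with an and-not.
inline int32_t clamp_and_not(int32_t x) {
  const uint32_t ux = static_cast<uint32_t>(x);
  // All ones when the sign bit is set, zero otherwise. This is the value an
  // arithmetic shift right by 31 would produce.
  const uint32_t sign_mask = 0u - (ux >> 31);
  return static_cast<int32_t>(ux & ~sign_mask);
}
```

Both functions agree for every 32-bit input. A target with a dedicated saturation instruction would rather see the select form, which is the ripple effect described above.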

MatzeB mentioned this in D109850: Implement SystemZIselLowering::hasAndNot. Sep 15 2021, 1:36 PM

Dropped masked-merge commutative tests which no longer apply.
Put on top of patch improving SystemZ codegen to avoid codegen regression there.

MatzeB retitled this revision from [InstCombine] Optimize (a & b) | (~a & c) to [InstCombine] Optimize (a & b) | (~a & c) (canonicalize masked merge). Sep 15 2021, 1:41 PM

MatzeB retitled this revision from [InstCombine] Optimize (a & b) | (~a & c) (canonicalize masked merge) to [InstCombine] Canonicalize masked merge; optimize (a & b) | (~a & c).

MatzeB added a parent revision: D109850: Implement SystemZIselLowering::hasAndNot.

I filed https://bugs.llvm.org/show_bug.cgi?id=51876 about the different ARM codegen. I believe we can land this change anyway, since ARM codegen produces the same number of instructions with any masked-merge pattern.

Harbormaster completed remote builds in B124079: Diff 372789. Sep 15 2021, 2:04 PM

I don't think this makes sense as an InstCombine (middle end) transform. The resulting expression is less analyzable both in that freeze is a complete analysis blocker, and xor is generally less analyzable than and/or. This looks like something we should be doing in the backend instead.

I don't think this makes sense as an InstCombine (middle end) transform. The resulting expression is less analyzable both in that freeze is a complete analysis blocker, and xor is generally less analyzable than and/or. This looks like something we should be doing in the backend instead.

It's easy enough to move this pattern to SelectionDAG.

Though on a general level, my philosophy was always that if you have 2 equivalent expressions then it's best to canonicalize towards one of them in InstCombine. Not sure if freeze is a good enough reason to not canonicalize...

MatzeB added a reviewer: nlopes. Sep 15 2021, 6:30 PM

What about canonicalizing to (a & b) | (~a & c)?

In D109194#3030775, @RKSimon wrote:

What about canonicalizing to (a & b) | (~a & c)?

That would also work well. But it will require a freeze just the same as we go from one use of a to two.

And FWIW: My gut feeling is that a canonicalization is worth having an extra freeze around...
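The "one use of a to two" point can be modeled the same way as for c in the other direction. The sketch below assumes that each read of an undef mask may see a different value; the names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Xor form: the mask a is read once.
inline uint32_t merge_xor(uint32_t a, uint32_t b, uint32_t c) {
  return ((b ^ c) & a) ^ c;
}

// and/or form without freeze. a_first and a_second model two reads of an
// undef mask that are allowed to disagree.
inline uint32_t merge_and_or_unfrozen(uint32_t a_first, uint32_t a_second,
                                      uint32_t b, uint32_t c) {
  return (a_first & b) | (~a_second & c);
}
```

When b and c are equal, the xor form returns that common value for every mask. The unfrozen and/or form can return 0 when the first read of the mask is all zeros and the second is all ones, so canonicalizing in this direction needs a freeze on a as well.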

Abandoning in favor of a SelectionDAG solution in D112754

Herald added a subscriber: modimo. Oct 28 2021, 1:50 PM

MatzeB abandoned this revision. Oct 28 2021, 1:50 PM

Revision Contents

Path    Size
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp    15 lines
llvm/test/Transforms/InstCombine/ (four files)    116, 116, 116, and 30 lines

Diff 372789

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

     if (Op0->hasOneUse() || Op1->hasOneUse()) {
       [...]
         return replaceInstUsesWith(I, V);
       if (Value *V = matchSelectFromAndOr(B, D, C, A))
         return replaceInstUsesWith(I, V);
       if (Value *V = matchSelectFromAndOr(D, B, A, C))
         return replaceInstUsesWith(I, V);
       if (Value *V = matchSelectFromAndOr(D, B, C, A))
         return replaceInstUsesWith(I, V);
     }
+
+    // (a & b) | (~a & c) -> ((b ^ c_) & a) ^ c_ with c_ = freeze(c)
+    {
+      Value *A, *B, *C;
+      if (Op0->hasOneUse() && Op1->hasOneUse() &&
+          (match(&I, m_BinOp(m_c_And(m_Value(A), m_Value(B)),
+                             m_c_And(m_Not(m_Deferred(A)), m_Value(C)))) ||
+           match(&I, m_BinOp(m_c_And(m_Not(m_Value(A)), m_Value(C)),
+                             m_c_And(m_Deferred(A), m_Value(B)))))) {
+        Value *FrozenC = Builder.CreateFreeze(C);
+        Value *Xor0 = Builder.CreateXor(B, FrozenC);
+        Value *And = Builder.CreateAnd(Xor0, A);
+        return BinaryOperator::CreateXor(And, FrozenC);
+      }
+    }
   }

   // (A ^ B) | ((B ^ C) ^ A) -> (A ^ B) | C
   if (match(Op0, m_Xor(m_Value(A), m_Value(B))))
     if (match(Op1, m_Xor(m_Xor(m_Specific(B), m_Value(C)), m_Specific(A))))
       return BinaryOperator::CreateOr(Op0, C);

   // ((A ^ C) ^ B) | (B ^ A) -> (B ^ A) | C
   [...]

llvm/test/Transforms/InstCombine/masked-merge-add.ll

 [...]
   %ret = add <3 x i32> %and, %and1
   ret <3 x i32> %ret
 }

 ; ============================================================================ ;
 ; Commutativity.
 ; ============================================================================ ;

-; Used to make sure that the IR complexity sorting does not interfere.
-declare i32 @gen32()
-
-define i32 @p_commutative0(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative0(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = add i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative1(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative1(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = add i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative2(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative2(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = add i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative3(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative3(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = add i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative4(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative4(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = add i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative5(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative5(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = add i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative6(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative6(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = add i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
 define i32 @p_constmask_commutative(i32 %x, i32 %y) {
 ; CHECK-LABEL: @p_constmask_commutative(
 ; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], 65280
 ; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y:%.*]], -65281
 ; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
 ; CHECK-NEXT: ret i32 [[RET]]
 ;
   %and = and i32 %x, 65280
 [...]

llvm/test/Transforms/InstCombine/masked-merge-or.ll

 [...]
   %ret = or <3 x i32> %and, %and1
   ret <3 x i32> %ret
 }

 ; ============================================================================ ;
 ; Commutativity.
 ; ============================================================================ ;

-; Used to make sure that the IR complexity sorting does not interfere.
-declare i32 @gen32()
-
-define i32 @p_commutative0(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative0(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = or i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative1(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative1(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = or i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative2(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative2(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = or i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative3(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative3(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = or i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative4(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative4(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = or i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative5(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative5(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = or i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative6(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative6(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = or i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
 define i32 @p_constmask_commutative(i32 %x, i32 %y) {
 ; CHECK-LABEL: @p_constmask_commutative(
 ; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], 65280
 ; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y:%.*]], -65281
 ; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
 ; CHECK-NEXT: ret i32 [[RET]]
 ;
   %and = and i32 %x, 65280
 [...]

llvm/test/Transforms/InstCombine/masked-merge-xor.ll

 [...]
   %ret = xor <3 x i32> %and, %and1
   ret <3 x i32> %ret
 }

 ; ============================================================================ ;
 ; Commutativity.
 ; ============================================================================ ;

-; Used to make sure that the IR complexity sorting does not interfere.
-declare i32 @gen32()
-
-define i32 @p_commutative0(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative0(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = xor i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative1(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative1(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = xor i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative2(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative2(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = xor i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative3(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative3(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND]], [[AND1]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = xor i32 %and, %and1
-  ret i32 %ret
-}
-
-define i32 @p_commutative4(i32 %x, i32 %y, i32 %m) {
-; CHECK-LABEL: @p_commutative4(
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[NEG]], [[Y:%.*]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %neg, %y
-  %ret = xor i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative5(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative5(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], [[M:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %x, %m
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = xor i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
-define i32 @p_commutative6(i32 %x, i32 %m) {
-; CHECK-LABEL: @p_commutative6(
-; CHECK-NEXT: [[Y:%.*]] = call i32 @gen32()
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[M:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[NEG:%.*]] = xor i32 [[M]], -1
-; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y]], [[NEG]]
-; CHECK-NEXT: [[RET:%.*]] = or i32 [[AND1]], [[AND]]
-; CHECK-NEXT: ret i32 [[RET]]
-;
-  %y = call i32 @gen32()
-  %and = and i32 %m, %x ; swapped order
-  %neg = xor i32 %m, -1
-  %and1 = and i32 %y, %neg; swapped order
-  %ret = xor i32 %and1, %and ; swapped order
-  ret i32 %ret
-}
-
 define i32 @p_constmask_commutative(i32 %x, i32 %y) {
 ; CHECK-LABEL: @p_constmask_commutative(
 ; CHECK-NEXT: [[AND:%.*]] = and i32 [[X:%.*]], 65280
 ; CHECK-NEXT: [[AND1:%.*]] = and i32 [[Y:%.*]], -65281
 ; CHECK-NEXT: [[RET1:%.*]] = or i32 [[AND1]], [[AND]]
 ; CHECK-NEXT: ret i32 [[RET1]]
 ;
   %and = and i32 %x, 65280
 [...]

llvm/test/Transforms/InstCombine/or.ll

 [...]
   %or = or i32 %y, %x
   %neg = xor i32 %or, -1
   call void @use(i32 %neg)
   %xor = xor i32 %y, %x
   call void @use(i32 %xor)
   %or1 = or i32 %xor, %neg
   ret i32 %or1
 }
+
+define i32 @test_or_and_and_not(i32 %a, i32 %b, i32 %c) {
+; CHECK-LABEL: @test_or_and_and_not(
+; CHECK-NEXT: [[TMP1:%.*]] = freeze i32 [[C:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = xor i32 [[TMP1]], [[B:%.*]]
+; CHECK-NEXT: [[TMP3:%.*]] = and i32 [[TMP2]], [[A:%.*]]
+; CHECK-NEXT: [[OR:%.*]] = xor i32 [[TMP3]], [[TMP1]]
+; CHECK-NEXT: ret i32 [[OR]]
+;
+  %and0 = and i32 %a, %b
+  %not_a = xor i32 %a, -1
+  %and1 = and i32 %not_a, %c
+  %or = or i32 %and0, %and1
+  ret i32 %or
+}
+
+define i32 @test_or_and_and_not2(i32 %a, i32 %b, i32 %c) {
+; CHECK-LABEL: @test_or_and_and_not2(
+; CHECK-NEXT: [[TMP1:%.*]] = freeze i32 [[A:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = xor i32 [[TMP1]], [[C:%.*]]
+; CHECK-NEXT: [[TMP3:%.*]] = and i32 [[TMP2]], [[B:%.*]]
+; CHECK-NEXT: [[OR:%.*]] = xor i32 [[TMP3]], [[TMP1]]
+; CHECK-NEXT: ret i32 [[OR]]
+;
+  %not_b = xor i32 %b, -1
+  %and0 = and i32 %a, %not_b
+  %and1 = and i32 %b, %c
+  %or = or i32 %and0, %and1
+  ret i32 %or
+}

[InstCombine] Canonicalize masked merge; optimize (a & b) | (~a & c) (Abandoned, Public)
