This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] try to shrink bitwise logic with phi operand
Needs Review (Public)

Authored by spatel on Jan 12 2017, 12:52 PM.

Details

Summary

The test case is based on an example from:
http://www.gdcvault.com/play/1023026/Taming-the-Jaguar-x86-Optimization

bool ArraySearch(int count, uint32_t needle, uint32_t haystack[]) {
  bool found = false;
  for (int i = 0; i < count; ++i)
    found |= needle == haystack[i];
  return found;
}

...so I think the pattern could show up in any "all_of" or "any_of" style of loop. I checked that the loop vectorizer is ok with the change in this case. x86 scalar and vector codegen looks fine with it too.

I'm hoping that D27933 will be approved so we don't have to check the commuted variant (unary cast op is less complex than phi, so it goes to the right side).
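For reference, here is a hand-written sketch (not taken from the patch itself; names are illustrative) of the kind of narrowing this aims for, on the i8 form of the reduction:

```llvm
; before: the reduction is carried in i8
loop:
  %found = phi i8 [ 0, %entry ], [ %or, %loop ]
  %cmp = icmp eq i32 %elt, %needle
  %zext = zext i1 %cmp to i8
  %or = or i8 %zext, %found

; after: the phi and the logic op are shrunk to i1, and the zext
; sinks to the use outside the loop
loop:
  %found = phi i1 [ false, %entry ], [ %or.narrow, %loop ]
  %cmp = icmp eq i32 %elt, %needle
  %or.narrow = or i1 %cmp, %found
  ...
exit:
  %result = zext i1 %or.narrow to i8
```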

Diff Detail

Event Timeline

spatel updated this revision to Diff 84167. Jan 12 2017, 12:52 PM
spatel retitled this revision from to [InstCombine] try to shrink bitwise logic with phi operand.
spatel updated this object.
spatel added reviewers: efriedma, majnemer, hfinkel.
spatel added subscribers: llvm-commits, RKSimon.
efriedma edited edge metadata. Jan 16 2017, 2:56 PM

Would it be possible to come up with a more general optimization? Making integer operations in loops narrower can be useful (for vectorization etc.), but special-casing a boolean "or" reduction doesn't seem like something that will trigger frequently enough to matter.


My laziness of only including one test in the patch misled you. :)
Although 'any_of' and 'all_of' are my motivating cases, this patch is neither limited to 'or' nor to bools. I'll update the tests to show this.
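To make that concrete, here is one hypothetical variant (not one of the patch's actual tests; the function name is made up): the same reduction shape using '&' on a uint8_t accumulator instead of '|' on a bool, showing the pattern is neither 'or'- nor bool-specific.

```c
#include <stdint.h>

/* "all_of"-style reduction: '&' with an 8-bit accumulator.
   The loop body is 'all &= (needle == haystack[i])', the same
   logic-op-with-phi-operand shape as the 'or' example. */
uint8_t all_match(int count, uint32_t needle, uint32_t haystack[]) {
    uint8_t all = 1;
    for (int i = 0; i < count; ++i)
        all &= (uint8_t)(needle == haystack[i]);
    return all;
}
```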

spatel updated this revision to Diff 84777.Jan 17 2017, 4:32 PM

Patch updated:
Added more tests to show that this transform works with any bitwise logic op and with various data types.

Err, sorry, it looks like my comment wasn't really clear about what I was thinking about...

The optimization sort of has two weaknesses in its current state: one, the optimization doesn't work if the initial value isn't a constant; two, it only works if there's a single logical operation per loop iteration. The first could probably be fixed using known bits; I'm more worried about the second one... it's probably not something we would deal with in instcombine. Maybe that's not important.


For the first, yes, I considered adding phis in demanded bits simplify, but that seemed like overkill for the cases I'm seeing, so I thought a more specific solution was the way to go. For the second, if we have a chain of logic ops, then they should all shrink once we can shrink the first link in that chain. Ie, we already have very similar transforms in foldLogicCastConstant() and shrinkBitwiseLogic(). If you have an example of the pattern you're imagining, we can certainly try to see if the current group of shrinking transforms will handle it?
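For concreteness, one shape such a chain can take is a manually two-way-unrolled any_of, so each iteration has two logic ops feeding into each other (this sketch is illustrative; the function name is made up):

```c
#include <stdbool.h>
#include <stdint.h>

/* Two logic ops per iteration, the second consuming the result of
   the first -- mirroring a 2x-unrolled any_of loop. */
bool any_of_pairs(int count, uint32_t needle, uint32_t haystack[]) {
    bool found = false;
    int i = 0;
    for (; i + 1 < count; i += 2) {
        found |= needle == haystack[i];
        found |= needle == haystack[i + 1];
    }
    if (i < count)                    /* odd remainder element */
        found |= needle == haystack[i];
    return found;
}
```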

If you unroll your example loop once, it no longer gets recognized.

Another kind of similar pattern:

int array_max(int n, char *p) {
  int result = 0;
  for (int i = 0; i < n; ++i)
    result = result > p[i] ? result : p[i];
  return result;
}


Passing -funroll-loops on one of my original loops yields something monstrous. There's no way to solve a case like that without enhancing SimplifyDemandedUseBits()? Ie, we have to chase operands until we 'complete the circle' from incoming phi to logic op. And even that won't work if the unrolling is excessive enough to trip the recursive depth check in SimplifyDemandedBits. I'll abandon this patch if you think the more general solution is the way to go.

for.body:
  %indvars.iv = phi i64 [ %indvars.iv.unr, %for.body.preheader.new ], [ %indvars.iv.next.3, %for.body ]
  %all.010 = phi i8 [ %all.010.unr, %for.body.preheader.new ], [ %and.3, %for.body ]
  %arrayidx = getelementptr inbounds i32, i32* %haystack, i64 %indvars.iv
  %5 = load i32, i32* %arrayidx, align 4
  %cmp1 = icmp eq i32 %5, %needle
  %conv = zext i1 %cmp1 to i8
  %and = and i8 %conv, %all.010
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %arrayidx.1 = getelementptr inbounds i32, i32* %haystack, i64 %indvars.iv.next
  %6 = load i32, i32* %arrayidx.1, align 4
  %cmp1.1 = icmp eq i32 %6, %needle
  %conv.1 = zext i1 %cmp1.1 to i8
  %and.1 = and i8 %conv.1, %and
  %indvars.iv.next.1 = add nsw i64 %indvars.iv, 2
  %arrayidx.2 = getelementptr inbounds i32, i32* %haystack, i64 %indvars.iv.next.1
  %7 = load i32, i32* %arrayidx.2, align 4
  %cmp1.2 = icmp eq i32 %7, %needle
  %conv.2 = zext i1 %cmp1.2 to i8
  %and.2 = and i8 %conv.2, %and.1
  %indvars.iv.next.2 = add nsw i64 %indvars.iv, 3
  %arrayidx.3 = getelementptr inbounds i32, i32* %haystack, i64 %indvars.iv.next.2
  %8 = load i32, i32* %arrayidx.3, align 4
  %cmp1.3 = icmp eq i32 %8, %needle
  %conv.3 = zext i1 %cmp1.3 to i8
  %and.3 = and i8 %conv.3, %and.2
  %indvars.iv.next.3 = add nsw i64 %indvars.iv, 4
  %exitcond.3 = icmp eq i64 %indvars.iv.next.3, %wide.trip.count
  br i1 %exitcond.3, label %for.cond.cleanup.loopexit.unr-lcssa, label %for.body

int array_max(int n, char *p) {
  int result = 0;
  for (int i = 0; i < n; ++i)
    result = result > p[i] ? result : p[i];
  return result;
}
This case falls into the "don't break min/max" hole that thwarts all kinds of optimization...sigh. Ie, we have something like this in at least 5 places in InstCombine:

// If this is a select as part of a min/max pattern, don't simplify any
// further in case we break the structure.

Also, there's no bitwise logic here. We'd need to enhance select folding to catch it separately from this patch, or this is a good reason to pursue a more general SimplifyDemandedBits solution.