This is an archive of the discontinued LLVM Phabricator instance.

Differential D44626

[InstCombine] Fold (A OR B) AND B code sequence over Phi node
Abandoned (Public)

Authored by inouehrs on Mar 19 2018, 7:46 AM.

Details

Reviewers

efriedma
dberlin
echristo
hfinkel
kbarton
nemanjai
spatel
xbolva00

Summary

This patch aims to enable jump threading for code that calls a method whose return type is std::pair<int, bool> or std::pair<bool, int>.
For example, jump threading does not work for the if statement in func.

std::pair<int, bool> callee(int v) {
  int a = dummy(v);
  if (a) return std::make_pair(dummy(v), true);
  else return std::make_pair(v, v < 0);
}

int func(int v) {
  std::pair<int, bool> rc = callee(v);
  if (rc.second) {
    // do something
  }
  return rc.first;
}

SROA, which runs before inlining, replaces the std::pair with a single i64 in both callee and func without splitting it, because at that point SROA sees no accesses to the individual fields.
After inlining, jump threading fails to identify that the incoming value is a constant because of the additional instructions (such as or, and, trunc).
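
To make the source of those extra instructions concrete, here is a minimal sketch of what the unsplit i64 represents. It assumes the <int, bool> layout visible in the testIB case of this revision (int in the low 32 bits, bool at bit 32). The helper names pack, second_of, and first_of are hypothetical and are not LLVM APIs.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: pack std::pair<int, bool> into one 64-bit value.
// Layout is an assumption taken from the testIB IR: int in bits 0..31,
// bool at bit 32.
inline std::uint64_t pack(std::pair<int, bool> p) {
  std::uint64_t lo = static_cast<std::uint32_t>(p.first);         // zext i32 to i64
  std::uint64_t hi = static_cast<std::uint64_t>(p.second) << 32;  // shl ..., 32
  return hi | lo;                                                 // or
}

// Hypothetical helper: what `if (rc.second)` turns into after inlining.
inline bool second_of(std::uint64_t packed) {
  return (packed & (std::uint64_t{1} << 32)) != 0;  // and + icmp
}

// Hypothetical helper: what reading `rc.first` turns into.
inline int first_of(std::uint64_t packed) {
  return static_cast<int>(static_cast<std::uint32_t>(packed));  // trunc
}
```

When the callee returns std::make_pair(x, true), the packed value is `x | (1 << 32)`, so the test in second_of is always true on that path. Jump threading only sees this once the `or` feeding the phi has been replaced by the constant.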

This patch finds a phi node, which has OR instruction as an incoming value and AND instruction as its use. If the OR and AND instructions take the same operand, e.g. (%A OR %B) AND %B, then replace the incoming OR by %B. For example,

BB1:
  %or = or i64 %val, 1
  br %BB2
BB2:
  %phi = phi i64 [ %or, %BB1 ], ... # -> phi i64 [ 1, %BB1 ], ...
  %and = and i64 %phi, 1

This helps jump threading identify the opportunity listed above.
Later, the CFG simplification pass performs a similar transformation, but by then it is too late to help jump threading.
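
The rewrite rests on the absorption law for bitwise operations. The following standalone check is only an illustration of that identity; the function name and_of_or is hypothetical and does not appear in the patch.

```cpp
#include <cstdint>

// (A | B) & B always equals B, so an incoming `or A, B` can be replaced
// by B when the value is only observed through `and ..., B`.
constexpr std::uint64_t and_of_or(std::uint64_t a, std::uint64_t b) {
  return (a | b) & b;
}

static_assert(and_of_or(0x1234, 1) == 1, "mask of 1 yields 1");
static_assert(and_of_or(0, 0xFF00) == 0xFF00, "result is always the mask");
```

The identity holds only for the use that applies the AND. Other users of the phi still need the original OR result, which is why the patch clones the phi when it has more than one use.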

Event Timeline

This patch improved the performance of protobuf (https://github.com/google/protobuf), which suffers from the problem of jump threading for std::pair<int, bool>.

with this patch
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
google_message2_parse_new          761764 ns     761558 ns        842   105.904MB/s
google_message2_parse_reuse        204430 ns     204397 ns       3422   394.587MB/s
google_message2_parse_newarena     398431 ns     398367 ns       1758   202.457MB/s
google_message2_serialize          116088 ns     116080 ns       6029     694.8MB/s

without this patch
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
google_message2_parse_new          786832 ns     786808 ns        868   102.506MB/s
google_message2_parse_reuse        207647 ns     207631 ns       3368    388.44MB/s
google_message2_parse_newarena     403998 ns     403985 ns       1733   199.642MB/s
google_message2_serialize          117183 ns     117172 ns       5973   688.322MB/s

majnemer added a subscriber: majnemer. Mar 19 2018, 8:57 AM

majnemer added inline comments.

lib/Transforms/InstCombine/InstCombinePHI.cpp
678	I think you need to `return NewPhi`.

Missing testcase.

lib/Transforms/InstCombine/InstCombinePHI.cpp
669	In general, instcombine should not insert new instructions without erasing any existing instructions; in some cases, we'll increase codesize without any benefit.

This optimization now creates a new phi node only if the AND instruction can be eliminated, so that it does not increase the code size.
Added test case.

@efriedma Thank you so much for the advice! I added a check before creating a new phi node to confirm that the AND instruction will be eliminated later, so the code size does not increase.
Also, I added a test case; I forgot to include it in the first submission.

inouehrs marked 2 inline comments as done. Mar 23 2018, 5:28 AM

inouehrs added inline comments.

lib/Transforms/InstCombine/InstCombinePHI.cpp
678	Thanks. I made the function return &Phi since we add (not replace) NewPhi or modify operands of Phi.

simplify code
rebase to the latest code

Any idea how frequently this triggers on general code (the LLVM testsuite, or something like that)?

lib/Transforms/InstCombine/InstCombinePHI.cpp
653	Only allowing 64-bit values seems overly restrictive.
657	We don't usually like to walk uses like this; can you start the pattern-match from the "and"?
682	cast<>, not dyn_cast<>
692	Do you need to check hasOneUse() on the "or" here? Looking specifically for zext+shl seems overly specific to your testcase; I'd like to see something a little more general. Maybe you could check `SimplifyAndInst(V, UserVal) != nullptr`? Or maybe that's too expensive; not sure.

addressed comments from Eli.

Only allowing 64-bit values seems overly restrictive.

I agree. I relaxed the condition.

We don't usually like to walk uses like this; can you start the pattern-match from the "and"?

I made the pattern match start from "and".

Do you need to check hasOneUse() on the "or" here?

This patch does not eliminate or modify "or" and does not require hasOneUse on it, although the later optimization may further optimize "or".

Looking specifically for zext+shl seems overly specific to your testcase

I generalized the code using computeKnownBits since SimplifyAndInst does not identify this pattern.
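
The updated diff is not reproduced in this archive, so the following is only an illustration of the kind of reasoning a known-bits analysis allows, not the code from the patch. The Known struct is a hypothetical, simplified stand-in for LLVM's KnownBits, and all function names are invented for this sketch.

```cpp
#include <cstdint>

// Hypothetical, simplified known-bits record: a bit set in `zero` is known
// to be 0, a bit set in `one` is known to be 1.
struct Known {
  std::uint64_t zero;
  std::uint64_t one;
};

inline Known known_constant(std::uint64_t c) { return {~c, c}; }
inline Known known_nothing() { return {0, 0}; }

// zext i32 -> i64: the high 32 bits become known zero.
inline Known known_zext32(Known lo) {
  return {lo.zero | 0xFFFFFFFF00000000ULL, lo.one & 0xFFFFFFFFULL};
}

// shl by n (n < 64): the low n bits become known zero.
inline Known known_shl(Known k, unsigned n) {
  return {(k.zero << n) | ((std::uint64_t{1} << n) - 1), k.one << n};
}

// or: a bit is known one if either side is, known zero only if both are.
inline Known known_or(Known a, Known b) {
  return {a.zero & b.zero, a.one | b.one};
}

// `and V, mask` folds to a constant when every mask bit of V is known.
inline bool and_is_constant(Known v, std::uint64_t mask) {
  return ((v.zero | v.one) & mask) == mask;
}

// The constant that `and V, mask` folds to, valid when and_is_constant holds.
inline std::uint64_t and_result(Known v, std::uint64_t mask) {
  return v.one & mask;
}
```

For the if.then.i path of testBI, the incoming value is `or (shl (zext x), 32), 1`. Bit 0 is known to be one regardless of x, so `and ..., 1` is the constant 1 on that path without matching the zext+shl shape explicitly.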

Any idea how frequently this triggers on general code (the LLVM testsuite, or something like that)?

While compiling LLVM, this modification happens about 440 times. In LLVM testsuite it happens only three times.
I think newer programs tend to use a value pair as the return value type more frequently.

Thank you for the comments.

xbolva00 accepted this revision. Apr 20 2018, 12:02 AM

This revision is now accepted and ready to land. Apr 20 2018, 12:02 AM

@efriedma Do you have further comments or suggestions? Thanks!

I'm still not sure this is really as general as it could be, but I guess it's okay.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1423 ↗	(On Diff #143229)	This logic doesn't work if AndVal isn't a constant (e.g. consider the case where the "and" and "or" are in the same basic block).

xbolva00 requested changes to this revision. Apr 28 2018, 2:50 AM

This revision now requires changes to proceed. Apr 28 2018, 2:50 AM

In D44626#1080274, @efriedma wrote:

I'm still not sure this is really as general as it could be, but I guess it's okay.

I think what this patch really wants to ask/do is: "Does this binop simplify with the incoming value of the phi to 1 of the binop operands? If yes, substitute that value into the phi."

If you look at it that way, then it should be some add-on to the existing foldOpIntoPhi() - and that's probably a smaller and more general patch.

That substitution analysis seems to fall into the gray area -- or not gray depending on your viewpoint :) -- of whether this belongs in (New)GVN or instcombine. (cc @fhahn).

inouehrs updated this revision to Diff 144452. Apr 28 2018, 9:24 AM

In D44626#1081987, @spatel wrote:

I think what this patch really wants to ask/do is: "Does this binop simplify with the incoming value of the phi to 1 of the binop operands? If yes, substitute that value into the phi."

Note that this patch intends to optimize only a simple but important case on std::pair to help jump threading. Other optimizers can already do more generic optimization for this type of code sequence, but they run too late to help jump threading.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
1423 ↗	(On Diff #143229)	Thank you for pointing this out. I added a check for `AndVal` (previously I checked this only in the `if (!Phi->hasOneUse())` block).

In D44626#1082044, @inouehrs wrote:

In D44626#1081987, @spatel wrote:

I think what this patch really wants to ask/do is: "Does this binop simplify with the incoming value of the phi to 1 of the binop operands? If yes, substitute that value into the phi."

Note that this patch intend to optimize only a simple but important case on std::pair to help jump threading. Other optimizers are already able to do more generic optimization for this type of code sequence, but it's too late to help jump threading.

I understand that we get many other similar cases (for better or worse, here in instcombine). But if we're going to try this, then we might as well generalize it, so it triggers more often and leads to less lumpy optimization. Ie, you should get this case too:

define i64 @phi_with_idempotent_binop(i1 %f, i64 %a) {
entry:
  br i1 %f, label %BB1, label %BB2

BB1:                               
  %and = and i64 %a, 1
  br label %BB2

BB2:                                 
  %phi = phi i64 [ %a, %entry ], [ %and, %BB1 ]
  %or = or i64 %phi, 1
  ret i64 %or
}

If you use a more general matcher (and I think it will be cheaper for the tests you're showing), we can get the motivating cases you've seen and the ones you haven't seen yet. :)

I think the matcher is quite simple. It would look something like this:

if (BinOp->isIdempotent()) { // handle both 'and' and 'or'
  // no limits on what's on the other side of the phi
  Value *V = SimplifyBinOp(BinOp->getOpcode(), Phi->getIncomingValue(Idx), BinOp->getOperand(1), SQ);
  if (V && V == BinOp->getOperand(1))
    Phi->setIncomingValue(Idx, V);
}

(add the appropriate checks for uses, commutes, etc)
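
As a rough, self-contained illustration of that generalized matcher, the sketch below runs the same substitution on a toy expression graph. Nothing here is an LLVM class: Op, Expr, Phi, simplify_binop, and fold_into_phi are all hypothetical, the simplifier only knows idempotence and absorption, and the phi is assumed to have the binop as its only user.

```cpp
#include <vector>

// Toy IR for illustration only; identity of values is pointer identity.
enum class Op { Leaf, And, Or };

struct Expr {
  Op op;
  const Expr *lhs;  // nullptr for leaves
  const Expr *rhs;  // nullptr for leaves
};

// Minimal stand-in for an instruction simplifier: returns the operand that
// `op(incoming, other)` folds to, or nullptr when no fold is known.
inline const Expr *simplify_binop(Op op, const Expr *incoming,
                                  const Expr *other) {
  if (op != Op::And && op != Op::Or)
    return nullptr;
  if (incoming == other)  // x & x == x, x | x == x
    return other;
  // Absorption: (a | b) & b == b and (a & b) | b == b.
  Op inner = (op == Op::And) ? Op::Or : Op::And;
  if (incoming->op == inner &&
      (incoming->lhs == other || incoming->rhs == other))
    return other;
  return nullptr;
}

struct Phi {
  std::vector<const Expr *> incoming;
};

// For a user `op(phi, other)`, replace each incoming value that folds to
// `other`. Assumes the phi has no other users. Returns the number of
// incoming values rewritten.
inline int fold_into_phi(Phi &phi, Op op, const Expr *other) {
  int changed = 0;
  for (const Expr *&v : phi.incoming) {
    const Expr *s = simplify_binop(op, v, other);
    if (s == other && v != other) {
      v = other;
      ++changed;
    }
  }
  return changed;
}
```

On the phi_with_idempotent_binop example, the incoming `and %a, 1` is replaced by the constant 1 because `(a & 1) | 1` folds to 1, while the incoming `%a` is left alone.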

In D44626#1081987, @spatel wrote:

In D44626#1080274, @efriedma wrote:

I'm still not sure this is really as general as it could be, but I guess it's okay.

I think what this patch really wants to ask/do is: "Does this binop simplify with the incoming value of the phi to 1 of the binop operands? If yes, substitute that value into the phi."

If you look at it that way, then it should be some add-on to the existing foldOpIntoPhi() - and that's probably a smaller and more general patch.

That substitution analysis seems to fall into the gray area -- or not gray depending on your viewpoint :) -- of whether this belongs in (New)GVN or instcombine. (cc @fhahn).

FWIW, the underlying issue for me in doing it in instcombine is that it can never really be good at it in any sane time bound. Without knowing the values of other instructions ahead of time (i.e., some form of value numbering), it has to go figure them out (again and again), or give up and only tackle simple cases around constants (which people never stay happy with). It also has no fine-grained dependency tracking, so it will re-evaluate this all the time. By contrast, NewGVN (or even GVN if you wanted to try it there) only re-evaluates these transforms at the points they could change, and already knows the values of other things in the program, so it can tell when the transform produces a redundancy without further evaluation.

Over time, either the set of cases you catch sucks, you make instcombine slow and complicated, or you make instcombine do value numbering.
None of these seem like appealing options to me ;)

spatel mentioned this in D46336: [InstCombine] Apply binary operator simplifications to associative/commutative cases. May 2 2018, 4:26 PM

xbolva00 resigned from this revision. May 16 2018, 5:56 AM

inouehrs mentioned this in D48828: [InstSimplify] fold extracting from std::pair (1/2). Jul 2 2018, 6:03 AM

I submitted new patches for instsimplify to catch this opportunity.
https://reviews.llvm.org/D48828
https://reviews.llvm.org/D49981

Thanks for all the advice.

Revision Contents

Path	Size
lib/Transforms/InstCombine/InstCombineInternal.h	1 line
lib/Transforms/InstCombine/InstCombinePHI.cpp	84 lines
test/Transforms/InstCombine/fold-or-phi.ll	146 lines

Diff 142213

lib/Transforms/InstCombine/InstCombineInternal.h

private:
[...]

/// \brief Try to rotate an operation below a PHI node, using PHI nodes for
/// its operands.
Instruction *FoldPHIArgOpIntoPHI(PHINode &PN);
Instruction *FoldPHIArgBinOpIntoPHI(PHINode &PN);
Instruction *FoldPHIArgGEPIntoPHI(PHINode &PN);
Instruction *FoldPHIArgLoadIntoPHI(PHINode &PN);
Instruction *FoldPHIArgZextsIntoPHI(PHINode &PN);
		Instruction *FoldPHIArgOrIntoPHI(PHINode &PN);

/// If an integer typed PHI has only one use which is an IntToPtr operation,
/// replace the PHI with an existing pointer typed PHI if it exists. Otherwise
/// insert a new pointer typed PHI and replace the original one.
Instruction *FoldIntegerTypedPHI(PHINode &PN);

/// Helper function for FoldPHIArgXIntoPHI() to set debug location for the
/// folded operation.
[...]

lib/Transforms/InstCombine/InstCombinePHI.cpp

Instruction *InstCombiner::FoldPHIArgLoadIntoPHI(PHINode &PN) {
[...]
if (isVolatile)
for (Value *IncValue : PN.incoming_values())
cast<LoadInst>(IncValue)->setVolatile(false);

PHIArgMergedDebugLoc(NewLI, PN);
return NewLI;
}

		// FoldPHIArgOrIntoPHI finds a phi node, which has OR instruction
		// as an incoming value and AND instruction as its use.
		// If the OR and AND instructions take the same operand
		// as (%A OR %B) AND %B then replace the incoming value by %B.
		// For example,
		// BB1:
		// %or = or i64 %val, 1
		// br %BB2
		// BB2:
		// %phi = phi i64 [ %or, %BB1 ], ... # -> phi i64 [ 1, %BB1 ], ...
		// %and = and i64 %phi, 1
		// This optimization allows jump threading pass to find the opportunity
		// in method call whose return value is std::pair<int, bool>.
		Instruction *InstCombiner::FoldPHIArgOrIntoPHI(PHINode &Phi) {
		if (!Phi.getType()->isIntegerTy() ||
		Phi.getType()->getPrimitiveSizeInBits() != 64)
		efriedma: Only allowing 64-bit values seems overly restrictive.
		return nullptr;

		Instruction *rc = nullptr;
		for (User *User : Phi.users()) {
		efriedma: We don't usually like to walk uses like this; can you start the pattern-match from the "and"?
		// Try to find ((%A OR %B) AND %B) code sequence over Phi.
		Value *UserVal = nullptr;
		if (!match(User, m_And(m_Specific(&Phi), m_Value(UserVal))))
		continue;

		auto IsEligible = [&](Value *V) {
		return match(V, m_Or(m_Value(), m_Specific(UserVal)));
		};
		if (llvm::none_of(Phi.incoming_values(), IsEligible))
		continue;

		PHINode* NewPhi = &Phi;
		efriedma: In general, instcombine should not insert new instructions without erasing any existing instructions; in some cases, we'll increase codesize without any benefit.
		if (!Phi.hasOneUse()) {
		// When Phi is used by other instruction, we need to create a new phi
		// node for this AND instruction. To avoid increasing total code size,
		// we create a new phi if the AND can be eliminated.
		// So far, we do this when all incoming OR are used to mix two 32-bit
		// integers, e.g. (Shl(x, 32) OR ZExt(y)), and AND have a constant mask
		// to extract bits only from one of the mixed integers. For such case,
		// the AND instruction can be eliminated later by SimplifyDemandedBits.
		if (!isa<ConstantInt>(UserVal)) continue;
		majnemer: I think you need to `return NewPhi`.
		inouehrs: Thanks. I made the function return &Phi since we add (not replace) NewPhi or modify operands of Phi.

		// If the mask for the AND instruction has one in both high and low
		// 32-bit, we cannot guarantee that the AND can be eliminated.
		uint64_t Imm = dyn_cast<ConstantInt>(UserVal)->getZExtValue();
		efriedma: cast<>, not dyn_cast<>
		if ((Imm & 0xFFFFFFFFuL) != 0 && (Imm >> 32) != 0) continue;

		const Constant *CI32 = ConstantInt::get(User->getType(), 32);
		// Return true if V mixes two 32 bit values by (Shl(x, 32) OR ZExt(y))
		auto IsConcat = [&](Value *V) {
		Value *LowVal = nullptr;
		return match(V, m_Or(m_Shl(m_Value(), m_Specific(CI32)),
		m_ZExt(m_Value(LowVal)))) &&
		LowVal->getType()->getPrimitiveSizeInBits() == 32;
		};
		efriedma: Do you need to check hasOneUse() on the "or" here? Looking specifically for zext+shl seems overly specific to your testcase; I'd like to see something a little more general. Maybe you could check `SimplifyAndInst(V, UserVal) != nullptr`? Or maybe that's too expensive; not sure.
		auto IsEligibleOrConcat = [&](Value *V) {
		return IsEligible(V) || IsConcat(V);
		};
		if (!llvm::all_of(Phi.incoming_values(), IsEligibleOrConcat))
		continue;

		// If there is another use of the phi node, we create a new one
		// for this AND instruction by cloning the original phi node.
		NewPhi = cast<PHINode>(Phi.clone());
		InsertNewInstBefore(NewPhi, Phi);
		cast<Instruction>(User)->setOperand(0, NewPhi);
		}

		// We replace the incoming OR with UserVal.
		for (unsigned Idx = 0; Idx < NewPhi->getNumIncomingValues(); Idx++) {
		Value *V = NewPhi->getIncomingValue(Idx);
		if (match(V, m_Or(m_Value(), m_Specific(UserVal))))
		NewPhi->setIncomingValue(Idx, UserVal);
		}

		// Because we updated an operand, we return Phi.
		rc = &Phi;
		}
		return rc;
		}

/// TODO: This function could handle other cast types, but then it might
/// require special-casing a cast from the 'i1' type. See the comment in
/// FoldPHIArgOpIntoPHI() about pessimizing illegal integer types.
Instruction *InstCombiner::FoldPHIArgZextsIntoPHI(PHINode &Phi) {
// We cannot create a new instruction after the PHI if the terminator is an
// EHPad because there is no valid insertion point.
if (TerminatorInst *TI = Phi.getParent()->getTerminator())
if (TI->isEHPad())
[...]
if (isa<Instruction>(PN.getIncomingValue(0)) &&
cast<Instruction>(PN.getIncomingValue(0))->getOpcode() ==
cast<Instruction>(PN.getIncomingValue(1))->getOpcode() &&
// FIXME: The hasOneUse check will fail for PHIs that use the value more
// than themselves more than once.
PN.getIncomingValue(0)->hasOneUse())
if (Instruction *Result = FoldPHIArgOpIntoPHI(PN))
return Result;

		if (Instruction *Result = FoldPHIArgOrIntoPHI(PN))
		return Result;

// If this is a trivial cycle in the PHI node graph, remove it. Basically, if
// this PHI only has a single use (a PHI), and if that PHI only has one use (a
// PHI)... break the cycle.
if (PN.hasOneUse()) {
if (Instruction *Result = FoldIntegerTypedPHI(PN))
return Result;

Instruction *PHIUser = cast<Instruction>(PN.user_back());
[...]

test/Transforms/InstCombine/fold-or-phi.ll

This file was added.

				; RUN: opt < %s -instcombine -S | FileCheck %s

				define signext i64 @test1(i1 %f, i64 signext %a, i64 signext %b) {
				; CHECK-LABEL: @test1
				; CHECK-LABEL: BB2
				; CHECK: [[PHI:%.*]] = phi i64 [ %a, %entry ], [ %b, %BB1 ]
				; CHECK: and i64 [[PHI]], %b
				entry:
				br i1 %f, label %BB1, label %BB2

				BB1:
				%or = or i64 %a, %b
				br label %BB2

				BB2:
				%phi = phi i64 [ %a, %entry ], [ %or, %BB1 ]
				%and = and i64 %phi, %b
				ret i64 %and
				}

				define signext i64 @test2(i1 %f, i64 signext %a, i64 signext %b) {
				; a test case for not creating a clone phi node to avoid code size bloat
				; CHECK-LABEL: @test2
				; CHECK-LABEL: BB2
				; CHECK: [[NewPHI:%.*]] = phi i64 [ %b, %entry ], [ %or, %BB1 ]
				entry:
				br i1 %f, label %BB1, label %BB2

				BB1:
				%or = or i64 %a, 1
				br label %BB2

				BB2:
				%phi = phi i64 [ %b, %entry ], [ %or, %BB1 ]
				%and = and i64 %phi, 1
				%add = add i64 %and, %phi
				ret i64 %add
				}

				define signext i32 @testBI(i32 signext %v) {
				; Test with std::pair<bool, int>
				; based on the following C++ code
				; std::pair<bool, int> callee(int v) {
				; int a = dummy(v);
				; if (a) return std::make_pair(true, dummy(a));
				; else return std::make_pair(v < 0, v);
				; }
				; int func(int v) {
				; std::pair<bool, int> rc = callee(v);
				; if (rc.first) dummy(0);
				; return rc.second;
				; }

				; CHECK-LABEL: @testBI
				; CHECK-LABEL: _ZL6calleei.exit
				; CHECK: [[PHI:%.*]] = phi i1 [ false, %if.then.i ], [ [[V:%.*]], %if.else.i ]
				; CHECK: br i1 [[PHI]], label %if.end, label %if.then
				entry:
				%call.i = call signext i32 @dummy(i32 signext %v)
				%tobool.i = icmp eq i32 %call.i, 0
				br i1 %tobool.i, label %if.else.i, label %if.then.i

				if.then.i: ; preds = %entry
				%call2.i = call signext i32 @dummy(i32 signext %call.i)
				%retval.sroa.22.0.insert.ext.i.i = zext i32 %call2.i to i64
				%retval.sroa.22.0.insert.shift.i.i = shl nuw i64 %retval.sroa.22.0.insert.ext.i.i, 32
				%retval.sroa.0.0.insert.insert.i.i = or i64 %retval.sroa.22.0.insert.shift.i.i, 1
				br label %_ZL6calleei.exit

				if.else.i: ; preds = %entry
				%.lobit.i = lshr i32 %v, 31
				%0 = zext i32 %.lobit.i to i64
				%retval.sroa.22.0.insert.ext.i8.i = zext i32 %v to i64
				%retval.sroa.22.0.insert.shift.i9.i = shl nuw i64 %retval.sroa.22.0.insert.ext.i8.i, 32
				%retval.sroa.0.0.insert.insert.i11.i = or i64 %retval.sroa.22.0.insert.shift.i9.i, %0
				br label %_ZL6calleei.exit

				_ZL6calleei.exit: ; preds = %if.then.i, %if.else.i
				%retval.sroa.0.0.i = phi i64 [ %retval.sroa.0.0.insert.insert.i.i, %if.then.i ], [ %retval.sroa.0.0.insert.insert.i11.i, %if.else.i ]
				%rc.sroa.43.0.extract.shift = lshr i64 %retval.sroa.0.0.i, 32
				%rc.sroa.43.0.extract.trunc = trunc i64 %rc.sroa.43.0.extract.shift to i32
				%1 = and i64 %retval.sroa.0.0.i, 1
				%tobool = icmp eq i64 %1, 0
				br i1 %tobool, label %if.end, label %if.then

				if.then: ; preds = %_ZL6calleei.exit
				%call1 = call signext i32 @dummy(i32 signext 0)
				br label %if.end

				if.end: ; preds = %_ZL6calleei.exit, %if.then
				ret i32 %rc.sroa.43.0.extract.trunc
				}

				define signext i32 @testIB(i32 signext %v) {
				; Test with std::pair<int, bool>
				; based on the following C++ code
				; std::pair<int, bool> callee(int v) {
				; int a = dummy(v);
				; if (a) return std::make_pair(dummy(v), true);
				; else return std::make_pair(v, v < 0);
				; }
				; int func(int v) {
				; std::pair<int, bool> rc = callee(v);
				; if (rc.second) dummy(0);
				; return rc.first;
				; }

				; CHECK-LABEL: @testIB
				; CHECK-LABEL: _ZL6calleei.exit
				; CHECK: [[PHI:%.*]] = phi i1 [ false, %if.then.i ], [ [[V:%.*]], %if.else.i ]
				; CHECK: br i1 [[PHI]], label %if.end, label %if.then
				entry:
				%call.i = call signext i32 @dummy(i32 signext %v)
				%tobool.i = icmp eq i32 %call.i, 0
				br i1 %tobool.i, label %if.else.i, label %if.then.i

				if.then.i: ; preds = %entry
				%call1.i = call signext i32 @dummy(i32 signext %v)
				%retval.sroa.0.0.insert.ext.i.i = zext i32 %call1.i to i64
				%retval.sroa.0.0.insert.insert.i.i = or i64 %retval.sroa.0.0.insert.ext.i.i, 4294967296
				br label %_ZL6calleei.exit

				if.else.i: ; preds = %entry
				%.lobit.i = lshr i32 %v, 31
				%0 = zext i32 %.lobit.i to i64
				%retval.sroa.2.0.insert.shift.i8.i = shl nuw nsw i64 %0, 32
				%retval.sroa.0.0.insert.ext.i9.i = zext i32 %v to i64
				%retval.sroa.0.0.insert.insert.i10.i = or i64 %retval.sroa.2.0.insert.shift.i8.i, %retval.sroa.0.0.insert.ext.i9.i
				br label %_ZL6calleei.exit

				_ZL6calleei.exit: ; preds = %if.then.i, %if.else.i
				%retval.sroa.0.0.i = phi i64 [ %retval.sroa.0.0.insert.insert.i.i, %if.then.i ], [ %retval.sroa.0.0.insert.insert.i10.i, %if.else.i ]
				%rc.sroa.0.0.extract.trunc = trunc i64 %retval.sroa.0.0.i to i32
				%1 = and i64 %retval.sroa.0.0.i, 4294967296
				%tobool = icmp eq i64 %1, 0
				br i1 %tobool, label %if.end, label %if.then

				if.then: ; preds = %_ZL6calleei.exit
				%call1 = call signext i32 @dummy(i32 signext 0)
				br label %if.end

				if.end: ; preds = %_ZL6calleei.exit, %if.then
				ret i32 %rc.sroa.0.0.extract.trunc
				}

				declare signext i32 @dummy(i32 signext %v)