This is an archive of the discontinued LLVM Phabricator instance.

Differential D118066

[SimplifyCFG] Don't speculatively execute predictably-taken block
Abandoned · Public

Authored by lebedev.ri on Jan 24 2022, 12:43 PM.

Details

Reviewers

spatel
Carrot
reames
nikic
hjyamauchi
davidxl
ebrevnov
apostolakis

Summary

Back in D106650 / D106717 I added these profile-guided bailouts,
but I kept the speculation in the case where there is
a single block to speculate and we predict it to be taken.

But now I'm having second thoughts.
If we predict it to be taken, isn't flattening it,
and incurring the cost of a select, a pessimization?
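
To make the difference between the existing bailout and the proposed one concrete, here is a minimal standalone sketch of the two checks for the single-block case. It does not use LLVM: `BranchWeights`, `probability`, `oldBailout`, `newBailout`, and `kLikely` are hypothetical names for illustration, the 99% threshold is an assumption standing in for `TTI.getPredictableBranchThreshold()`, and the `MD_unpredictable` check is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the !prof branch weights of a conditional branch.
struct BranchWeights {
  uint64_t TrueWeight;  // weight of the edge to successor 0
  uint64_t FalseWeight; // weight of the edge to successor 1
};

// Assumed threshold for this sketch; the patch reads the real value from
// TTI.getPredictableBranchThreshold().
constexpr double kLikely = 0.99;

// Fraction of executions that take the edge with the given weight.
static double probability(uint64_t EdgeWeight, const BranchWeights &W) {
  return static_cast<double>(EdgeWeight) /
         static_cast<double>(W.TrueWeight + W.FalseWeight);
}

// Before the patch (single 'then' block): bail out only when the branch
// predictably skips the 'then' block. A null pointer models a branch
// without profile metadata.
static bool oldBailout(const BranchWeights *W, bool ThenIsSuccessor0) {
  if (!W || W->TrueWeight + W->FalseWeight == 0)
    return false; // no usable profile data, keep speculating
  uint64_t EndWeight = ThenIsSuccessor0 ? W->FalseWeight : W->TrueWeight;
  return probability(EndWeight, *W) >= kLikely;
}

// Proposed by the patch: bail out when either edge is predictable.
static bool newBailout(const BranchWeights *W) {
  if (!W || W->TrueWeight + W->FalseWeight == 0)
    return false;
  double TrueProb = probability(W->TrueWeight, *W);
  return TrueProb >= kLikely || (1.0 - TrueProb) >= kLikely;
}

// Small demonstration of where the two checks disagree.
static void demo() {
  BranchWeights Taken{999, 1}; // 'then' block (successor 0) almost always runs
  assert(!oldBailout(&Taken, /*ThenIsSuccessor0=*/true)); // old: speculate
  assert(newBailout(&Taken));                             // new: keep branch
}
```

Under these assumptions the two checks differ only for a predictably taken `then` block, which is the case the summary questions.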

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lebedev.ri created this revision. Jan 24 2022, 12:43 PM

Herald added subscribers: wenlei, hiraditya. Jan 24 2022, 12:43 PM

lebedev.ri requested review of this revision. Jan 24 2022, 12:43 PM

Not sure on this. I think in theory you may be right, but in practice I suspect that converting a separate block into a select will on average be beneficial, because of the followup optimizations it enables. We'll be able to do more with a select than a whole basic block.

lkail added a subscriber: lkail. Jan 25 2022, 1:04 AM

lebedev.ri added reviewers: hjyamauchi, davidxl, ebrevnov. Jan 25 2022, 2:07 AM

This change implies that speculation is not beneficial regardless of whether the block is predicted to be taken or not taken. That essentially means the optimization is also not beneficial in most cases where the branch is unpredicted at compile time, since hardware will actually predict well in most cases. While there is an explicit note that it's hard to do proper cost modeling at the IR level, there are still some simple cases that are beneficial. So I think we will regress in some cases.

Another thing to consider is that (AFAIU) there is a more aggressive version of this optimization at the machine level. Thus there could be cases where, by not doing the optimization on IR, we still end up with the transformation applied on MIR but miss the potential benefit of other IR-level optimizations.

To summarize: I think this change may cause both improvements and regressions. To better assess the overall impact we need perf data.
We can start with the motivating example that clearly shows the benefit of not doing the transform, and understand why it helps and how general the pattern is.

If we predict it to be taken, isn't flattening it, and incurring the cost of a select

I think the cost of the branch instruction (not including the mispredict cost) can't be ignored either. The fallthrough basic block doesn't need a branch instruction, but the other one does. If we get multiple selects, a branch might be a win.
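
For readers less familiar with the transform, the two shapes being compared can be sketched at the source level. This is only an illustration modeled on the `predictably_taken` test in this revision; `branchy`, `flattened`, and `checkSameResult` are hypothetical names, and whether a compiler emits a branch or a conditional move for either function is up to its own heuristics.

```cpp
#include <cassert>
#include <cstdint>

// Branchy form: the add only executes when the condition holds.
static uint32_t branchy(uint32_t A, uint32_t B, uint32_t C, uint32_t D) {
  uint32_t Res = 0;
  if (A == B)
    Res = C + D;
  return Res;
}

// Flattened form: the add executes unconditionally (it is speculated) and
// the result is chosen afterwards, which corresponds to a select.
static uint32_t flattened(uint32_t A, uint32_t B, uint32_t C, uint32_t D) {
  uint32_t V0 = C + D;
  return (A == B) ? V0 : 0;
}

// Both forms must agree for any inputs; only their cost profile differs.
static void checkSameResult(uint32_t A, uint32_t B, uint32_t C, uint32_t D) {
  assert(branchy(A, B, C, D) == flattened(A, B, C, D));
}
```

The flattened form always pays for the add and the select, while the branchy form pays for a branch instruction on one path and, when mispredicted, for the misprediction.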

Harbormaster completed remote builds in B145299: Diff 402626. Jan 26 2022, 11:10 AM

davidxl added a reviewer: apostolakis. Jan 28 2022, 9:38 AM

I agree with nikic and ebrevnov in that it is better to avoid converting more selects to branches in SimplifyCFG since they might block other IR-level optimizations.

In general, I think we need to do the inverse. Flatten the CFG more aggressively, allow a simpler control flow that will facilitate IR-level optimizations and then at the end of middle-end make the decision on whether to convert the selects to branches.
I have an RFC on cmov/branch decision-making that proposes extending the logic in CodeGenPrepare or having a pass just before it (essentially at the end of the middle-end).
In the current proposal I did not change SimplifyCFG, but in some preliminary experiments, preventing SimplifyCFG from making such decisions and deferring them until later further improves performance.
Besides, isn't SimplifyCFG supposed to be a canonicalization pass that enables others, rather than an optimization pass?

In general, I think we need to do the inverse. Flatten the CFG more aggressively, allow a simpler control flow that will facilitate IR-level optimizations

But you can expect perf problems due to missing optimizations with select over phi.

https://groups.google.com/g/llvm-dev/c/VcJnLMI7Deg/m/TshM3BkHCAAJ

(Or maybe GVN is now smarter..)

But you can expect perf problems due to missing optimizations with select over phi.

https://groups.google.com/g/llvm-dev/c/VcJnLMI7Deg/m/TshM3BkHCAAJ

(Or maybe GVN is now smarter..)

Thanks for pointing this out. I guess it is debatable what the canonical form should be (selects or phis).
I would assume that the current canonical form is the one that enables the most subsequent optimizations.
I thought that this was selects (but I'm not sure). Although some passes would prefer phis (perhaps GVN), there are still a lot of others that prefer selects. So some regressions seem currently unavoidable, but overall I would hope that one form is better.

It would be great for others to weigh in on what the current state is and whether selects or phis are more canonical (i.e., on average bigger enablers).

Besides this debate, shouldn't SimplifyCFG convert to this canonical form (whichever is better) and not try to optimize on its own? It seems hacky to force SimplifyCFG to make such decisions, but maybe if neither form is significantly better then this ad-hoc decision-making helps overall.

Things may improve with https://reviews.llvm.org/D118143

cc @fhahn

In general, I believe we should canonicalize to select, fix missing transforms where it makes sense, then treat optimal branch vs cmov lowering as a backend problem.

(Quick comment, may be missing context in the discussion which is important, don't let me derail if so.)

apostolakis mentioned this in D120230: [SelectOpti][1/5] Setup new select-optimize pass. Mar 9 2022, 4:53 PM

Matt added a subscriber: Matt. Jun 1 2022, 4:28 PM

Herald added a project: Restricted Project. Jun 1 2022, 4:28 PM

lebedev.ri abandoned this revision. Oct 18 2022, 5:46 PM

Revision Contents

Path                                                                               Size
llvm/lib/Transforms/Utils/SimplifyCFG.cpp                                          61 lines
llvm/test/Transforms/PGOProfile/chr.ll                                             28 lines
llvm/test/Transforms/SimplifyCFG/fold-two-entry-phi-node-with-one-block-profmd.ll  6 lines
llvm/test/Transforms/SimplifyCFG/speculatively-execute-block-profmd.ll             5 lines

Diff 402626

llvm/lib/Transforms/Utils/SimplifyCFG.cpp

@@ ... @@ for (PHINode &PN : EndBB->phis()) {
     ++SpeculatedInstructions;
     if (SpeculatedInstructions > 1)
       return false;
   }
 
   return HaveRewritablePHIs;
 }
 
+// Check if the branch is non-unpredictable, and has a predictable behaviour.
+static bool IsBranchPredictable(BranchInst *BI,
+                                const TargetTransformInfo &TTI) {
+  if (BI->getMetadata(LLVMContext::MD_unpredictable))
+    return false;
+
+  uint64_t TWeight, FWeight;
+  if (!BI->extractProfMetadata(TWeight, FWeight) || (TWeight + FWeight) == 0)
+    return false;
+
+  BranchProbability BITrueProb =
+      BranchProbability::getBranchProbability(TWeight, TWeight + FWeight);
+  BranchProbability BIFalseProb = BITrueProb.getCompl();
+
+  BranchProbability Likely = TTI.getPredictableBranchThreshold();
+  return BITrueProb >= Likely || BIFalseProb >= Likely;
+}
+
 /// Speculate a conditional basic block flattening the CFG.
 ///
 /// Note that this is a very risky transform currently. Speculating
 /// instructions like this is most often not desirable. Instead, there is an MI
 /// pass which can do it with full awareness of the resource constraints.
 /// However, some cases are "obvious" and we should do directly. An example of
 /// this is speculating a single, reasonably cheap instruction.
 ///
@@ ... @@ bool SimplifyCFGOpt::SpeculativelyExecuteBB(BranchInst *BI, BasicBlock *ThenBB,
   // to swap the select operands later.
   bool Invert = false;
   if (ThenBB != BI->getSuccessor(0)) {
     assert(ThenBB == BI->getSuccessor(1) && "No edge from 'if' block?");
     Invert = true;
   }
   assert(EndBB == BI->getSuccessor(!Invert) && "No edge from to end block");
 
-  // If the branch is non-unpredictable, and is predicted to not branch to
-  // the `then` block, then avoid speculating it.
-  if (!BI->getMetadata(LLVMContext::MD_unpredictable)) {
-    uint64_t TWeight, FWeight;
-    if (BI->extractProfMetadata(TWeight, FWeight) && (TWeight + FWeight) != 0) {
-      uint64_t EndWeight = Invert ? TWeight : FWeight;
-      BranchProbability BIEndProb =
-          BranchProbability::getBranchProbability(EndWeight, TWeight + FWeight);
-      BranchProbability Likely = TTI.getPredictableBranchThreshold();
-      if (BIEndProb >= Likely)
-        return false;
-    }
-  }
+  // Avoid speculating predictable branches.
+  if (IsBranchPredictable(BI, TTI))
+    return false;
 
   // Keep a count of how many times instructions are used within ThenBB when
   // they are candidates for sinking into ThenBB. Specifically:
   // - They are defined in BB, and
   // - They have no side effects, and
   // - All of their uses are in ThenBB.
   SmallDenseMap<Instruction *, unsigned, 4> SinkCandidateUseCounts;

@@ ... @@ static bool FoldTwoEntryPHINode(PHINode *PN, const TargetTransformInfo &TTI,
   SmallVector<BasicBlock *, 2> IfBlocks;
   llvm::copy_if(
       PN->blocks(), std::back_inserter(IfBlocks), [](BasicBlock *IfBlock) {
         return cast<BranchInst>(IfBlock->getTerminator())->isUnconditional();
       });
   assert((IfBlocks.size() == 1 || IfBlocks.size() == 2) &&
          "Will have either one or two blocks to speculate.");
 
-  // If the branch is non-unpredictable, see if we either predictably jump to
-  // the merge bb (if we have only a single 'then' block), or if we predictably
-  // jump to one specific 'then' block (if we have two of them).
-  // It isn't beneficial to speculatively execute the code
-  // from the block that we know is predictably not entered.
-  if (!DomBI->getMetadata(LLVMContext::MD_unpredictable)) {
-    uint64_t TWeight, FWeight;
-    if (DomBI->extractProfMetadata(TWeight, FWeight) &&
-        (TWeight + FWeight) != 0) {
-      BranchProbability BITrueProb =
-          BranchProbability::getBranchProbability(TWeight, TWeight + FWeight);
-      BranchProbability Likely = TTI.getPredictableBranchThreshold();
-      BranchProbability BIFalseProb = BITrueProb.getCompl();
-      if (IfBlocks.size() == 1) {
-        BranchProbability BIBBProb =
-            DomBI->getSuccessor(0) == BB ? BITrueProb : BIFalseProb;
-        if (BIBBProb >= Likely)
-          return false;
-      } else {
-        if (BITrueProb >= Likely || BIFalseProb >= Likely)
-          return false;
-      }
-    }
-  }
+  // Avoid speculating predictable branches.
+  if (IsBranchPredictable(DomBI, TTI))
+    return false;
 
   // Don't try to fold an unreachable block. For example, the phi node itself
   // can't be the candidate if-condition for a select that we want to form.
   if (auto *IfCondPhiInst = dyn_cast<PHINode>(IfCond))
     if (IfCondPhiInst->getParent() == BB)
       return false;
 
   // Okay, we found that we can merge this two-entry phi node into a select.

llvm/test/Transforms/PGOProfile/chr.ll

@@ ... @@
 ; CHECK-NEXT: [[TMP8:%.*]] = add i32 [[SUM0]], 42
 ; CHECK-NEXT: [[SUM1_NONCHR:%.*]] = select i1 [[TMP7]], i32 [[SUM0]], i32 [[TMP8]], !prof [[PROF16]]
 ; CHECK-NEXT: [[TMP9:%.*]] = and i32 [[TMP0]], 2
 ; CHECK-NEXT: [[TMP10:%.*]] = icmp eq i32 [[TMP9]], 0
 ; CHECK-NEXT: [[TMP11:%.*]] = add i32 [[SUM1_NONCHR]], 43
 ; CHECK-NEXT: [[SUM2_NONCHR:%.*]] = select i1 [[TMP10]], i32 [[SUM1_NONCHR]], i32 [[TMP11]], !prof [[PROF16]]
 ; CHECK-NEXT: [[TMP12:%.*]] = and i32 [[TMP0]], 4
 ; CHECK-NEXT: [[TMP13:%.*]] = icmp eq i32 [[TMP12]], 0
+; CHECK-NEXT: br i1 [[TMP13]], label [[BB3]], label [[BB1_NONCHR:%.*]], !prof [[PROF16]]
+; CHECK: bb1.nonchr:
 ; CHECK-NEXT: [[TMP14:%.*]] = and i32 [[TMP0]], 8
 ; CHECK-NEXT: [[TMP15:%.*]] = icmp eq i32 [[TMP14]], 0
-; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[TMP15]], i32 44, i32 88
+; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[TMP15]], i32 44, i32 88, !prof [[PROF16]]
 ; CHECK-NEXT: [[SUM4_NONCHR:%.*]] = add i32 [[SUM2_NONCHR]], [[SUM4_NONCHR_V]]
-; CHECK-NEXT: [[SUM5_NONCHR:%.*]] = select i1 [[TMP13]], i32 [[SUM2_NONCHR]], i32 [[SUM4_NONCHR]], !prof [[PROF16]]
 ; CHECK-NEXT: br label [[BB3]]
 ; CHECK: bb3:
-; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[TMP4]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM5_NONCHR]], [[BB0_NONCHR]] ]
+; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[TMP4]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM2_NONCHR]], [[BB0_NONCHR]] ], [ [[SUM4_NONCHR]], [[BB1_NONCHR]] ]
 ; CHECK-NEXT: ret i32 [[SUM6]]
 ;
 entry:
   %0 = load i32, i32* %i
   %1 = and i32 %0, 255
   %2 = icmp eq i32 %1, 0
   br i1 %2, label %bb3, label %bb0, !prof !15

@@ ... @@
 ; CHECK-NEXT: [[TMP11:%.*]] = add i32 [[SUM0]], 42
 ; CHECK-NEXT: [[SUM1_NONCHR:%.*]] = select i1 [[TMP10]], i32 [[SUM0]], i32 [[TMP11]], !prof [[PROF16]]
 ; CHECK-NEXT: [[TMP12:%.*]] = and i32 [[TMP0]], 2
 ; CHECK-NEXT: [[TMP13:%.*]] = icmp eq i32 [[TMP12]], 0
 ; CHECK-NEXT: [[TMP14:%.*]] = add i32 [[SUM1_NONCHR]], 43
 ; CHECK-NEXT: [[SUM2_NONCHR:%.*]] = select i1 [[TMP13]], i32 [[SUM1_NONCHR]], i32 [[TMP14]], !prof [[PROF16]]
 ; CHECK-NEXT: [[TMP15:%.*]] = and i32 [[SUM0]], 4
 ; CHECK-NEXT: [[TMP16:%.*]] = icmp eq i32 [[TMP15]], 0
+; CHECK-NEXT: br i1 [[TMP16]], label [[BB3]], label [[BB1_NONCHR:%.*]], !prof [[PROF16]]
+; CHECK: bb1.nonchr:
 ; CHECK-NEXT: [[TMP17:%.*]] = and i32 [[TMP0]], 8
 ; CHECK-NEXT: [[TMP18:%.*]] = icmp eq i32 [[TMP17]], 0
-; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[TMP18]], i32 44, i32 88
+; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[TMP18]], i32 44, i32 88, !prof [[PROF16]]
 ; CHECK-NEXT: [[SUM4_NONCHR:%.*]] = add i32 [[SUM2_NONCHR]], [[SUM4_NONCHR_V]]
-; CHECK-NEXT: [[SUM5_NONCHR:%.*]] = select i1 [[TMP16]], i32 [[SUM2_NONCHR]], i32 [[SUM4_NONCHR]], !prof [[PROF16]]
 ; CHECK-NEXT: br label [[BB3]]
 ; CHECK: bb3:
-; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[TMP7]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM5_NONCHR]], [[BB0_NONCHR]] ]
+; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[TMP7]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM2_NONCHR]], [[BB0_NONCHR]] ], [ [[SUM4_NONCHR]], [[BB1_NONCHR]] ]
 ; CHECK-NEXT: ret i32 [[SUM6]]
 ;
 entry:
   %0 = load i32, i32* %i
   %1 = and i32 %0, 255
   %2 = icmp eq i32 %1, 0
   br i1 %2, label %bb3, label %bb0, !prof !15

@@ ... @@
 ; CHECK-NEXT: br i1 [[V2_NOT]], label [[BB3]], label [[BB0_NONCHR:%.*]], !prof [[PROF16]]
 ; CHECK: bb0.nonchr:
 ; CHECK-NEXT: [[V3_NONCHR:%.*]] = and i32 [[I0]], 2
 ; CHECK-NEXT: [[V4_NONCHR:%.*]] = icmp eq i32 [[V3_NONCHR]], 0
 ; CHECK-NEXT: [[V8_NONCHR:%.*]] = add i32 [[SUM0]], 43
 ; CHECK-NEXT: [[SUM2_NONCHR:%.*]] = select i1 [[V4_NONCHR]], i32 [[SUM0]], i32 [[V8_NONCHR]], !prof [[PROF16]]
 ; CHECK-NEXT: [[V9_NONCHR:%.*]] = and i32 [[J0]], 4
 ; CHECK-NEXT: [[V10_NONCHR:%.*]] = icmp eq i32 [[V9_NONCHR]], 0
+; CHECK-NEXT: br i1 [[V10_NONCHR]], label [[BB3]], label [[BB1_NONCHR:%.*]], !prof [[PROF16]]
+; CHECK: bb1.nonchr:
 ; CHECK-NEXT: [[V11_NONCHR:%.*]] = and i32 [[I0]], 8
 ; CHECK-NEXT: [[V12_NONCHR:%.*]] = icmp eq i32 [[V11_NONCHR]], 0
-; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[V12_NONCHR]], i32 44, i32 88
+; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[V12_NONCHR]], i32 44, i32 88, !prof [[PROF16]]
 ; CHECK-NEXT: [[SUM4_NONCHR:%.*]] = add i32 [[SUM2_NONCHR]], [[SUM4_NONCHR_V]]
-; CHECK-NEXT: [[SUM5_NONCHR:%.*]] = select i1 [[V10_NONCHR]], i32 [[SUM2_NONCHR]], i32 [[SUM4_NONCHR]], !prof [[PROF16]]
 ; CHECK-NEXT: br label [[BB3]]
 ; CHECK: bb3:
-; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[V13]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM5_NONCHR]], [[BB0_NONCHR]] ]
+; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[V13]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM2_NONCHR]], [[BB0_NONCHR]] ], [ [[SUM4_NONCHR]], [[BB1_NONCHR]] ]
 ; CHECK-NEXT: ret i32 [[SUM6]]
 ;
 entry:
   %i0 = load i32, i32* %i
   %j0 = load i32, i32* %j
   %v1 = and i32 %i0, 255
   %v2 = icmp eq i32 %v1, 0
   br i1 %v2, label %bb3, label %bb0, !prof !15

@@ ... @@
 ; CHECK-NEXT: [[TMP5:%.*]] = and i32 [[TMP0]], 255
 ; CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i32 [[TMP5]], 0
 ; CHECK-NEXT: br i1 [[DOTNOT]], label [[BB3]], label [[BB0_NONCHR:%.*]], !prof [[PROF16]]
 ; CHECK: bb0.nonchr:
 ; CHECK-NEXT: [[TMP6:%.*]] = and i32 [[TMP0]], 1
 ; CHECK-NEXT: [[TMP7:%.*]] = icmp eq i32 [[TMP6]], 0
 ; CHECK-NEXT: [[TMP8:%.*]] = add i32 [[SUM0]], 85
 ; CHECK-NEXT: [[SUM2_NONCHR:%.*]] = select i1 [[TMP7]], i32 [[SUM0]], i32 [[TMP8]], !prof [[PROF16]]
+; CHECK-NEXT: br i1 [[TMP7]], label [[BB3]], label [[BB1_NONCHR:%.*]], !prof [[PROF16]]
+; CHECK: bb1.nonchr:
 ; CHECK-NEXT: [[TMP9:%.*]] = and i32 [[TMP0]], 8
 ; CHECK-NEXT: [[TMP10:%.*]] = icmp eq i32 [[TMP9]], 0
-; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[TMP10]], i32 44, i32 88
+; CHECK-NEXT: [[SUM4_NONCHR_V:%.*]] = select i1 [[TMP10]], i32 44, i32 88, !prof [[PROF16]]
 ; CHECK-NEXT: [[SUM4_NONCHR:%.*]] = add i32 [[SUM2_NONCHR]], [[SUM4_NONCHR_V]]
-; CHECK-NEXT: [[SUM5_NONCHR:%.*]] = select i1 [[TMP7]], i32 [[SUM2_NONCHR]], i32 [[SUM4_NONCHR]], !prof [[PROF16]]
 ; CHECK-NEXT: br label [[BB3]]
 ; CHECK: bb3:
-; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[TMP4]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM5_NONCHR]], [[BB0_NONCHR]] ]
+; CHECK-NEXT: [[SUM6:%.*]] = phi i32 [ [[TMP4]], [[BB0]] ], [ [[SUM0]], [[ENTRY_SPLIT_NONCHR]] ], [ [[SUM2_NONCHR]], [[BB0_NONCHR]] ], [ [[SUM4_NONCHR]], [[BB1_NONCHR]] ]
 ; CHECK-NEXT: ret i32 [[SUM6]]
 ;
 entry:
   %0 = load i32, i32* %i
   %1 = and i32 %0, 255
   %2 = icmp eq i32 %1, 0
   br i1 %2, label %bb3, label %bb0, !prof !15

llvm/test/Transforms/SimplifyCFG/fold-two-entry-phi-node-with-one-block-profmd.ll

@@ ... @@ end:
   ret i32 %res
 }
 
 define i32 @predictably_taken(i32 %a, i32 %b, i32 %c, i32 %d) {
 ; CHECK-LABEL: @predictably_taken(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: call void @sideeffect0()
 ; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[A:%.*]], [[B:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[COND_TRUE:%.*]], label [[END:%.*]], !prof [[PROF0:![0-9]+]]
+; CHECK: cond.true:
 ; CHECK-NEXT: [[V0:%.*]] = add i32 [[C:%.*]], [[D:%.*]]
-; CHECK-NEXT: [[RES:%.*]] = select i1 [[CMP]], i32 [[V0]], i32 0, !prof [[PROF0:![0-9]+]]
+; CHECK-NEXT: br label [[END]]
+; CHECK: end:
+; CHECK-NEXT: [[RES:%.*]] = phi i32 [ [[V0]], [[COND_TRUE]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT: call void @sideeffect1()
 ; CHECK-NEXT: ret i32 [[RES]]
 ;
 entry:
   call void @sideeffect0()
   %cmp = icmp eq i32 %a, %b
   br i1 %cmp, label %cond.true, label %end, !prof !0 ; likely branches to %cond.true

llvm/test/Transforms/SimplifyCFG/speculatively-execute-block-profmd.ll

@@ ... @@
 define i32 @predictably_taken(i1 %c, i32 %a, i32 %b) {
 ; CHECK-LABEL: @predictably_taken(
 ; CHECK-NEXT: entry:
 ; CHECK-NEXT: call void @sideeffect0()
 ; CHECK-NEXT: br i1 [[C:%.*]], label [[DISPATCH:%.*]], label [[END:%.*]]
 ; CHECK: dispatch:
 ; CHECK-NEXT: call void @sideeffect1()
 ; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[A:%.*]], [[B:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[COND_TRUE:%.*]], label [[END]], !prof [[PROF0:![0-9]+]]
+; CHECK: cond.true:
 ; CHECK-NEXT: [[VAL:%.*]] = add i32 [[A]], [[B]]
-; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 [[CMP]], i32 [[VAL]], i32 0, !prof [[PROF0:![0-9]+]]
 ; CHECK-NEXT: br label [[END]]
 ; CHECK: end:
-; CHECK-NEXT: [[RES:%.*]] = phi i32 [ -1, [[ENTRY:%.*]] ], [ [[SPEC_SELECT]], [[DISPATCH]] ]
+; CHECK-NEXT: [[RES:%.*]] = phi i32 [ -1, [[ENTRY:%.*]] ], [ 0, [[DISPATCH]] ], [ [[VAL]], [[COND_TRUE]] ]
 ; CHECK-NEXT: call void @sideeffect2()
 ; CHECK-NEXT: ret i32 [[RES]]
 ;
 entry:
   call void @sideeffect0()
   br i1 %c, label %dispatch, label %end
 
 dispatch: