This is an archive of the discontinued LLVM Phabricator instance.

Differential D98898

[SimplifyCFG] use profile metadata to refine merging branch conditions
ClosedPublic

Authored by spatel on Mar 18 2021, 2:22 PM.


Details

Reviewers

Carrot
lebedev.ri
pengfei
craig.topper
RKSimon

Commits

rG1bf8f9e22854: [SimplifyCFG] use profile metadata to refine merging branch conditions
rG27ae17a6b014: [SimplifyCFG] use profile metadata to refine merging branch conditions

Summary

This is one step towards solving:
https://llvm.org/PR49336

In that example, we disregard the recommended usage of __builtin_expect, so an expensive (unpredictable) branch is folded into another branch that is guarding it.
Here, before merging the conditions with logic ops, we read the profile metadata to check whether the 1st (predecessor) condition is likely to cause execution to bypass the 2nd (successor) condition.

Part of this patch is moving the Likely/Unlikely variables to make them visible to SimplifyCFG. We could do that as a preliminary step (if I got that right).
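
For readers without the bug report at hand, the shape of the problem can be sketched in C++ roughly as follows. This is only an illustration modeled on the `or_icmps_harmful` test in this revision, not the code from PR49336; `expensiveCondition`, `skipStore`, `demo`, and the evaluation counter are hypothetical names. The counter exists only to make the short-circuit observable; in the real case the second condition is a side-effect-free compare, which is why the optimizer is free to speculate it.

```cpp
#include <cassert>

// Counts evaluations of the second condition so the short-circuit is visible.
static int ExpensiveEvaluations = 0;

// Hypothetical stand-in for the expensive (unpredictable) second condition.
static bool expensiveCondition(int Y) {
  ++ExpensiveEvaluations;
  return Y == 0;
}

// Returns true when the store should be skipped. The first condition is
// annotated as expected-true, so the second should rarely be evaluated.
bool skipStore(int X, int Y) {
  if (__builtin_expect(X > -1, 1))
    return true;
  return expensiveCondition(Y);
}

// Usage: on the expected path the guarded condition is never evaluated.
void demo() {
  ExpensiveEvaluations = 0;
  assert(skipStore(5, 7));
  assert(ExpensiveEvaluations == 0);
}
```

Folding the two conditions into a single `or` would compute the second compare on every call, which is what the annotation was meant to avoid.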

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

spatel created this revision. Mar 18 2021, 2:22 PM

Herald added subscribers: wenlei, hiraditya, mcrosier. Mar 18 2021, 2:22 PM

spatel requested review of this revision. Mar 18 2021, 2:22 PM

Herald added a project: Restricted Project. Mar 18 2021, 2:22 PM

Yep, please do split off the prep patch.
I think there is some other place that duplicates those constants, maybe in clang?
I will look over this in detail later..

Harbormaster completed remote builds in B94553: Diff 331687. Mar 18 2021, 3:01 PM

spatel mentioned this in D98945: [BranchProbability] move options for 'likely' and 'unlikely'. Mar 19 2021, 5:52 AM

Updated:
Rebased on top of D98945 (move the branch prob options).

spatel added a parent revision: D98945: [BranchProbability] move options for 'likely' and 'unlikely'. Mar 19 2021, 6:06 AM

Harbormaster completed remote builds in B94684: Diff 331847. Mar 19 2021, 6:51 AM

spatel mentioned this in rGee8b53815ddf: [BranchProbability] move options for 'likely' and 'unlikely'. Mar 20 2021, 11:49 AM

Can prof metadata and MD_unpredictable be mixed together?
What should we do here in presence of MD_unpredictable?

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2844–2846	... and check that it wouldn't be an obviously unprofitable thing to do as per the prof metadata.
2862–2863	And now that i actually start reviewing this, i see what was irking me. While it seems like we should be using the weights from `LowerExpectIntrinsic`, why is that the right threshold for profile-driven weights? There's `TLI->getPredictableBranchThreshold()`, shouldn't we use that? Because the `LikelyBranchWeight`/`UnlikelyBranchWeight` appears to only be used in `CodeGenFunction.cpp` and `LowerExpectIntrinsic.cpp`, none of the transforms use them, unlike `TLI->getPredictableBranchThreshold()`. Which to me looks like that they are `LowerExpectIntrinsic`'s implementation detail that is unintentionally overexposed.

In D98898#2640239, @lebedev.ri wrote:

Can prof metadata and MD_unpredictable be mixed together?
What should we do here in presence of MD_unpredictable?

They can be mixed, but I don't see any precedent for dealing with them simultaneously. So we get to decide here :) -- or in a follow-up standalone patch to make it more definitive.
In theory, either of these metadata could be attached manually by a programmer or from actual training data / profile-guided optimization (PGO).
I'm leaning towards having unpredictable be the winner in that case because branch weights are good, but they can't really tell us how a given target will behave since we can't model even basic branch history hardware here. So we have to defer to whoever/whatever said a branch was unpredictable as true/correct.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2862–2863	Yes, I thought about that too, and I don't have a good reason for using likely/unlikely. I think we will want to move the threshold API to TTI (rather than TLI) if we go with that setting, so we're not making an optimizer pass depend on TLI unnecessarily.

lebedev.ri added inline comments. Mar 21 2021, 12:17 PM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2862–2863	I'll revert D98945 along with fixing rG08196e0b2e1f8aaa8a854585335c17ba479114df in a moment.

In D98898#2640287, @spatel wrote:

In D98898#2640239, @lebedev.ri wrote:

Can prof metadata and MD_unpredictable be mixed together?
What should we do here in presence of MD_unpredictable?

They can be mixed, but I don't see any precedent for dealing with them simultaneously. So we get to decide here :) -- or in a follow-up standalone patch to make it more definitive.
In theory, either of these metadata could be attached manually by a programmer or from actual training data / profile-guided optimization (PGO).
I'm leaning towards having unpredictable be the winner in that case because branch weights are good, but they can't really tell us how a given target will behave since we can't model even basic branch history hardware here. So we have to defer to whoever/whatever said a branch was unpredictable as true/correct.

I also believe unpredictable should win.
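
A rough sketch of the "unpredictable wins" policy discussed above, written outside LLVM so it can be compiled on its own. This was only a proposal in the thread and is not part of this patch; `BranchHints`, `isPredictable`, and the percentage-based threshold are hypothetical names and simplifications.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical summary of the metadata attached to one conditional branch.
struct BranchHints {
  bool Unpredictable;   // branch carries "unpredictable" metadata
  bool HasWeights;      // branch carries branch weights
  uint64_t TrueWeight;
  uint64_t FalseWeight;
};

// Returns true only if the weights say one side is taken at least
// ThresholdPercent of the time AND nothing marked the branch unpredictable.
// Weights are assumed to fit in 32 bits, so the products cannot overflow.
bool isPredictable(const BranchHints &H, unsigned ThresholdPercent) {
  assert(ThresholdPercent <= 100 && "threshold is a percentage");
  if (H.Unpredictable)
    return false; // The explicit annotation overrides any weights.
  if (!H.HasWeights)
    return false;
  const uint64_t Total = H.TrueWeight + H.FalseWeight;
  if (Total == 0)
    return false;
  const uint64_t Hot = std::max(H.TrueWeight, H.FalseWeight);
  return Hot * 100 >= uint64_t(ThresholdPercent) * Total;
}
```

With this ordering, a branch with weights of 2000:1 counts as predictable unless it is also marked unpredictable, in which case the weights are ignored.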

lebedev.ri mentioned this in rG37d6be90524c: Revert "[BranchProbability] move options for 'likely' and 'unlikely'". Mar 21 2021, 12:51 PM

lebedev.ri mentioned this in rGe3a470162738: [clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of….

(done in rGe3a470162738871bba982416748ae5f5e3572947)

lebedev.ri requested changes to this revision. Mar 22 2021, 7:41 AM

This revision now requires changes to proceed. Mar 22 2021, 7:41 AM

spatel mentioned this in rG664d0c052c31: [TargetTransformInfo] move branch probability query from TargetLoweringInfo. Mar 22 2021, 12:56 PM

spatel mentioned this in rGc21016715f0e: [SimplifyCFG] adjust test branchweights; NFC.

Patch updated:
Use TTI query to decide when a branch is predictable.
I changed the TLI query to TTI as a preliminary commit and adjusted the test metadata to match.
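
The updated check can be modeled outside LLVM roughly as follows. This is a simplified sketch and not the patch itself: block ids stand in for `BasicBlock` pointers, the threshold is passed as a percentage instead of coming from the TTI query, and `Glue`, `FoldRecipe`, and `shouldFold` are hypothetical names. The value 99 used in the usage note below is an assumed threshold, not something this page confirms.

```cpp
#include <cassert>
#include <cstdint>

enum class Glue { Or, And };

// Result of the check: whether to fold, which logic op joins the conditions,
// and whether the predecessor condition must be inverted first.
struct FoldRecipe {
  bool Fold;
  Glue Op;
  bool InvertPred;
};

// PredSucc and Succ hold block ids for the {true, false} successors of the
// predecessor branch and the successor branch. Weights are assumed to fit in
// 32 bits, so the 64-bit products below cannot overflow.
FoldRecipe shouldFold(const int PredSucc[2], const int Succ[2], bool HasWeights,
                      uint64_t TrueW, uint64_t FalseW,
                      unsigned ThresholdPercent) {
  assert(ThresholdPercent <= 100 && "threshold is a percentage");
  const uint64_t Total = TrueW + FalseW;
  const bool Known = HasWeights && Total != 0;
  // Tests W / Total >= ThresholdPercent / 100 without floating point.
  auto Predictable = [&](uint64_t W) {
    return Known && W * 100 >= uint64_t(ThresholdPercent) * Total;
  };
  const FoldRecipe NoFold = {false, Glue::Or, false};

  // Speculate the 2nd condition unless the 1st is probably true/false.
  if (PredSucc[0] == Succ[0])
    return Predictable(TrueW) ? NoFold : FoldRecipe{true, Glue::Or, false};
  if (PredSucc[1] == Succ[1])
    return Predictable(FalseW) ? NoFold : FoldRecipe{true, Glue::And, false};
  if (PredSucc[0] == Succ[1])
    return Predictable(TrueW) ? NoFold : FoldRecipe{true, Glue::And, true};
  if (PredSucc[1] == Succ[0])
    return Predictable(FalseW) ? NoFold : FoldRecipe{true, Glue::Or, true};
  return NoFold;
}
```

For the CFG shape of `or_icmps_harmful` (predecessor successors {exit, rare}, successor branch {exit, false}), weights of 2000:1 with an assumed 99% threshold give no fold, while 1:1 or missing weights give an `or` with no inversion.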

I think we should add the unpredictable override as a follow-up, so I have not added that yet, but if the consensus is to add it here, I can do that.

LGTM, thanks.

In D98898#2642445, @spatel wrote:

I think we should add the unpredictable override as a follow-up, so I have not added that yet, but if the consensus is to add it here, I can do that.

I think that's fine, but before that we should probably document it in LangRef.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2857

This revision is now accepted and ready to land. Mar 22 2021, 1:18 PM

This revision was landed with ongoing or failed builds. Mar 22 2021, 1:49 PM

Closed by commit rG27ae17a6b014: [SimplifyCFG] use profile metadata to refine merging branch conditions (authored by spatel).

This revision was automatically updated to reflect the committed changes.

spatel added a commit: rG27ae17a6b014: [SimplifyCFG] use profile metadata to refine merging branch conditions.

Harbormaster completed remote builds in B95077: Diff 332406. Mar 22 2021, 2:19 PM

spatel added a reverting change: rG95f7f7c21b47: Revert "[SimplifyCFG] use profile metadata to refine merging branch conditions". Mar 22 2021, 2:48 PM

lebedev.ri added inline comments. Mar 22 2021, 2:55 PM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
3009–3012	`FoldBranchToCommonDest()` might be called from outside of SimplifyCFG (it is called from loopsimplify), without passing-in TTI, i guess.

This one looks broken by the patch https://lab.llvm.org/buildbot/#/builders/77/builds/4834

Oh, I see it's already reverted. Thank you!

spatel added inline comments. Mar 23 2021, 4:55 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
3009–3012	Yep, that was it. I was able to deduce a test for it without going through the stage2 failure. I'll add a test for `-loop-simplify` and try again. This does raise a question: is it intentional that a pass is ignoring the metadata? Does that mean we may lose the information despite making a change for SimplifyCFG?
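
One way to tolerate a caller that passes no target info is to treat a missing pointer as "no opinion" and skip the profile check. The sketch below illustrates that idea outside LLVM; it is an assumption about the kind of guard needed, not the fix that was committed, and `TargetInfo` and `takenIsPredictable` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for TargetTransformInfo; only the threshold matters.
struct TargetInfo {
  unsigned PredictableThresholdPercent;
};

// Returns true if the taken side of the branch meets the target's threshold.
// A null TTI (e.g. a caller outside SimplifyCFG) never reports "predictable",
// so such callers keep the old always-fold behavior instead of crashing.
bool takenIsPredictable(const TargetInfo *TTI, uint64_t TakenW,
                        uint64_t OtherW) {
  if (!TTI)
    return false;
  assert(TTI->PredictableThresholdPercent <= 100 && "threshold is a percentage");
  const uint64_t Total = TakenW + OtherW;
  if (Total == 0)
    return false;
  // Weights are assumed to fit in 32 bits, so this cannot overflow.
  return TakenW * 100 >= uint64_t(TTI->PredictableThresholdPercent) * Total;
}
```

The cost of this choice is the one raised in the comment above: a caller without TTI silently ignores the metadata.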

spatel added a commit: rG1bf8f9e22854: [SimplifyCFG] use profile metadata to refine merging branch conditions. Mar 23 2021, 7:21 AM

spatel mentioned this in D100213: [PassManager][PhaseOrdering] lower expects before running simplifyCFG. Apr 9 2021, 9:32 AM

spatel mentioned this in rG330619a3a623: [PassManager][PhaseOrdering] lower expects before running simplifyCFG. Apr 12 2021, 9:24 AM

spatel mentioned this in rG661cc71a1c50: [PassManager][PhaseOrdering] lower expects before running simplifyCFG. Apr 12 2021, 12:08 PM

MatzeB mentioned this in D158642: LoopUnrollRuntime: Add weights to all branches. Aug 23 2023, 11:09 AM

Revision Contents

Path                                                         Size
llvm/lib/Transforms/Utils/SimplifyCFG.cpp                    59 lines
llvm/test/Transforms/PGOProfile/chr.ll                       14 lines
llvm/test/Transforms/SimplifyCFG/preserve-branchweights.ll   60 lines

Diff 332413

llvm/lib/Transforms/Utils/SimplifyCFG.cpp

[57 lines hidden]
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/PatternMatch.h"
 #include "llvm/IR/PseudoProbe.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Use.h"
 #include "llvm/IR/User.h"
 #include "llvm/IR/Value.h"
 #include "llvm/IR/ValueHandle.h"
+#include "llvm/Support/BranchProbability.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/KnownBits.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
[2,761 lines hidden; context: if (PredHasWeights || SuccHasWeights) {]
     if (!SuccHasWeights)
       SuccTrueWeight = SuccFalseWeight = 1;
     return true;
   } else {
     return false;
   }
 }

-// Determine if the two branches share a common destination,
-// and deduce a glue that we need to use to join branch's conditions
-// to arrive at the common destination.
+/// Determine if the two branches share a common destination and deduce a glue
+/// that joins branch's conditions to arrive at the common destination if that
+/// would be profitable.

Inline comment (lebedev.ri, Done):
... and check that it wouldn't be an obviously unprofitable thing to do as per the prof metadata.

 static Optional<std::pair<Instruction::BinaryOps, bool>>
-CheckIfCondBranchesShareCommonDestination(BranchInst *BI, BranchInst *PBI) {
+shouldFoldCondBranchesToCommonDestination(BranchInst *BI, BranchInst *PBI,
+                                          const TargetTransformInfo *TTI) {
   assert(BI && PBI && BI->isConditional() && PBI->isConditional() &&
          "Both blocks must end with a conditional branches.");
   assert(is_contained(predecessors(BI->getParent()), PBI->getParent()) &&
          "PredBB must be a predecessor of BB.");
-  if (PBI->getSuccessor(0) == BI->getSuccessor(0))
-    return {{Instruction::Or, false}};
-  else if (PBI->getSuccessor(1) == BI->getSuccessor(1))
-    return {{Instruction::And, false}};
-  else if (PBI->getSuccessor(0) == BI->getSuccessor(1))
-    return {{Instruction::And, true}};
-  else if (PBI->getSuccessor(1) == BI->getSuccessor(0))
-    return {{Instruction::Or, true}};
+  // We have the potential to fold the conditions together, but if the
+  // predecessor branch is predictable, we may not want to merge them.
+  uint64_t PTWeight, PFWeight;

Inline comment (lebedev.ri, Not Done), suggested edit:
   // predecessor branch is predictable, we may not want to merge them.
-  uint64_t TWeight, FWeight;
+  uint64_t PTWeight, PFWeight;
   BranchProbability PBITrueProb, Likely;

+  BranchProbability PBITrueProb, Likely;
+  if (PBI->extractProfMetadata(PTWeight, PFWeight) &&
+      (PTWeight + PFWeight) != 0) {
+    PBITrueProb =
+        BranchProbability::getBranchProbability(PTWeight, PTWeight + PFWeight);
+    Likely = TTI->getPredictableBranchThreshold();

Inline comment (lebedev.ri, Done):
And now that i actually start reviewing this, i see what was irking me. While it seems like we should be using the weights from LowerExpectIntrinsic, why is that the right threshold for profile-driven weights?
There's TLI->getPredictableBranchThreshold(), shouldn't we use that?
Because the LikelyBranchWeight/UnlikelyBranchWeight appears to only be used in CodeGenFunction.cpp and LowerExpectIntrinsic.cpp, none of the transforms use them, unlike TLI->getPredictableBranchThreshold(). Which to me looks like that they are LowerExpectIntrinsic's implementation detail that is unintentionally overexposed.

Inline comment (spatel, author, Done):
Yes, I thought about that too, and I don't have a good reason for using likely/unlikely. I think we will want to move the threshold API to TTI (rather than TLI) if we go with that setting, so we're not making an optimizer pass depend on TLI unnecessarily.

Inline comment (lebedev.ri, Done):
I'll revert D98945 along with fixing rG08196e0b2e1f8aaa8a854585335c17ba479114df in a moment.

+  }
+  if (PBI->getSuccessor(0) == BI->getSuccessor(0)) {
+    // Speculate the 2nd condition unless the 1st is probably true.
+    if (PBITrueProb.isUnknown() || PBITrueProb < Likely)
+      return {{Instruction::Or, false}};
+  } else if (PBI->getSuccessor(1) == BI->getSuccessor(1)) {
+    // Speculate the 2nd condition unless the 1st is probably false.
+    if (PBITrueProb.isUnknown() || PBITrueProb.getCompl() < Likely)
+      return {{Instruction::And, false}};
+  } else if (PBI->getSuccessor(0) == BI->getSuccessor(1)) {
+    // Speculate the 2nd condition unless the 1st is probably true.
+    if (PBITrueProb.isUnknown() || PBITrueProb < Likely)
+      return {{Instruction::And, true}};
+  } else if (PBI->getSuccessor(1) == BI->getSuccessor(0)) {
+    // Speculate the 2nd condition unless the 1st is probably false.
+    if (PBITrueProb.isUnknown() || PBITrueProb.getCompl() < Likely)
+      return {{Instruction::Or, true}};
+  }
   return None;
 }

-static bool PerformBranchToCommonDestFolding(BranchInst *BI, BranchInst *PBI,
+static bool performBranchToCommonDestFolding(BranchInst *BI, BranchInst *PBI,
                                              DomTreeUpdater *DTU,
-                                             MemorySSAUpdater *MSSAU) {
+                                             MemorySSAUpdater *MSSAU,
+                                             const TargetTransformInfo *TTI) {
   BasicBlock *BB = BI->getParent();
   BasicBlock *PredBlock = PBI->getParent();
   // Determine if the two branches share a common destination.
   Instruction::BinaryOps Opc;
   bool InvertPredCond;
   std::tie(Opc, InvertPredCond) =
-      *CheckIfCondBranchesShareCommonDestination(BI, PBI);
+      *shouldFoldCondBranchesToCommonDestination(BI, PBI, TTI);
   LLVM_DEBUG(dbgs() << "FOLDING BRANCH TO COMMON DEST:\n" << *PBI << *BB);
   IRBuilder<> Builder(PBI);
   // The builder is used to create instructions to eliminate the branch in BB.
   // If BB's terminator has !annotation metadata, add it to the new
   // instructions.
   Builder.CollectMetadataToCopy(BB->getTerminator(),

[95 lines hidden; context: static bool performBranchToCommonDestFolding(BranchInst *BI, BranchInst *PBI,]
   ++NumFoldBranchToCommonDest;
   return true;
 }
 /// If this basic block is simple enough, and if a predecessor branches to us
 /// and one of our successors, fold the block into the predecessor and use
 /// logical operations to pick the right destination.
 bool llvm::FoldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,
                                   MemorySSAUpdater *MSSAU,
                                   const TargetTransformInfo *TTI,
                                   unsigned BonusInstThreshold) {

Inline comment (lebedev.ri, Not Done):
FoldBranchToCommonDest() might be called from outside of SimplifyCFG (it is called from loopsimplify), without passing-in TTI, i guess.

Inline comment (spatel, author, Done):
Yep, that was it. I was able to deduce a test for it without going through the stage2 failure.
I'll add a test for -loop-simplify and try again.
This does raise a question: is it intentional that a pass is ignoring the metadata? Does that mean we may lose the information despite making a change for SimplifyCFG?

   // If this block ends with an unconditional branch,
   // let SpeculativelyExecuteBB() deal with it.
   if (!BI->isConditional())
     return false;
   BasicBlock *BB = BI->getParent();
   const unsigned PredCount = pred_size(BB);

[56 lines hidden; context: for (BasicBlock *PredBlock : predecessors(BB)) {]
     // the common successor, verify that the same value flows in from both
     // blocks.
     if (!PBI || PBI->isUnconditional() || !SafeToMergeTerminators(BI, PBI))
       continue;
     // Determine if the two branches share a common destination.
     Instruction::BinaryOps Opc;
     bool InvertPredCond;
-    if (auto Recepie = CheckIfCondBranchesShareCommonDestination(BI, PBI))
-      std::tie(Opc, InvertPredCond) = *Recepie;
+    if (auto Recipe = shouldFoldCondBranchesToCommonDestination(BI, PBI, TTI))
+      std::tie(Opc, InvertPredCond) = *Recipe;
     else
       continue;
     // Check the cost of inserting the necessary logic before performing the
     // transformation.
     if (TTI) {
       Type *Ty = BI->getCondition()->getType();
       InstructionCost Cost = TTI->getArithmeticInstrCost(Opc, Ty, CostKind);
       if (InvertPredCond && (!PBI->getCondition()->hasOneUse() ||
                              !isa<CmpInst>(PBI->getCondition())))
         Cost += TTI->getArithmeticInstrCost(Instruction::Xor, Ty, CostKind);
       if (Cost > BranchFoldThreshold)
         continue;
     }
-    return PerformBranchToCommonDestFolding(BI, PBI, DTU, MSSAU);
+    return performBranchToCommonDestFolding(BI, PBI, DTU, MSSAU, TTI);
   }
   return Changed;
 }
 // If there is only one store in BB1 and BB2, return it, otherwise return
 // nullptr.
 static StoreInst *findUniqueStoreInBlocks(BasicBlock *BB1, BasicBlock *BB2) {
   StoreInst *S = nullptr;
[3,513 lines hidden]

llvm/test/Transforms/PGOProfile/chr.ll

	Show First 20 Lines • Show All 1,271 Lines • ▼ Show 20 Lines
	; if (i0 & 4) == 0)			; if (i0 & 4) == 0)
	; foo()			; foo()
	; }			; }
	; return i0 + sum3			; return i0 + sum3
	define i32 @test_chr_14(i32* %i, i32* %j, i32 %sum0, i1 %pred, i32 %z) !prof !14 {			define i32 @test_chr_14(i32* %i, i32* %j, i32 %sum0, i1 %pred, i32 %z) !prof !14 {
	; CHECK-LABEL: @test_chr_14(			; CHECK-LABEL: @test_chr_14(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[I0:%.*]] = load i32, i32* [[I:%.*]], align 4			; CHECK-NEXT: [[I0:%.*]] = load i32, i32* [[I:%.*]], align 4
	; CHECK-NEXT: [[V1:%.*]] = icmp ne i32 [[Z:%.*]], 1			; CHECK-NEXT: [[V1:%.*]] = icmp eq i32 [[Z:%.*]], 1
				; CHECK-NEXT: br i1 [[V1]], label [[BB1:%.*]], label [[ENTRY_SPLIT_NONCHR:%.*]], !prof !15
				; CHECK: entry.split.nonchr:
	; CHECK-NEXT: [[V0:%.*]] = icmp eq i32 [[Z]], 0			; CHECK-NEXT: [[V0:%.*]] = icmp eq i32 [[Z]], 0
	; CHECK-NEXT: [[V3_NONCHR:%.*]] = and i1 [[V0]], [[PRED:%.*]]			; CHECK-NEXT: [[V3_NONCHR:%.*]] = and i1 [[V0]], [[PRED:%.*]]
	; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[V1]], [[V3_NONCHR]]			; CHECK-NEXT: br i1 [[V3_NONCHR]], label [[BB0_NONCHR:%.*]], label [[BB1]], !prof !16
	; CHECK-NEXT: br i1 [[OR_COND]], label [[BB0_NONCHR:%.*]], label [[BB1:%.*]], !prof !19
	; CHECK: bb0.nonchr:			; CHECK: bb0.nonchr:
	; CHECK-NEXT: call void @foo()			; CHECK-NEXT: call void @foo()
	; CHECK-NEXT: br label [[BB1]]			; CHECK-NEXT: br label [[BB1]]
	; CHECK: bb1:			; CHECK: bb1:
	; CHECK-NEXT: [[J0:%.*]] = load i32, i32* [[J:%.*]], align 4			; CHECK-NEXT: [[J0:%.*]] = load i32, i32* [[J:%.*]], align 4
	; CHECK-NEXT: [[V6:%.*]] = and i32 [[I0]], 2			; CHECK-NEXT: [[V6:%.*]] = and i32 [[I0]], 2
	; CHECK-NEXT: [[V4:%.*]] = icmp ne i32 [[V6]], [[J0]]			; CHECK-NEXT: [[V4:%.*]] = icmp ne i32 [[V6]], [[J0]]
	; CHECK-NEXT: [[V8:%.*]] = add i32 [[SUM0:%.*]], 43			; CHECK-NEXT: [[V8:%.*]] = add i32 [[SUM0:%.*]], 43
	▲ Show 20 Lines • Show All 614 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[TMP0:%.*]] = and i1 [[CMP0]], [[CMP3]]			; CHECK-NEXT: [[TMP0:%.*]] = and i1 [[CMP0]], [[CMP3]]
	; CHECK-NEXT: [[TMP1:%.*]] = and i1 [[TMP0]], [[CMP_I]]			; CHECK-NEXT: [[TMP1:%.*]] = and i1 [[TMP0]], [[CMP_I]]
	; CHECK-NEXT: br i1 [[TMP1]], label [[BB1:%.*]], label [[ENTRY_SPLIT_NONCHR:%.*]], !prof !15			; CHECK-NEXT: br i1 [[TMP1]], label [[BB1:%.*]], label [[ENTRY_SPLIT_NONCHR:%.*]], !prof !15
	; CHECK: bb1:			; CHECK: bb1:
	; CHECK-NEXT: [[CMP2:%.*]] = icmp ne i64 [[I]], 2			; CHECK-NEXT: [[CMP2:%.*]] = icmp ne i64 [[I]], 2
	; CHECK-NEXT: switch i64 [[I]], label [[BB2:%.*]] [			; CHECK-NEXT: switch i64 [[I]], label [[BB2:%.*]] [
	; CHECK-NEXT: i64 2, label [[BB3_NONCHR2:%.*]]			; CHECK-NEXT: i64 2, label [[BB3_NONCHR2:%.*]]
	; CHECK-NEXT: i64 86, label [[BB2_NONCHR1:%.*]]			; CHECK-NEXT: i64 86, label [[BB2_NONCHR1:%.*]]
	; CHECK-NEXT: ], !prof !20			; CHECK-NEXT: ], !prof !19
	; CHECK: bb2:			; CHECK: bb2:
	; CHECK-NEXT: call void @foo()			; CHECK-NEXT: call void @foo()
	; CHECK-NEXT: call void @foo()			; CHECK-NEXT: call void @foo()
	; CHECK-NEXT: br label [[BB7:%.*]]			; CHECK-NEXT: br label [[BB7:%.*]]
	; CHECK: bb2.nonchr1:			; CHECK: bb2.nonchr1:
	; CHECK-NEXT: call void @foo()			; CHECK-NEXT: call void @foo()
	; CHECK-NEXT: br label [[BB3_NONCHR2]]			; CHECK-NEXT: br label [[BB3_NONCHR2]]
	; CHECK: bb3.nonchr2:			; CHECK: bb3.nonchr2:
	▲ Show 20 Lines • Show All 560 Lines • ▼ Show 20 Lines

	; Test to not crash upon a 0:0 branch_weight metadata.			; Test to not crash upon a 0:0 branch_weight metadata.
	define void @test_chr_24(i32* %i) !prof !14 {			define void @test_chr_24(i32* %i) !prof !14 {
	; CHECK-LABEL: @test_chr_24(			; CHECK-LABEL: @test_chr_24(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[I:%.*]], align 4			; CHECK-NEXT: [[TMP0:%.*]] = load i32, i32* [[I:%.*]], align 4
	; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[TMP0]], 1			; CHECK-NEXT: [[TMP1:%.*]] = and i32 [[TMP0]], 1
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i32 [[TMP1]], 0			; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i32 [[TMP1]], 0
	; CHECK-NEXT: br i1 [[TMP2]], label [[BB1:%.*]], label [[BB0:%.*]], !prof !21			; CHECK-NEXT: br i1 [[TMP2]], label [[BB1:%.*]], label [[BB0:%.*]], !prof !20
	; CHECK: bb0:			; CHECK: bb0:
	; CHECK-NEXT: call void @foo()			; CHECK-NEXT: call void @foo()
	; CHECK-NEXT: br label [[BB1]]			; CHECK-NEXT: br label [[BB1]]
	; CHECK: bb1:			; CHECK: bb1:
	; CHECK-NEXT: [[TMP3:%.*]] = and i32 [[TMP0]], 2			; CHECK-NEXT: [[TMP3:%.*]] = and i32 [[TMP0]], 2
	; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i32 [[TMP3]], 0			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i32 [[TMP3]], 0
	; CHECK-NEXT: br i1 [[TMP4]], label [[BB3:%.*]], label [[BB2:%.*]], !prof !21			; CHECK-NEXT: br i1 [[TMP4]], label [[BB3:%.*]], label [[BB2:%.*]], !prof !20
	; CHECK: bb2:			; CHECK: bb2:
	; CHECK-NEXT: call void @foo()			; CHECK-NEXT: call void @foo()
	; CHECK-NEXT: br label [[BB3]]			; CHECK-NEXT: br label [[BB3]]
	; CHECK: bb3:			; CHECK: bb3:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%0 = load i32, i32* %i			%0 = load i32, i32* %i
	Show All 37 Lines
	!14 = !{!"function_entry_count", i64 100}			!14 = !{!"function_entry_count", i64 100}
	!15 = !{!"branch_weights", i32 0, i32 1}			!15 = !{!"branch_weights", i32 0, i32 1}
	!16 = !{!"branch_weights", i32 1, i32 1}			!16 = !{!"branch_weights", i32 1, i32 1}
	!17 = !{!"branch_weights", i32 0, i32 0}			!17 = !{!"branch_weights", i32 0, i32 0}
	; CHECK: !15 = !{!"branch_weights", i32 1000, i32 0}			; CHECK: !15 = !{!"branch_weights", i32 1000, i32 0}
	; CHECK: !16 = !{!"branch_weights", i32 0, i32 1}			; CHECK: !16 = !{!"branch_weights", i32 0, i32 1}
	; CHECK: !17 = !{!"branch_weights", i32 1, i32 1}			; CHECK: !17 = !{!"branch_weights", i32 1, i32 1}
	; CHECK: !18 = !{!"branch_weights", i32 1, i32 0}			; CHECK: !18 = !{!"branch_weights", i32 1, i32 0}
	; CHECK: !19 = !{!"branch_weights", i32 0, i32 1000}

llvm/test/Transforms/SimplifyCFG/preserve-branchweights.ll

Show First 20 Lines • Show All 630 Lines • ▼ Show 20 Lines	block3:
%cowval = phi i32 [ 2, %block2 ], [ 0, %block1 ]		%cowval = phi i32 [ 2, %block2 ], [ 0, %block1 ]
br label %exit		br label %exit

exit:		exit:
%outval = phi i32 [ %cowval, %block3 ], [ 1, %block2 ]		%outval = phi i32 [ %cowval, %block3 ], [ 1, %block2 ]
ret i32 %outval		ret i32 %outval
}		}

; FIXME: Merging the icmps with logic-op defeats the purpose of the metadata.		; Merging the icmps with logic-op defeats the purpose of the metadata.
; We can't tell which condition is expensive if they are combined.		; We can't tell which condition is expensive if they are combined.

define void @or_icmps_harmful(i32 %x, i32 %y, i8* %p) {		define void @or_icmps_harmful(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @or_icmps_harmful(		; CHECK-LABEL: @or_icmps_harmful(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1
		; CHECK-NEXT: br i1 [[EXPECTED_TRUE]], label [[EXIT:%.*]], label [[RARE:%.*]], !prof !19
		; CHECK: rare:
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]		; CHECK-NEXT: br i1 [[EXPENSIVE]], label [[EXIT]], label [[FALSE:%.*]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !19
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_true = icmp sgt i32 %x, -1		%expected_true = icmp sgt i32 %x, -1
br i1 %expected_true, label %exit, label %rare, !prof !15		br i1 %expected_true, label %exit, label %rare, !prof !15

rare:		rare:
%expensive = icmp eq i32 %y, 0		%expensive = icmp eq i32 %y, 0
br i1 %expensive, label %exit, label %false		br i1 %expensive, label %exit, label %false

false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

; FIXME: Merging the icmps with logic-op defeats the purpose of the metadata.		; Merging the icmps with logic-op defeats the purpose of the metadata.
; We can't tell which condition is expensive if they are combined.		; We can't tell which condition is expensive if they are combined.

define void @or_icmps_harmful_inverted(i32 %x, i32 %y, i8* %p) {		define void @or_icmps_harmful_inverted(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @or_icmps_harmful_inverted(		; CHECK-LABEL: @or_icmps_harmful_inverted(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sle i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sgt i32 [[X:%.*]], -1
		; CHECK-NEXT: br i1 [[EXPECTED_FALSE]], label [[RARE:%.*]], label [[EXIT:%.*]], !prof !20
		; CHECK: rare:
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]		; CHECK-NEXT: br i1 [[EXPENSIVE]], label [[EXIT]], label [[FALSE:%.*]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !19
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_false = icmp sgt i32 %x, -1		%expected_false = icmp sgt i32 %x, -1
br i1 %expected_false, label %rare, label %exit, !prof !16		br i1 %expected_false, label %rare, label %exit, !prof !16

rare:		rare:
%expensive = icmp eq i32 %y, 0		%expensive = icmp eq i32 %y, 0
br i1 %expensive, label %exit, label %false		br i1 %expensive, label %exit, label %false

false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

; The probability threshold is set by a builtin_expect setting.		; The probability threshold is determined by a TTI setting.
		; In this example, we are just short of strongly expected, so speculate.

define void @or_icmps_not_that_harmful(i32 %x, i32 %y, i8* %p) {		define void @or_icmps_not_that_harmful(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @or_icmps_not_that_harmful(		; CHECK-LABEL: @or_icmps_not_that_harmful(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !20		; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !21
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_true = icmp sgt i32 %x, -1		%expected_true = icmp sgt i32 %x, -1
br i1 %expected_true, label %exit, label %rare, !prof !17		br i1 %expected_true, label %exit, label %rare, !prof !17

rare:		rare:
%expensive = icmp eq i32 %y, 0		%expensive = icmp eq i32 %y, 0
br i1 %expensive, label %exit, label %false		br i1 %expensive, label %exit, label %false

false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

		; The probability threshold is determined by a TTI setting.
		; In this example, we are just short of strongly expected, so speculate.

define void @or_icmps_not_that_harmful_inverted(i32 %x, i32 %y, i8* %p) {		define void @or_icmps_not_that_harmful_inverted(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @or_icmps_not_that_harmful_inverted(		; CHECK-LABEL: @or_icmps_not_that_harmful_inverted(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !21		; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !22
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_true = icmp sgt i32 %x, -1		%expected_true = icmp sgt i32 %x, -1
br i1 %expected_true, label %exit, label %rare, !prof !18		br i1 %expected_true, label %exit, label %rare, !prof !18

rare:		rare:
%expensive = icmp eq i32 %y, 0		%expensive = icmp eq i32 %y, 0
br i1 %expensive, label %exit, label %false		br i1 %expensive, label %exit, label %false

false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

		; The 1st cmp is probably true, so speculating the 2nd is probably a win.

define void @or_icmps_useful(i32 %x, i32 %y, i8* %p) {		define void @or_icmps_useful(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @or_icmps_useful(		; CHECK-LABEL: @or_icmps_useful(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sle i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sle i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !22		; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !23
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_true = icmp sgt i32 %x, -1		%expected_true = icmp sgt i32 %x, -1
br i1 %expected_true, label %likely, label %exit, !prof !15		br i1 %expected_true, label %likely, label %exit, !prof !15

likely:		likely:
%expensive = icmp eq i32 %y, 0		%expensive = icmp eq i32 %y, 0
br i1 %expensive, label %exit, label %false		br i1 %expensive, label %exit, label %false

false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

		; The 1st cmp is probably false, so speculating the 2nd is probably a win.

define void @or_icmps_useful_inverted(i32 %x, i32 %y, i8* %p) {		define void @or_icmps_useful_inverted(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @or_icmps_useful_inverted(		; CHECK-LABEL: @or_icmps_useful_inverted(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sgt i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sgt i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = or i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !22		; CHECK-NEXT: br i1 [[OR_COND]], label [[EXIT:%.*]], label [[FALSE:%.*]], !prof !23
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_false = icmp sgt i32 %x, -1		%expected_false = icmp sgt i32 %x, -1
[37 lines not shown]
more_rare:		more_rare:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

; FIXME: Merging the icmps with logic-op defeats the purpose of the metadata.		; Merging the icmps with logic-op defeats the purpose of the metadata.
; We can't tell which condition is expensive if they are combined.		; We can't tell which condition is expensive if they are combined.

define void @and_icmps_harmful(i32 %x, i32 %y, i8* %p) {		define void @and_icmps_harmful(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @and_icmps_harmful(		; CHECK-LABEL: @and_icmps_harmful(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sgt i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sgt i32 [[X:%.*]], -1
		; CHECK-NEXT: br i1 [[EXPECTED_FALSE]], label [[RARE:%.*]], label [[EXIT:%.*]], !prof !20
		; CHECK: rare:
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]		; CHECK-NEXT: br i1 [[EXPENSIVE]], label [[FALSE:%.*]], label [[EXIT]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !23
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_false = icmp sgt i32 %x, -1		%expected_false = icmp sgt i32 %x, -1
br i1 %expected_false, label %rare, label %exit, !prof !16		br i1 %expected_false, label %rare, label %exit, !prof !16

rare:		rare:
%expensive = icmp eq i32 %y, 0		%expensive = icmp eq i32 %y, 0
br i1 %expensive, label %false, label %exit		br i1 %expensive, label %false, label %exit

false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

; FIXME: Merging the icmps with logic-op defeats the purpose of the metadata.		; Merging the icmps with logic-op defeats the purpose of the metadata.
; We can't tell which condition is expensive if they are combined.		; We can't tell which condition is expensive if they are combined.

define void @and_icmps_harmful_inverted(i32 %x, i32 %y, i8* %p) {		define void @and_icmps_harmful_inverted(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @and_icmps_harmful_inverted(		; CHECK-LABEL: @and_icmps_harmful_inverted(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sle i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1
		; CHECK-NEXT: br i1 [[EXPECTED_TRUE]], label [[EXIT:%.*]], label [[RARE:%.*]], !prof !19
		; CHECK: rare:
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]		; CHECK-NEXT: br i1 [[EXPENSIVE]], label [[FALSE:%.*]], label [[EXIT]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !23
; CHECK: false:		; CHECK: false:
; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1		; CHECK-NEXT: store i8 42, i8* [[P:%.*]], align 1
; CHECK-NEXT: br label [[EXIT]]		; CHECK-NEXT: br label [[EXIT]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
%expected_true = icmp sgt i32 %x, -1		%expected_true = icmp sgt i32 %x, -1
br i1 %expected_true, label %exit, label %rare, !prof !15		br i1 %expected_true, label %exit, label %rare, !prof !15

rare:		rare:
%expensive = icmp eq i32 %y, 0		%expensive = icmp eq i32 %y, 0
br i1 %expensive, label %false, label %exit		br i1 %expensive, label %false, label %exit

false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

		; The probability threshold is determined by a TTI setting.
		; In this example, we are just short of strongly expected, so speculate.

define void @and_icmps_not_that_harmful(i32 %x, i32 %y, i8* %p) {		define void @and_icmps_not_that_harmful(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @and_icmps_not_that_harmful(		; CHECK-LABEL: @and_icmps_not_that_harmful(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sgt i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sgt i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !24		; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !24
; CHECK: false:		; CHECK: false:
[13 lines not shown]
false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

		; The probability threshold is determined by a TTI setting.
		; In this example, we are just short of strongly expected, so speculate.

define void @and_icmps_not_that_harmful_inverted(i32 %x, i32 %y, i8* %p) {		define void @and_icmps_not_that_harmful_inverted(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @and_icmps_not_that_harmful_inverted(		; CHECK-LABEL: @and_icmps_not_that_harmful_inverted(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sle i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sle i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !24		; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !24
; CHECK: false:		; CHECK: false:
[13 lines not shown]
false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

		; The 1st cmp is probably true, so speculating the 2nd is probably a win.

define void @and_icmps_useful(i32 %x, i32 %y, i8* %p) {		define void @and_icmps_useful(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @and_icmps_useful(		; CHECK-LABEL: @and_icmps_useful(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_TRUE:%.*]] = icmp sgt i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_TRUE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !25		; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !25
; CHECK: false:		; CHECK: false:
[13 lines not shown]
false:		false:
store i8 42, i8* %p, align 1		store i8 42, i8* %p, align 1
br label %exit		br label %exit

exit:		exit:
ret void		ret void
}		}

		; The 1st cmp is probably false, so speculating the 2nd is probably a win.

define void @and_icmps_useful_inverted(i32 %x, i32 %y, i8* %p) {		define void @and_icmps_useful_inverted(i32 %x, i32 %y, i8* %p) {
; CHECK-LABEL: @and_icmps_useful_inverted(		; CHECK-LABEL: @and_icmps_useful_inverted(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sle i32 [[X:%.*]], -1		; CHECK-NEXT: [[EXPECTED_FALSE:%.*]] = icmp sle i32 [[X:%.*]], -1
; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0		; CHECK-NEXT: [[EXPENSIVE:%.*]] = icmp eq i32 [[Y:%.*]], 0
; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]		; CHECK-NEXT: [[OR_COND:%.*]] = and i1 [[EXPECTED_FALSE]], [[EXPENSIVE]]
; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !25		; CHECK-NEXT: br i1 [[OR_COND]], label [[FALSE:%.*]], label [[EXIT:%.*]], !prof !25
; CHECK: false:		; CHECK: false:
[64 lines not shown]
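
For readers skimming the test changes above, the decision they exercise can be sketched roughly as follows. This is a standalone illustration, not the code in SimplifyCFG.cpp: the names `isLikely` and `decideMerge`, the `Decision` enum, and the 99/100 default threshold are all assumptions made for the example (the patch reads its threshold from a TTI setting). It also covers only the simple cases where the common destination sits on the same side of both branches, and it assumes the weights fit in 32 bits so the multiplications cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical outcome of the heuristic; not an LLVM type.
enum class Decision { KeepBranches, MergeWithOr, MergeWithAnd };

// Returns true if Taken / (Taken + NotTaken) >= Num / Den.
// The 99/100 default is an assumption for illustration only.
// Missing profile data (both weights zero) is never treated as likely.
bool isLikely(uint64_t Taken, uint64_t NotTaken,
              uint64_t Num = 99, uint64_t Den = 100) {
  assert(Den != 0 && Num <= Den && "threshold must be a fraction in [0, 1]");
  const uint64_t Total = Taken + NotTaken;
  if (Total == 0)
    return false;
  // Cross-multiply to compare the two fractions without floating point.
  return Taken * Den >= Total * Num;
}

// BypassWeight: weight of the predecessor edge that jumps straight to the
// common destination, skipping the 2nd (possibly expensive) compare.
// FallThroughWeight: weight of the edge that reaches the 2nd compare.
// CommonDestOnTrue: the common destination is the true successor of both
// branches (merge with 'or'); otherwise it is the false successor of both
// (merge with 'and').
Decision decideMerge(bool CommonDestOnTrue, uint64_t BypassWeight,
                     uint64_t FallThroughWeight) {
  // If the 1st condition usually bypasses the 2nd, keep the branches
  // separate so the 2nd compare is rarely executed.
  if (isLikely(BypassWeight, FallThroughWeight))
    return Decision::KeepBranches;
  return CommonDestOnTrue ? Decision::MergeWithOr : Decision::MergeWithAnd;
}
```

Under these assumptions, a predecessor weighted 2000:1 towards the common destination keeps its branch (the "harmful" tests), while one weighted 98:2 falls just short of the threshold and is still merged (the "not_that_harmful" tests). The weights quoted here are made up for the example; the actual `!prof` values are in the part of the test file not shown on this page.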

Revision Contents

Diff 332413

llvm/lib/Transforms/Utils/SimplifyCFG.cpp

llvm/test/Transforms/PGOProfile/chr.ll

llvm/test/Transforms/SimplifyCFG/preserve-branchweights.ll
