
[BPI] Improve static heuristics for "cold" paths.
Needs Review · Public

Authored by ebrevnov on May 6 2020, 5:36 AM.



The current approach doesn't work well when multiple paths are predicted to be "cold". By "cold" paths I mean those containing an "unreachable" instruction, a call marked with the 'cold' attribute, or the 'unwind' handler of an 'invoke' instruction. The issue is that the heuristics are applied one by one until the first match, and they essentially ignore the relative hotness/coldness of other paths.

The new approach unifies processing of "cold" paths by assigning a predefined absolute weight to each block estimated to be "cold". We then propagate these weights up/down the IR similarly to the existing approach, and finally set edge probabilities based on the estimated block weights.

One important difference is how we propagate weight upwards. The existing approach propagates the same weight to all blocks that are post-dominated by a block with some "known" weight. This is useless, at least because it always gives a 50/50 distribution, which is assumed by default anyway. Worse, it causes the algorithm to skip further heuristics and can miss setting a more accurate probability. The new algorithm propagates the weight up only to blocks that dominate and are post-dominated by a block with some "known" weight. In other words, those blocks that are always executed, or not executed, together with that block.

In addition, the new approach processes loops in a uniform way as well. Essentially, loop exit edges are estimated as "cold" paths relative to back edges and should be considered uniformly with other coldness/hotness markers.

Diff Detail

Unit Tests: Failed

60 ms · windows > LLVM.DebugInfo/X86::addr-tu-to-non-tu.ll
Script: -- : 'RUN: at line 1'; c:\ws\w16n2-1\llvm-project\premerge-checks\build\bin\llc.exe -filetype=obj -O0 -generate-type-units -split-dwarf-file=x.dwo < C:\ws\w16n2-1\llvm-project\premerge-checks\llvm\test\DebugInfo\X86\addr-tu-to-non-tu.ll | c:\ws\w16n2-1\llvm-project\premerge-checks\build\bin\llvm-dwarfdump.exe -debug-info -debug-types - | c:\ws\w16n2-1\llvm-project\premerge-checks\build\bin\filecheck.exe --implicit-check-not=Unit --implicit-check-not=contents --implicit-check-not=declaration C:\ws\w16n2-1\llvm-project\premerge-checks\llvm\test\DebugInfo\X86\addr-tu-to-non-tu.ll

Event Timeline

ebrevnov added a comment (edited). Jul 24 2020, 5:01 AM

I have moved part of the change into a separate review D84514. Let's start with that...

Marked as WIP since I'm going to extract one more part out from this review.

ebrevnov retitled this revision from [BPI] Improve static heuristics for "cold" paths. to [WIP][BPI] Improve static heuristics for "cold" paths..Jul 28 2020, 6:10 AM
ebrevnov updated this revision to Diff 282186.Jul 31 2020, 4:11 AM

Rebased on top of D84838

ebrevnov added inline comments.Jul 31 2020, 4:27 AM

The short answer is: to preserve the same branch probability as before.
The old implementation assigned the CC_NONTAKEN_WEIGHT weight to the "normal" path and CC_TAKEN_WEIGHT to the "cold" path. The absolute values are irrelevant; the ratio CC_NONTAKEN_WEIGHT/CC_TAKEN_WEIGHT = 16 defines the relative probability of the "normal" and "cold" branches. The new implementation uses a pre-selected weight, DEFAULT_WEIGHT, for all "normal" paths. Thus the relative ratio is calculated as DEFAULT_WEIGHT/COLD_WEIGHT = 0xfffff/0xffff = 16.

ebrevnov retitled this revision from [WIP][BPI] Improve static heuristics for "cold" paths. to [BPI] Improve static heuristics for "cold" paths..Jul 31 2020, 4:27 AM

It's a bit shorter now. Let's see if it's manageable. Further reduction would be problematic.

A lot of test changes can probably be extracted out as NFC (to make the string test more robust) -- this will reduce the size of the patch.

Do you have performance numbers related to the change?


Why not define a value for DEFAULT_WEIGHT, and define COLD_WEIGHT to be 1/16 of the DEFAULT weight? It is more readable that way.


document the method and params.


explain here. Why does the early estimated weight take precedence?


why do an RPO walk? the algorithm does not seem to depend on this order.


why is the new estimation better?

A lot of test changes can probably be extracted out as NFC (to make the string test more robust) -- this will reduce the size of the patch.

Unfortunately, in most cases I don't see how this can be done without losing current coverage. There are a few cases like 'successors: %bb.3(0x80000000), %bb.4(0x00000000)' where we could remove the exact numbers and just check block names. I think we'd better keep exact probabilities in all other cases; that actually helps to catch unintended changes.

Do you have performance numbers related to the change?

I measured performance on a bunch of Java-related benchmarks we have in house, including SPECjvm2008, SPECjbb2015, DaCapo 9.12 and others. I don't see any noticeable impact on performance.


This is just the way the weights were defined before (using direct numbers instead of a ratio). Please note that all the other cases (like ZH_TAKEN_WEIGHT, ZH_NONTAKEN_WEIGHT) don't use a ratio either. Another, theoretical, reason could be the ability to represent ratios with a non-zero remainder.




will add a comment.


You are right: it should not affect correctness, but it could cause the algorithm to do more iterations. The 'while' loop following this one propagates weights from successors to predecessors. If any of the successors is not yet evaluated, it will trigger an additional iteration once it does get evaluated. That means it's preferable to evaluate successors before predecessors.


This is a tough question. Let's take the extreme case where a loop has an infinite number of exits. In that case the probability of taking the back branch should go to zero and the loop frequency should be 1.0. That means the more exits we have, the less probable it is to go to the next iteration.

The existing algorithm completely ignores the number of exits and always assumes the loop makes 32 iterations. The new algorithm assumes that the back-branch weight is 32 times higher than the default exiting weight. Thus, in the case of one loop exit, both algorithms give the following probability for the back branch:

edge loop -> loop probability is 0x7c000000 / 0x80000000 = 96.88% [HOT edge]

But because in this case there are two exiting edges, the probability of the back branch is calculated as 32*DW/(32*DW+2*DW) = 32/34 = 0.94:

edge loop -> loop probability is 0x783e0f84 / 0x80000000 = 93.94%

I think this is a pretty reasonable probability that much better matches the general high-level picture.

There is one caveat though. Please note that in this example we have two exiting branches from the latch. If we had a side exit from another block, even the new algorithm wouldn't take it into account. I think we should put a TODO on that and follow up later. What do you think?

ebrevnov added inline comments.Aug 11 2020, 11:46 PM


In fact, there is documentation in the header file. In the source file I can put more details on the behavior in case of a rewrite attempt, as you requested below.

ebrevnov updated this revision to Diff 284984.Aug 12 2020, 12:04 AM

Added a comment to updateEstimatedBlockWeight

wenlei added a subscriber: hoyFB.Aug 12 2020, 9:58 PM
wenlei added a subscriber: wenlei.

Is this work based on any paper/implementation?
Can we add some documentation at the top of the file to get an overall idea of the cost model?


nit: inconsistent comments '//' vs '///'


do we have a tab here?

This comment was removed by xbolva00.

Is this work based on any paper/implementation?

I don't propose anything completely new here. The new algorithm does essentially the same thing as the existing one, but it

  1. fixes several correctness issues
  2. extends the approach to handle loops and invokes in a universal way

Can we add some documentation at the top of the file to get an overall idea of the cost model?

I don't think "cost model" is applicable here... anyway, I can try to describe how things work using some example. Is this what you are looking for?


Will remove extra /


No, I don't have tabs in the code. It looks like Phabricator represents an indentation increase of 4 this way.

...anyway, I can try to describe how things work using some example. Is this what you are looking for?

yes. Thank you.

please run clang-format.

ebrevnov updated this revision to Diff 286234.Aug 18 2020, 3:03 AM

Added description of the algorithm to the header + formatting.



Is the test failure related to this patch?

Is the test failure related to this patch?

I doubt it is. But I will double-check before committing.

ebrevnov edited the summary of this revision. (Show Details)Sep 16 2020, 10:19 PM

I'm still reading. The patch is long, but I'd like to avoid splitting it into pieces, since that might result in massive test changes (back and forth). For now the test changes seem to be compact.
It would be great if someone could join the reviewing effort.


agree, the simple in-place definition would simplify the reading.




... then sets ... and returns true.


change: ... all blocks/loops potentially affected by ...
to: ... all blocks/loops that might need their weight re-estimated after ...
or something similar in meaning.


... propagates ...


... sets ...


maybe rename to LOWEST_NON_ZERO_WEIGHT or ULP_WEIGHT to explicitly denote its meaning?



  1. group the new related *_WEIGHT constants into one enum (to distinguish them from the other *_WEIGHT constants)
  2. explicitly describe their values (e.g. why COLD_WEIGHT is X times as high as LOWEST_NONZERO_WEIGHT)
  3. avoid using hex notation unless it makes sense (e.g. important for bitwise operations)

Something like the following:

enum {

I suggest that we introduce UNKNOWN_WEIGHT for most of the DEFAULT_WEIGHT uses, to better reflect its meaning and the way it is treated.
DEFAULT_WEIGHT should be used only in place of UNKNOWN_WEIGHT at the very last step of the BPI calculation (after propagation).


could be defined along with its declaration for ease of reading


could be defined along with its declaration for ease of reading


could be defined along with its declaration for ease of reading


I would remove this doc comment from the definition (so it doesn't deviate from the doc at the declaration) and put the note inside the body.


early return would be easier to read and would not need std::tie and std::ignore:

if (!EstimatedBlockWeight.insert({BB, BBWeight}).second)
  return false;
return true;

the idiom:

for (BasicBlock *P : predecessors(BB))

please explain (what happens if the previous weight is not equal to the new one? maybe assert(PrevWeight <= BBWeight)?)


change TC to trip count


rename WorkList to BlockWorkList as it is in computeEstimatedBlockWeight().


could it result in deep recursion? maybe a worklist would be better?






could be renamed to estimateBlockWeights() (plural)


'is never returns' sounds strange. Not sure, but maybe IsNeverReturn? IsNeverReturning? IsDeadEnd?
The function is used only once and only with blocks of two kinds:

  1. blocks that have their terminator instruction preceded by a deoptimize call. So the terminator must be the ret instruction.
  2. blocks with unreachable terminator instruction.

Both cases imply no block successors. So it is better to have
assert(BB->getTerminator()->getNumSuccessors() == 0)
Otherwise (for a generic function) there could be a question: why does this condition have a lower priority than a call with Attribute::NoReturn?

call @foo() noreturn
br label %next

if we extracted a lambda estimateBlockWeight (which is worth making a separate member function with its own description) then the structure would be concise:

auto estimateBlockWeight = [&](const BasicBlock *BB) -> Optional<uint32_t> {
  if (isa<UnreachableInst>(BB->getTerminator()) || BB->getTerminatingDeoptimizeCall())

  for (const auto *Pred : predecessors(BB))
    if (const auto *II = dyn_cast<InvokeInst>(Pred->getTerminator()))
      if (II->getUnwindDest() == BB)
        return UNWIND_WEIGHT;

  for (const auto &I : *BB)
    if (const CallInst *CI = dyn_cast<CallInst>(&I))
      if (CI->hasFnAttr(Attribute::Cold))
        return COLD_WEIGHT;

  return None;

ReversePostOrderTraversal<const Function *> RPOT(&F);
for (const auto *BB : RPOT)
  if (auto BBWeight = estimateBlockWeight(BB))
    propagateEstimatedBlockWeight(getLoopBlock(BB), DT, PDT,
                                  BBWeight.getValue(), BlockWorkList,

not needed?


remove braces { .. }

Please fix clang-format issues.