This is an archive of the discontinued LLVM Phabricator instance.

Differential D12781

PGO IR-level instrumentation infrastructure
Closed · Public

Authored by xur on Sep 10 2015, 4:03 PM.

Details

Reviewers

kcc
davidxl
silvas
bogner
dexonsmith

Commits

rGf430ae40cfb9: [PGO] Resubmit "MST based PGO instrumentation infrastructure" (r254021)
rG1b665ca707f4: [PGO] MST based PGO instrumentation infrastructure
rL255132: [PGO] Resubmit "MST based PGO instrumentation infrastructure" (r254021)
rL254021: [PGO] MST based PGO instrumentation infrastructure

Summary

This patch implements the infrastructure for PGO late (i.e. IR-level) instrumentation. (Refer to RFC: PGO Late instrumentation for LLVM http://lists.llvm.org/pipermail/llvm-dev/2015-August/089058.html)

The main part of the code is in a newly added file: lib/Transforms/Instrumentation/PGOLateInstr.cpp
This file implements a module pass, PGOLateInstrumentation. It applies the instrumentation to each function through the class PGOLateInstrumentationFunc. For each function, it performs the following steps:
(1) Collect all the CFG edges and assign an estimated weight to each edge. Critical edges and back-edges are assigned high weights. One fake node and a few fake edges (from the fake node to the entry node, and from the exit nodes to the fake node) are also added to the worklist.
(2) Construct the MST. Edges with higher weights are added to the MST first, unless adding an edge would form a cycle.
(3) Traverse the CFG and compute the CFG hash.

The above three steps are the same for profile-generate and profile-use compilation.
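
Steps (1) and (2) can be sketched outside LLVM as a Kruskal-style pass over the edge list with a union-find structure. This is a minimal illustration, not the code in the patch: `Edge`, `UnionFind`, and `buildMST` are hypothetical names, basic blocks are plain integer indices, and the weights are assumed to be supplied by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CFG edge; basic blocks are plain indices.
struct Edge {
  int Src;
  int Dst;
  uint64_t Weight;
  bool InMST;
};

// Union-find with path halving and union by rank.
class UnionFind {
  std::vector<int> Parent;
  std::vector<int> Rank;

public:
  explicit UnionFind(int N) : Parent(N), Rank(N, 0) {
    std::iota(Parent.begin(), Parent.end(), 0);
  }

  int find(int X) {
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]]; // Point X at its grandparent.
      X = Parent[X];
    }
    return X;
  }

  // Merges the groups of A and B; returns false if they were already joined.
  bool unite(int A, int B) {
    A = find(A);
    B = find(B);
    if (A == B)
      return false;
    if (Rank[A] < Rank[B])
      std::swap(A, B);
    Parent[B] = A;
    if (Rank[A] == Rank[B])
      ++Rank[A];
    return true;
  }
};

// Marks the edges of a maximum-weight spanning tree and returns the number
// of edges left off the tree, i.e. the edges that would receive a counter.
int buildMST(std::vector<Edge> &Edges, int NumNodes) {
  assert(NumNodes > 0 && "graph must have at least one node");
  // Heaviest edges first; stable so that ties keep their input order.
  std::stable_sort(
      Edges.begin(), Edges.end(),
      [](const Edge &A, const Edge &B) { return A.Weight > B.Weight; });
  UnionFind UF(NumNodes);
  int NumOffTree = 0;
  for (Edge &E : Edges) {
    E.InMST = UF.unite(E.Src, E.Dst); // False if the edge would close a cycle.
    if (!E.InMST)
      ++NumOffTree;
  }
  return NumOffTree;
}
```

For a diamond-shaped CFG with one fake node (5 nodes, 6 edges), the tree takes 4 edges and 2 edges are left for instrumentation.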

In the next step, for profile-generation compilation:
(4-gen) Instrument all the edges that are not in the MST. If an edge is critical, split it first. The actual instrumentation generates a call to Intrinsic::instrprof_increment() in the instrumented BB. This intrinsic will be lowered by the pass that createInstrProfilingPass() creates.

For profile-use compilation:
(4-use) Read in the counters and the CFG hash from the profile file.
(5-use) If there is no error, populate the counters to all the edges in reverse topological order of the MST.
(6-use) Once all the edge counts are available, set the branch-weight metadata on the IR instructions that have multiple branch targets. Also apply the cold/hot function attributes based on function-level counts.
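
The idea behind step (5-use) is that the off-tree edges carry counters and every tree edge can be derived from flow conservation. The sketch below illustrates this with a simple fixed-point iteration rather than the reverse topological walk described above; `CountedEdge` and `populateCounts` are hypothetical names, and the graph is assumed to be closed by the fake node so that inflow equals outflow at every node.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a CFG edge; basic blocks are plain indices.
struct CountedEdge {
  int Src;
  int Dst;
  bool InMST;     // Tree edges carry no counter in the profile.
  uint64_t Count; // Input for off-tree edges, output for tree edges.
};

// Derives the count of every tree edge from the counted off-tree edges.
// Assumes the fake node closes the graph, so inflow equals outflow at every
// node. Returns false if the counts cannot be reconciled.
bool populateCounts(std::vector<CountedEdge> &Edges, int NumNodes) {
  std::vector<bool> Known(Edges.size());
  std::size_t NumUnknown = 0;
  for (std::size_t I = 0; I < Edges.size(); ++I) {
    Known[I] = !Edges[I].InMST;
    if (!Known[I])
      ++NumUnknown;
  }

  bool Changed = true;
  while (NumUnknown > 0 && Changed) {
    Changed = false;
    for (int Node = 0; Node < NumNodes; ++Node) {
      uint64_t In = 0, Out = 0;
      std::size_t UnknownIdx = 0;
      int NumUnknownHere = 0;
      for (std::size_t I = 0; I < Edges.size(); ++I) {
        const CountedEdge &E = Edges[I];
        if (E.Src != Node && E.Dst != Node)
          continue;
        if (!Known[I]) {
          ++NumUnknownHere;
          UnknownIdx = I;
          continue;
        }
        if (E.Dst == Node)
          In += E.Count;
        if (E.Src == Node)
          Out += E.Count;
      }
      // A node with exactly one unknown edge determines that edge's count.
      if (NumUnknownHere != 1)
        continue;
      CountedEdge &U = Edges[UnknownIdx];
      bool Incoming = U.Dst == Node;
      uint64_t Have = Incoming ? In : Out; // Known flow on the unknown side.
      uint64_t Need = Incoming ? Out : In; // Total flow on the other side.
      if (Need < Have)
        return false; // Inconsistent profile.
      U.Count = Need - Have;
      Known[UnknownIdx] = true;
      --NumUnknown;
      Changed = true;
    }
  }
  return NumUnknown == 0;
}
```

Because the unknown edges form a tree, some node always has exactly one unknown edge until all counts are resolved, so the loop terminates. Each pass rescans the whole edge list, which keeps the sketch short but is slower than an ordered walk.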

This pass is added to PassManagerBuilder when populating the module passes; see lib/Transforms/IPO/PassManagerBuilder.cpp.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden.

This new patch integrates David's review comments. The main changes are:
(1) more efficient code for the union-find algorithm.
(2) merge the EdgeCount structure into Edge.
(3) merge BBGroupInfo and BBCount into one structure and rename it to BBInfo.

The reason I do not call SplitAllCriticalEdges is that I don't want to
split all critical edges. I only split those critical edges that need to be
instrumented. I prioritize critical edges into the MST to avoid splitting
them.

-Rong

A couple more comments I spotted when going back through the patch again.

include/llvm/ProfileData/InstrProfWriter.h
44–45 ↗	(On Diff #35559)	Why do we need to modify InstrProfWriter.{h,cpp}? That seems like a layering violation.
lib/Transforms/Instrumentation/PGOLateInstr.cpp
270–278 ↗	(On Diff #35559)	No naked new/delete. Use RAII (e.g. you can probably use BumpPtrAllocator to allocate these).

davidxl added inline comments. Oct 2 2015, 9:58 PM

lib/ProfileData/InstrProfWriter.cpp
86–94 ↗	(On Diff #35559)	Please add a comment.
lib/Transforms/Instrumentation/PGOLateInstr.cpp
117 ↗	(On Diff #35559)	make it std::vector<std::unique_ptr<Edge>>
150 ↗	(On Diff #35559)	make it DenseMap<const BasicBlock *, std::unique_ptr<BBInfo>> BBInfos;
272 ↗	(On Diff #35559)	No need for explicit delete with unique_ptr
336 ↗	(On Diff #35559)	You can use the ReversePostOrderTraversal class defined in ADT/PostOrderIterator.h. Actually, does the order of traversal matter here?
349 ↗	(On Diff #35559)	What is the rationale for this heuristic?
351 ↗	(On Diff #35559)	Why not compute BlockFrequencyInfo and use the real static edge frequency (from BB frequency and edge probability)? For a critical edge, the weight can be increased to infinity.
493 ↗	(On Diff #35559)	As with the name linkage function, the name globalization function also needs to be a common interface that FE instrumentation can share in the future.
516 ↗	(On Diff #35559)	This code is shared with FE-based instrumentation and should be refactored as a utility function. Suggested location: include/llvm/Transforms/Instrumentation.h. Converting the FE to use the common interface can be done as a follow-up patch.
666 ↗	(On Diff #35559)	Make reader a shared object at module level.

Here is the updated patch that integrated David and Sean's comments and suggestions. The major changes are:

move profile reader to module level.
move minimum spanning tree into a utility class.
separate profile generation and profile use into two separate passes.
use static profile to set the edge weights.
move the code that can be shared with clang instrumentation to include/llvm/Transforms/Instrumentation.h as utility functions.
name change (using IR instrumentation now)
move the InstrProfWriter change to a later patch.

I did not use ReversePostOrderTraversal from ADT/PostOrderIterator.h (as David suggested) because, compared to the current version, it leads to more passes to populate the counters.

I think it is better to separate this IR-level instrumentation from the FE code, to avoid code duplication in other language front ends. So I keep passing the file name instead of an IndexedInstrProfReader to LLVM. The error checking is already well encapsulated in getFunctionCounts() and ProfReader(). I kept it in the middle end too.

I will add unit tests for both minimum spanning tree and IR instrumentation later.

Thanks for working on this!

Mostly nitpicking. I haven't really looked at the implementation yet.

Manman

include/llvm/InitializePasses.h
129	Nit: it is a little strange to change some of the formatting.
include/llvm/Support/CFGMST.h
35 ↗	(On Diff #39382)	Nit: It may contain
49 ↗	(On Diff #39382)	Nit: Find
165 ↗	(On Diff #39382)	This will cause "unused variable" warning when building clang without assertion.
173 ↗	(On Diff #39382)	Same here.
lib/Transforms/IPO/PassManagerBuilder.cpp
247 ↗	(On Diff #39382)	Can you commit this separately if it is necessary?
lib/Transforms/Instrumentation/PGOIRInstr.cpp
140 ↗	(On Diff #39382)	Should you add const here "infoString() const {"?
145 ↗	(On Diff #39382)	This seems wrong. StringRef does not own the string data.
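
The "unused variable" warning mentioned above arises when a variable is only read inside an `assert`, which disappears in builds without assertions. A minimal sketch of the usual remedy follows; `insertUnique` is a hypothetical helper, not code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Inserts Value, which must not already be present, and returns the new size.
std::size_t insertUnique(std::set<int> &Seen, int Value) {
  bool Inserted = Seen.insert(Value).second;
  assert(Inserted && "value inserted twice");
  (void)Inserted; // Silences the warning when NDEBUG removes the assert.
  return Seen.size();
}
```

The insertion itself stays outside the `assert`, so the side effect still happens in release builds; only the check is compiled out.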

manmanren added inline comments. Nov 5 2015, 4:44 PM

include/llvm/Support/CFGMST.h
36 ↗	(On Diff #39382)	Can you add some high-level comments for CFGMST? What fields does this class expect on Edge and BBInfo? And what is the Removed field used for? Do all members need to be public?
92 ↗	(On Diff #39382)	Use nullptr instead of 0?
181 ↗	(On Diff #39382)	typo
187 ↗	(On Diff #39382)	make_unique?
195 ↗	(On Diff #39382)	emplace_back maybe?
lib/Transforms/IPO/PassManagerBuilder.cpp
194 ↗	(On Diff #39382)	typo: option

majnemer added a subscriber: majnemer. Nov 5 2015, 4:46 PM

majnemer added inline comments.

lib/Transforms/Instrumentation/PGOLateInstr.cpp
301–302 ↗	(On Diff #34503)	We already have a CRC implementation in LLVM, do we need another one? Ours lives in <llvm/Support/JamCRC.h> I'd imagine the one we have in-tree is considerably faster than this one...

davidxl added inline comments. Nov 6 2015, 11:50 AM

include/llvm/Support/CFGMST.h
1 ↗	(On Diff #39382)	CFGMST.h.
29 ↗	(On Diff #39382)	It seems that this class is still not general enough to be put into the llvm/Support/ directory; I suggest moving this file back to the PGO directory for now. A more general implementation should also support MachineBB. Several questions are still open: which interfaces need to be exposed, which template parameters are needed, whether we really need to expose edges at the interface level (instead of just using a pair of BBs to query on-tree status), and whether we need to pass in opaque edge info types (which only clients know about). This is the reason I don't think it is ready to be put into Support. Add a TODO for future work there. More comments will follow later.

Thanks to Manman Ren and David Majnemer for the code review. I have integrated their feedback into the newly uploaded patch.
The only thing I did not do is the "make_unique" suggestion by Manman. It seems to be a C++14 feature, and I got a compile-time error for it.

I'll integrate David Li's feedback in a later patch.

Thanks,

-Rong

This is looking really good!

My biggest concern overall about the implementation is that there seems to be overuse of classes (unnecessary inheritance; unusual use of constructors). It creates quite a maze.

I think that in the process of cleaning up the awkward PGOIRInstrumentationGenFunc Func(F, &M, BPI, BFI); to be free functions (see comment inline) you will find many simplifications to the entire implementation.

Remember, the purpose of a constructor is to initialize the invariants for the class's data members. Don't "do an algorithm" in a constructor or execute "steps" of a procedural computation.

include/llvm/Transforms/Instrumentation.h
180	Use correct capitalization http://llvm.org/docs/CodingStandards.html#commenting
182	http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly (here and elsewhere)
lib/Transforms/Instrumentation/PGOIRInstr.cpp
10 ↗	(On Diff #39600)	"IR level" is redundant since this is in lib/Transforms. Please give an overview here of the structure of the file. (2 module passes: "use" and "gen"). Also explain their relationship, sharing data structures, etc.
124 ↗	(On Diff #39600)	There's a lot of things here that don't seem to be used externally. So use an anonymous namespace.
161 ↗	(On Diff #39600)	Do you actually need dynamic dispatch?
167 ↗	(On Diff #39600)	Why "IR"? Are you planning to generalize this to Machine? For now just leave off "IR" (here and elsewhere).
193 ↗	(On Diff #39600)	Switch to using a trailing underscore. Names starting with underscore followed by a capital letter are reserved for the implementation (it is undefined behavior to name a variable like that).
295 ↗	(On Diff #39600)	Avoid `std::unique_ptr<T> &`. For a simple non-owning handle, use a `T *`. If you don't need to store the handle and the value is guaranteed non-null, consider using `T &`. Same applies in many other parts of this patch where `std::unique_ptr<T> &` is used.
337 ↗	(On Diff #39600)	This class is local to this file, right? No need to use such a large name; makes it sound like this is for an external interface or something. (same comment applies to many other things in this file; I think that the module passes themselves are the only things that really need "PGOInstrumentation" prefix on their name) Look at e.g. lib/Transforms/SROA.cpp with e.g. names like `Slice` or `AllocaSlices`.
340 ↗	(On Diff #39600)	I don't think wrapping behavior is intended, so avoid unsigned types.
368 ↗	(On Diff #39600)	Nit: ArrayRef
440 ↗	(On Diff #39600)	Nit: no braces.
505 ↗	(On Diff #39600)	http://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return
517 ↗	(On Diff #39600)	nullptr.
524 ↗	(On Diff #39600)	In what way? And why does it matter to know that here?
563 ↗	(On Diff #39600)	Just use `reverse` adapter from `llvm/ADT/STLExtras.h`
627 ↗	(On Diff #39600)	Just declare this as `SmallVector<unsigned, 2> EdgeCounts(Size, 0);`
631 ↗	(On Diff #39600)	Is this ever initialized?
632 ↗	(On Diff #39600)	Can you use a range-for?
638 ↗	(On Diff #39600)	Can you fix GetSuccessorNumber to take a `const BasicBlock *`? (hopefully it is a pretty obvious fix and there is no need for pre-commit review). Then you can get rid of all these const_cast's.
665 ↗	(On Diff #39600)	This is extremely awkward (declaring a local variable as a way to do a stateful mutating operation). Just use a free function. Same thing for the similar line with `PGOIRInstrumentationUseFunc` below.
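
The point about constructors versus free functions can be illustrated with a small sketch. Everything here is hypothetical (`Function` is a stand-in for `llvm::Function`, and the counter assignment is a placeholder): the helper class only caches read-only facts in its constructor, while the mutating step is an ordinary function whose call site reads as an action.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Function.
struct Function {
  std::string Name;
  std::vector<std::string> Blocks;
  unsigned NumCounters = 0;
};

// Caches read-only facts about one function. The constructor only
// establishes the class invariant; it does not modify the function.
class FuncInfo {
  const Function &F;
  unsigned NumBlocks;

public:
  explicit FuncInfo(const Function &Fn)
      : F(Fn), NumBlocks(static_cast<unsigned>(Fn.Blocks.size())) {}

  unsigned numBlocks() const { return NumBlocks; }
  const std::string &name() const { return F.Name; }
};

// The mutating step is a free function rather than a local variable whose
// constructor performs the work.
bool instrumentOneFunc(Function &F) {
  FuncInfo Info(F);
  if (Info.numBlocks() == 0)
    return false; // Nothing to instrument.
  F.NumCounters = Info.numBlocks(); // Placeholder for real instrumentation.
  return true;
}
```

With this shape, a caller writes `instrumentOneFunc(F)` instead of declaring an otherwise unused object such as `PGOIRInstrumentationGenFunc Func(F, &M, BPI, BFI);`.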

davidxl added inline comments. Nov 7 2015, 1:55 PM

include/llvm/Support/CFGMST.h
78 ↗	(On Diff #39600)	Return BBInfo& instead of unique_ptr<BBInfo>&. It is more efficient for later uses.
81 ↗	(On Diff #39600)	assert It->second.get() != null. Return *It->second.get().
94 ↗	(On Diff #39600)	Add more comments here explaining why two duplicate fake edges are needed.
104 ↗	(On Diff #39600)	Redundant guard.
105 ↗	(On Diff #39600)	Make successors and i 'unsigned' type.
110 ↗	(On Diff #39600)	Use a macro for 1000? CRITICAL_EDGE_MULTIPLIER? Also handling overflow situation?
113 ↗	(On Diff #39600)	Stale code.
127 ↗	(On Diff #39600)	--> sortEdgesByWeight
146 ↗	(On Diff #39600)	Should this be marked unconditionally here? Otherwise, we can simply boost the weight for such edges even higher without special treatment here?
184 ↗	(On Diff #39600)	Make Edge* the return type for efficiency.
185 ↗	(On Diff #39600)	Use a const var for default weight.
186 ↗	(On Diff #39600)	Combine the find + insert pattern into one insert (with a pair of source BB and null pointer). If the map is updated, an iterator is returned that can be updated with unique_ptr reset.
191 ↗	(On Diff #39600)	Same here.
197 ↗	(On Diff #39600)	return AllEdges.back().get();
lib/Transforms/Instrumentation/PGOIRInstr.cpp
167 ↗	(On Diff #39600)	There was an earlier suggestion to change 'Late' in PGOLateInstrumentation to 'IR'. To add more to the bikeshed, I suggest changing the class name to class FuncPGOInstrumentation { };
181 ↗	(On Diff #39600)	Mst --> MST or MinSpanTree.
185 ↗	(On Diff #39600)	BasicBlock *getInstrBB(const Edge &E);
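
Several of the comments above (return `BBInfo &` rather than `unique_ptr<BBInfo> &`, fold find + insert into a single insert, return the last element of `AllEdges`) fit together as shown in the following sketch. It uses `std::unordered_map` in place of `DenseMap` and integer block IDs in place of `BasicBlock *`; `EdgeList` and `ensureBBInfo` are hypothetical names, and `std::make_unique` assumes a C++14 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; a basic block is identified by an integer ID.
struct BBInfo {
  unsigned Index;
};

struct Edge {
  int SrcBB;
  int DestBB;
  uint64_t Weight;
};

class EdgeList {
  std::vector<std::unique_ptr<Edge>> AllEdges;
  std::unordered_map<int, std::unique_ptr<BBInfo>> BBInfos;

  // Registers BB on first use. A single emplace replaces the find + insert
  // pair: it adds a null placeholder only if the key was absent.
  void ensureBBInfo(int BB) {
    auto Result = BBInfos.emplace(BB, nullptr);
    if (Result.second) {
      unsigned Index = static_cast<unsigned>(BBInfos.size() - 1);
      Result.first->second = std::make_unique<BBInfo>(BBInfo{Index});
    }
  }

public:
  // Callers get a plain reference; ownership stays with the map.
  BBInfo &getBBInfo(int BB) const {
    auto It = BBInfos.find(BB);
    assert(It != BBInfos.end() && It->second && "unknown basic block");
    return *It->second;
  }

  // The returned reference stays valid when AllEdges grows, because each
  // Edge lives in its own heap allocation.
  Edge &addEdge(int Src, int Dest, uint64_t Weight) {
    ensureBBInfo(Src);
    ensureBBInfo(Dest);
    AllEdges.push_back(std::make_unique<Edge>(Edge{Src, Dest, Weight}));
    return *AllEdges.back();
  }

  std::size_t numEdges() const { return AllEdges.size(); }
  std::size_t numBlocks() const { return BBInfos.size(); }
};
```

No `new` or `delete` appears in client code, and no caller ever sees a `std::unique_ptr`.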

silvas added inline comments. Nov 9 2015, 11:20 AM

lib/Transforms/Instrumentation/PGOIRInstr.cpp
167 ↗	(On Diff #39600)	The suggestion to use "IR" instead of "Late" is just for informal discussion when needed to differentiate from what we currently have in clang. For comments in code or naming variables, all code in lib/Transforms is inherently at IR level so it is redundant. For example, we do not have a "SROAIR" pass or "InstCombineIR" pass etc. (grep for "IR" in lib/Transforms to see typical usage; it is almost never used)

davidxl added inline comments. Nov 9 2015, 11:56 AM

include/llvm/Transforms/Instrumentation.h
181	An interface is available in ProfileData/InstrProf.h: llvm::getPGOFuncName which is also used by cfe. Remove this dup.
208	Similarly llvm::createPGOFuncNameVar will do what you need. Remove this duplicate.

Thanks to David and Sean for the detailed comments -- they are really helpful.
I integrated the review feedback into the patch. Could you please take a look? Thanks!

include/llvm/Support/CFGMST.h
94 ↗	(On Diff #39600)	This is a mistake introduced by last-minute formatting. Fixed.
104 ↗	(On Diff #39600)	Why redundant? The number of successors can be 0 (and there is an else branch).
146 ↗	(On Diff #39600)	The reason to split two passes is for the case of using profile count value as the weight, in which the values can vary drastically.
185 ↗	(On Diff #39600)	removed the default weight.
lib/Transforms/Instrumentation/PGOIRInstr.cpp
161 ↗	(On Diff #39600)	It returns different information strings. I can get rid of it, but I don't think it really matters -- it is only used in debug printing.
524 ↗	(On Diff #39600)	you are right. This comment is unnecessary and does not belong here. Deleted.
631 ↗	(On Diff #39600)	good catch. fixed.
638 ↗	(On Diff #39600)	Will do this later.

Here is the latest patch that integrated David and Sean's comments.

The use of inheritance is still excessive. For example, PGOUseFunc and PGOGenFunc don't seem to need inheritance. Their "subclass state" would be better as local variables of a free function. FuncPGOInstrumentation does not need to be a base class. It can simply be a FuncInfo class that manages/caches information about a single function. Both uses of FuncPGOInstrumentation seem to call performOnFunc before they do anything else, so this appears to be establishing an invariant for the class, so this operation should be performed in the constructor (this is much more natural as FuncInfo; the complex construction was confusing when it was a base class).

lib/Transforms/Instrumentation/CFGMST.h
40	Use a lambda.
111	Move the declaration for `bool Critical = false;` down one line and simplify to `bool Critical = isCriticalEdge(TI, i);`
139	No need to mention clang here. Please file a bug report for this if there isn't one open already.
207	Avoid using reserved names.

silvas added inline comments. Nov 13 2015, 7:04 PM

lib/Transforms/Instrumentation/CFGMST.h
162	Factor the DEBUG macro usage into a single point of truth here. E.g. factor out a function `printEdges(raw_ostream &OS, const StringRef Message)`. Then change this function to be simply `DEBUG(dumpEdges(dbgs(), Message))`. (this is similar to the pattern used by Value, for example). This will also give clang-format an easier time.
187	Use std::tie
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
152	Other places seem to use uint64_t for weight. Why `unsigned` here?
178	Does this actually need to be virtual? Same for all the other virtual methods in this file.
218	"dump" functions are for debugging and should be conditionalized. For example, I suggest refactoring to allow a call DEBUG( std::string Message = "Dump Function " + FuncName + " after CFGMST: \t Hash: " + std::to_string(FunctionHash); MST.dumpEdges(dbgs(), Message); )
440	Avoid having debug code be non-conditional.
450	Avoid `const std::unique_ptr<PGOUseEdge> &`.
655	Use a free function `instrumentOneFunc`
686	Use a free function `setPGOCountOnFunc`
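
The suggestion to give dump code a single point of truth can be sketched as follows. The printing function takes its stream as a parameter and is always compiled, while one macro at the call site decides whether it runs. `DEBUG_DUMP` and `ENABLE_DEBUG_DUMP` are hypothetical and only loosely modeled on LLVM's `DEBUG` macro; `std::ostream` stands in for `raw_ostream`.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a CFG edge.
struct Edge {
  int Src;
  int Dst;
  uint64_t Weight;
  bool InMST;
};

// Prints every edge to OS. The function knows nothing about debug flags, so
// it can be tested and reused with any stream.
void dumpEdges(std::ostream &OS, const std::vector<Edge> &Edges,
               const std::string &Message) {
  if (!Message.empty())
    OS << Message << "\n";
  for (const Edge &E : Edges)
    OS << "  " << E.Src << " -> " << E.Dst << " w=" << E.Weight
       << (E.InMST ? " (in MST)" : "") << "\n";
}

// Hypothetical analogue of a debug macro: the statement is compiled out
// unless ENABLE_DEBUG_DUMP is defined.
#ifdef ENABLE_DEBUG_DUMP
#define DEBUG_DUMP(X)                                                          \
  do {                                                                         \
    X;                                                                         \
  } while (false)
#else
#define DEBUG_DUMP(X)                                                          \
  do {                                                                         \
  } while (false)
#endif
```

A call site then reads `DEBUG_DUMP(dumpEdges(std::cerr, Edges, Message));`, and the message string is never built when dumping is disabled.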

Also, can you begin to work on tests? I think right now the core algorithm is pretty good. There are some cleanups to the implementation that I'd like to see but I can do those post-commit if necessary. Right now the main missing piece needed before this is committed are tests.

davidxl added inline comments. Nov 14 2015, 11:59 AM

lib/Transforms/Instrumentation/PGOInstrumentation.cpp
201	if there is no BB ..
409	It is better to just call it 'setEdgeCount' with the same documentation -- or make it just getUnknownCountEdge() and let the client set the edge directly.
450	Sean's comment here is not addressed -- use raw PGOUseEdge&
461	In general it is not safe to update the container (AllEdges) while iterating -- it may invalidate the iterator or element reference saved before. The code should create a worklist of edges before the count setting -- the worklist will then be 'frozen' (It is safe to update worklist, but there is no need to push new split edges into the list) and AllEdges can be safely updated.
497	Use the new getInstrProfRecord method -- eventually we will need to read value profile data too.

xur marked 17 inline comments as done. Nov 16 2015, 3:54 PM

xur added inline comments.

lib/Transforms/Instrumentation/CFGMST.h
111	reorganized the code, also handles overflow.
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
152	changed to uint64_t.
440	Now guarded by DEBUG.
461	I see your point. But this is exactly the reason I did not use a range-based loop like all the others in the file. I get the vector size in the loop initialization and use the index to reference the vector elements in the body, so adding elements to the vector will not be a problem. I could use a worklist, but the effect should be the same.
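
The index-based approach described in this reply can be sketched as follows. The names are hypothetical and blocks are plain integers; the important details are that the loop bound is captured before any element is appended, and that values are copied out of `Edges[I]` before `push_back`, since a reallocation invalidates references and iterators into the vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CFG edge; basic blocks are plain indices.
struct Edge {
  int Src;
  int Dst;
  bool Critical;
};

// Splits every critical edge Src -> Dst into Src -> New and New -> Dst.
// Returns the number of edges that were split.
int splitCriticalEdges(std::vector<Edge> &Edges, int &NextBlockId) {
  int NumSplit = 0;
  // Capture the size first so that newly appended edges are not visited.
  const std::size_t OriginalSize = Edges.size();
  for (std::size_t I = 0; I < OriginalSize; ++I) {
    if (!Edges[I].Critical)
      continue;
    int OldDst = Edges[I].Dst; // Copy before push_back may reallocate.
    int NewBB = NextBlockId++;
    Edges[I].Dst = NewBB;
    Edges[I].Critical = false;
    Edges.push_back({NewBB, OldDst, false});
    ++NumSplit;
  }
  return NumSplit;
}
```

A range-based `for` over `Edges` would be unsafe here, which is the hazard the worklist suggestion guards against.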

Here is the latest patch that integrates Sean's and David Li's comments.
I'm working on the test cases. One thing I'm not clear about is that I need the PassManagerBuilder change to invoke the new functionality. That part of the code was split out of this patch following Manman's review comments.

-Rong

Some more small comments.

lib/Transforms/Instrumentation/CFGMST.h
94	Use a real variable.
192	llvm::make_unique
194	`return *AllEdges.back()`
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
32	Give intuition for why it is edges not in the MST rather than edges in the MST? Edges with high counts should have high weight and therefore not be in the MST (which tries for minimum weight); we don't want to instrument edges with high counts, and they are not in the MST, so why would we place counters there?
149	Please cite the Knuth paper. Also explain what we actually do with the MST (and give intuition for why that makes sense (it's fairly simple, should not need a long explanation)).
294	Make these two variable just local variables of `instrumentOnFunc` free function.
307	This is only called in one place, inline the code into `instrumentOnFunc`.
474	Can you use continue to reduce indentation for the "large" side of the `if`? ( http://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code )
514	Do you mean to do a copy here? Probably `std::vector<uint64_t> &` is intended.
519	`e` instead of `E`, as is common in LLVM.
544	`getBBInfo(Ei->SrcBB)`. You shouldn't need a cast here (if you do, then please fix whatever code is requiring this to cast away constness).
567	Just make the iteration variable `BB` and use `getBBInfo(&BB)`, like you do below in the loop `// Assert every BB has a valid counter.`.
634	Remove the const_cast (I think this will require making GetSuccessorNumber take `const BasicBlock *`, which you can do as a separate patch (no need for pre-commit review))
653	Do you mean `instrumentOneFunc`? `instrumentOnFunc` doesn't make sense (weird use of "on").

Thanks for the suggestion. All fixed. Please refer to the newest patch for the update.

Updated with Sean's new comments.
Also includes the unit tests.

Thanks,

-Rong

Can you also add a script to regenerate the binary profile data needed whenever the profile format changes?

These .ll files are the complete program IR that can generate the
profiles. I can embed the commands in the comments of the .ll files. Just
to make sure: do you want a separate shell script in the same directory to
generate the profiles?

Thanks,

-Rong

These test files look like they are just a dump of IR generated from C/C++, which is extremely verbose and has a lot of inessential details. I also don't like checking in binary profdata files. Overall these tests seem extremely brittle. I also don't understand why the test files have names like "for" or "goto" or "ifelse". Those concepts don't exist in the IR. Surely the tests should have names like "criticaledge" etc.

It seems like this testing might be more readable as C++ unit tests. Using the new pass manager this should be easy to wire up. I think the hardest part would be to stub out IndexedInstrProfReader.

I thought about the size of the test cases too. Initially I had the same
concern, but on second look, they do not seem to be brittle -- because they
just check whether the key branches are properly annotated with the right
profile count, or whether the profile counter update instruction is
inserted.

I think adding C++ unit tests is an independent task which can be done once
the driver part of the changes lands.

David

I changed the regression tests based on the feedback from Sean, David and Justin. The IR is much simplified and the profile uses the text format.
I still keep the target triple as x86_64-unknown-linux-gnu, because this is the only platform I tested. I will try to relax it later.

Thanks,

-Rong

Looks good with the minor fix.

lib/Transforms/Instrumentation/PGOInstrumentation.cpp
310	Remove unused code.

Closed by commit rL254021: [PGO] MST based PGO instrumentation infrastructure (authored by xur). Nov 24 2015, 1:34 PM

This revision was automatically updated to reflect the committed changes.

Seems it behaves differently on i686. Investigating.

See also: http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3865
(It is configured as "gcc -m32")

Hi Rong,

Typically, when there have been multiple active reviewers, it is common
courtesy to wait for all of them to LGTM the patch.

I had further comments on the testing here (which the bots seem to have
caught you on anyway), so I recommend reverting this patch for the moment
and reopening the review.

Sean Silva

Fixed a few issues that were exposed by the buildbots:
(1) a wrong shift operation in the function hash computation,
(2) a missing type cast for vector<...>::size_type (which results in a bad function hash on an -m32 host),
(3) sorting that is not stable.
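
Two of these pitfalls can be illustrated in isolation. The sketch below is an assumption about the general shape of such code, not the code in the patch: `combineHash` widens a `size_type` value to 64 bits before shifting (shifting a 32-bit value by 32 is undefined behavior, so a 32-bit host would otherwise compute a different hash), and `sortEdgesByWeight` uses `std::stable_sort` so that edges with equal weights keep the same relative order on every host.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Packs an edge count into the high half and a 32-bit CRC into the low half.
// The cast happens before the shift so that the shift is done in 64 bits
// regardless of the width of size_type on the host.
uint64_t combineHash(std::vector<int>::size_type NumEdges, uint32_t CRC) {
  return (static_cast<uint64_t>(NumEdges) << 32) | CRC;
}

// Hypothetical stand-in for a CFG edge.
struct Edge {
  int Src;
  int Dst;
  uint64_t Weight;
};

// Sorts heaviest first. std::sort may order equal-weight edges differently
// across standard library implementations; std::stable_sort does not.
void sortEdgesByWeight(std::vector<Edge> &Edges) {
  std::stable_sort(
      Edges.begin(), Edges.end(),
      [](const Edge &A, const Edge &B) { return A.Weight > B.Weight; });
}
```

Both properties matter for PGO because the instrumented build and the profile-use build must agree on the hash and on which edges end up in the MST.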

Also improved the test checks as suggested by Sean.

Thanks,

-Rong

I took another pass through, so some of the comments are nits. But most of the comments on the tests in particular are quite significant and not just "nits".

lib/Transforms/IPO/LLVMBuild.txt
23 ↗	(On Diff #41175)	Why does IPO now depend on Instrumentation, but didn't previously? What is different about the PGO passes vs the other instrumentation passes?
lib/Transforms/Instrumentation/CFGMST.h
129	Naming convention.
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
2	The length of this line does not match the one below.
12	typo: two spaces after 'the' should be just one space.
32	Why are they mutually exclusive? Weren't you wanting to use the count profile in the MST computation eventually?
35	typo: "is done"
42	class `PGOGenFunc` is no longer present, please update the comment.
46	These header comments easily go out of date during patch review. I would recommend reading the comment from top to bottom and verifying that it is up to date.
87	This option is only for testing, so please rename it to make that clear (also update the description).
281	Do you still need these const_cast<>'s after r253733?
494	You say "edges" (plural) here but then say "There should be one and only one".
test/Transforms/PGOProfile/branch1_gen.ll
7 ↗	(On Diff #41175)	This would be more readable with a non-mangled name. (same for the other tests)
test/Transforms/PGOProfile/branch2_use.ll
10 ↗	(On Diff #41175)	Verify that it is attached to the instruction you expect (CHECK-SAME may be useful). (same for the other "use" tests)
test/Transforms/PGOProfile/criticaledge_gen.ll
53 ↗	(On Diff #41175)	Will this test fail if the call ends up in the previous or next BB? I would recommend, for each BB, having a CHECK line verifying the BB name, followed by either CHECK or CHECK-NOT verifying the presence or absence of the increment call.
test/Transforms/PGOProfile/criticaledge_use.ll
10 ↗	(On Diff #41175)	Please have the names consistent with the switch values.
25 ↗	(On Diff #41175)	This test case can be simplified further. At the very least, the function `bar` is not needed.
49 ↗	(On Diff #41175)	The convention for FileCheck variable captures is all upper case (e.g. `[[BW2:[0-9]+]]`). Also, can you use more semantic names instead of numbers? E.g. `BW_<name>` instead of `BW<number>`.
test/Transforms/PGOProfile/loop2_use.ll
5 ↗	(On Diff #41175)	What part of the code is this test trying to cover that is not covered by loop3 or loop1? Do we need this test?
test/Transforms/PGOProfile/switch_gen.ll
6 ↗	(On Diff #41175)	You can merge the _gen and _use files by using FileCheck's option --check-prefix (e.g. --check-prefix=GEN or --check-prefix=USE). Actually, doing this is necessary so that there is a single point of truth for the IR used by both.
test/Transforms/PGOProfile/switch_use.ll
16 ↗	(On Diff #41175)	These `add` instructions are not needed. Just make the function return void.

Thanks to Sean for the suggestions. They indeed make the tests cleaner and more robust. I integrated his review comments into the latest patch, which I'll post soon.

lib/Transforms/IPO/LLVMBuild.txt
23 ↗	(On Diff #41175)	Thanks for catching this. This should not be there -- it is only needed for the PassManagerBuilder change.
test/Transforms/PGOProfile/criticaledge_use.ll
25 ↗	(On Diff #41175)	bar is a file-static function and was used to check that the source name is part of the profile variable name.

Integrated Sean's most recent review comments.

The tests are looking great! A couple nits and a final question about the test.

test/Transforms/PGOProfile/branch1.ll
32	Can you use CHECK-DAG directive to move these next to the place where the `BW_*` capture occurs? That would improve locality and clarify what is being checked. (same in the other files)
test/Transforms/PGOProfile/checksum_mismatch.ll
4 ↗	(On Diff #41428)	Tiny nit: could you name all the tests which are just checking diagnostics to match `diag_*.ll`? (or whatever naming convention seems appropriate) That will help to clearly identify them.
test/Transforms/PGOProfile/loop2.ll
10	What is the importance of testing a nested for loop? What part of the code are you trying to exercise that isn't covered by loop1.ll?

davidxl added inline comments. Nov 30 2015, 8:38 PM

test/Transforms/PGOProfile/loop2.ll
10	Having loop nest in the test for better coverage is fine -- but looks like we don't actually need 3-deep nest -- a 2-deep loop nest is good enough and will be easier to read. loop1.ll has a loop with top test. How about a loop test with bottom testing. Also a loop with more control flow inside the body, and a loop with early exit might be nice to have -- but those can be added later as follow ups if needed.

@silvas:
It's not clear how I can use CHECK-DAG to group together the branch weight
metadata and the related instruction. If there is a single instruction and
a single metadata node, I can move the branch weight metadata up next to the
instruction. But if there are multiple, I don't know how to do it. From
the documentation, CHECK-DAG can be used between two matches (or before the
first match, or after the last match). Moving the branch-weight metadata and
using USE-DAG does not work.

I also tried to change all USE to USE-DAG (except the first and last). That
does not work either.

As for loop2, it is just a more complex data flow. I'll change it to a
two-level nested loop according to David's suggestion.

@david:
I used to have a bottom-test loop, but I removed it based on silvas's
suggestion. I'll add more tests later if necessary.

Integrated most recent review comments from Sean, David and Justin.
Let me know if I missed anything.

Thanks,

-Rong

Sean, Justin and David: Do you have any comments on the latest patch?
If it looks fine to you, I plan to submit it again this week.

Thanks!

-Rong

MaskRay mentioned this in D104060: Machine IR Profile. Jun 14 2021, 5:00 PM

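Revision Contents

Path	Size
include/llvm/IR/DiagnosticInfo.h	26 lines
include/llvm/InitializePasses.h	2 lines
include/llvm/LinkAllPasses.h	2 lines
include/llvm/Transforms/Instrumentation.h	23 lines
lib/IR/DiagnosticInfo.cpp	6 lines
lib/Transforms/Instrumentation/CFGMST.h	217 lines
lib/Transforms/Instrumentation/CMakeLists.txt	1 line
lib/Transforms/Instrumentation/Instrumentation.cpp	2 lines
lib/Transforms/Instrumentation/LLVMBuild.txt	2 lines
lib/Transforms/Instrumentation/PGOInstrumentation.cpp	718 lines
test/Transforms/PGOProfile/Inputs/branch1.proftext	6 lines
test/Transforms/PGOProfile/Inputs/branch2.proftext	6 lines
test/Transforms/PGOProfile/Inputs/criticaledge.proftext	17 lines
test/Transforms/PGOProfile/Inputs/diag.proftext	5 lines
test/Transforms/PGOProfile/Inputs/landingpad.proftext	14 lines
test/Transforms/PGOProfile/Inputs/loop1.proftext	6 lines
test/Transforms/PGOProfile/Inputs/loop2.proftext	7 lines
test/Transforms/PGOProfile/Inputs/switch.proftext	8 lines
test/Transforms/PGOProfile/branch1.ll	30 lines
test/Transforms/PGOProfile/branch2.ll	37 lines
test/Transforms/PGOProfile/criticaledge.ll	108 lines
test/Transforms/PGOProfile/diag_mismatch.ll	12 lines
test/Transforms/PGOProfile/diag_no_funcprofdata.ll	12 lines
test/Transforms/PGOProfile/diag_no_profile.ll	9 lines
test/Transforms/PGOProfile/landingpad.ll	124 lines
test/Transforms/PGOProfile/loop1.ll	42 lines
test/Transforms/PGOProfile/loop2.ll	70 lines
test/Transforms/PGOProfile/single_bb.ll	12 lines
test/Transforms/PGOProfile/switch.ll	47 lines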

Diff 41686

include/llvm/IR/DiagnosticInfo.h

[54 lines hidden; context: enum DiagnosticKind {]
   DK_SampleProfile,
   DK_OptimizationRemark,
   DK_OptimizationRemarkMissed,
   DK_OptimizationRemarkAnalysis,
   DK_OptimizationRemarkAnalysisFPCommute,
   DK_OptimizationRemarkAnalysisAliasing,
   DK_OptimizationFailure,
   DK_MIRParser,
+  DK_PGOProfile,
   DK_FirstPluginKind
 };

 /// \brief Get the next available kind ID for a plugin diagnostic.
 /// Each time this function is called, it returns a different number.
 /// Therefore, a plugin that wants to "identify" its own classes
 /// with a dynamic identifier, just have to use this method to get a new ID
 /// and assign it to each of its classes.
[174 lines hidden; context: private:]
   /// Line number where the diagnostic occurred. If 0, no line number will
   /// be emitted in the message.
   unsigned LineNum;

   /// Message to report.
   const Twine &Msg;
 };

+/// Diagnostic information for the PGO profiler.
+class DiagnosticInfoPGOProfile : public DiagnosticInfo {
+public:
+  DiagnosticInfoPGOProfile(const char *FileName, const Twine &Msg,
+                           DiagnosticSeverity Severity = DS_Error)
+      : DiagnosticInfo(DK_PGOProfile, Severity), FileName(FileName), Msg(Msg) {}
+
+  /// \see DiagnosticInfo::print.
+  void print(DiagnosticPrinter &DP) const override;
+
+  static bool classof(const DiagnosticInfo *DI) {
+    return DI->getKind() == DK_PGOProfile;
+  }
+
+  const char *getFileName() const { return FileName; }
+  const Twine &getMsg() const { return Msg; }
+
+private:
+  /// Name of the input file associated with this diagnostic.
+  const char *FileName;
+
+  /// Message to report.
+  const Twine &Msg;
+};
+
 /// Common features for diagnostics dealing with optimization remarks.
 class DiagnosticInfoOptimizationBase : public DiagnosticInfo {
 public:
   /// \p PassName is the name of the pass emitting this diagnostic.
   /// \p Fn is the function where the diagnostic is being emitted. \p DLoc is
   /// the location information to use in the diagnostic. If line table
   /// information is available, the diagnostic will include the source code
   /// location. \p Msg is the message to show. Note that this class does not
[304 lines hidden]

include/llvm/InitializePasses.h

[111 lines hidden]
 void initializeDomViewerPass(PassRegistry&);
 void initializeDominanceFrontierPass(PassRegistry&);
 void initializeDominatorTreeWrapperPassPass(PassRegistry&);
 void initializeEarlyIfConverterPass(PassRegistry&);
 void initializeEdgeBundlesPass(PassRegistry&);
 void initializeExpandPostRAPass(PassRegistry&);
 void initializeAAResultsWrapperPassPass(PassRegistry &);
 void initializeGCOVProfilerPass(PassRegistry&);
+void initializePGOInstrumentationGenPass(PassRegistry&);
+void initializePGOInstrumentationUsePass(PassRegistry&);
 void initializeInstrProfilingPass(PassRegistry&);
 void initializeAddressSanitizerPass(PassRegistry&);
 void initializeAddressSanitizerModulePass(PassRegistry&);
 void initializeMemorySanitizerPass(PassRegistry&);
 void initializeThreadSanitizerPass(PassRegistry&);
 void initializeSanitizerCoverageModulePass(PassRegistry&);
 void initializeDataFlowSanitizerPass(PassRegistry&);
 void initializeScalarizerPass(PassRegistry&);
manmanren (inline comment, Not Done): Nit: it is a little strange to change some of the formatting.
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
 void initializeEliminateAvailableExternallyPass(PassRegistry&);
 void initializeExpandISelPseudosPass(PassRegistry&);
 void initializeFunctionAttrsPass(PassRegistry&);
 void initializeGCMachineCodeAnalysisPass(PassRegistry&);
 void initializeGCModuleInfoPass(PassRegistry&);
 void initializeGVNPass(PassRegistry&);
 void initializeGlobalDCEPass(PassRegistry&);
[173 lines hidden]

include/llvm/LinkAllPasses.h

[79 lines hidden; context: ForcePassLinking() {]
       (void) llvm::createDeadStoreEliminationPass();
       (void) llvm::createDependenceAnalysisPass();
       (void) llvm::createDivergenceAnalysisPass();
       (void) llvm::createDomOnlyPrinterPass();
       (void) llvm::createDomPrinterPass();
       (void) llvm::createDomOnlyViewerPass();
       (void) llvm::createDomViewerPass();
       (void) llvm::createGCOVProfilerPass();
+      (void) llvm::createPGOInstrumentationGenPass();
+      (void) llvm::createPGOInstrumentationUsePass();
       (void) llvm::createInstrProfilingPass();
       (void) llvm::createFunctionInliningPass();
       (void) llvm::createAlwaysInlinerPass();
       (void) llvm::createGlobalDCEPass();
       (void) llvm::createGlobalOptimizerPass();
       (void) llvm::createGlobalsAAWrapperPass();
       (void) llvm::createIPConstantPropagationPass();
       (void) llvm::createIPSCCPPass();
[101 lines hidden]

include/llvm/Transforms/Instrumentation.h

[73 lines hidden; context: struct GCOVOptions {]

   // Emit the exit block immediately after the start block, rather than after
   // all of the function body's blocks.
   bool ExitBlockBeforeBody;
 };
 ModulePass *createGCOVProfilerPass(const GCOVOptions &Options =
                                    GCOVOptions::getDefault());

+// PGO Instrumention
+ModulePass *createPGOInstrumentationGenPass();
+ModulePass *
+createPGOInstrumentationUsePass(StringRef Filename = StringRef(""));
+
 /// Options for the frontend instrumentation based profiling pass.
 struct InstrProfOptions {
   InstrProfOptions() : NoRedZone(false) {}

   // Add the 'noredzone' attribute to added runtime library calls.
   bool NoRedZone;

   // Name of the profile file to use as output
[54 lines hidden]
 // BoundsChecking - This pass instruments the code to perform run-time bounds
 // checking on loads, stores, and other memory intrinsics.
 FunctionPass *createBoundsCheckingPass();

 /// \brief This pass splits the stack into a safe stack and an unsafe stack to
 /// protect against stack-based overflow vulnerabilities.
 FunctionPass *createSafeStackPass(const TargetMachine *TM = nullptr);

+/// \brief Calculate what to divide by to scale counts.
+///
+/// Given the maximum count, calculate a divisor that will scale all the
+/// weights to strictly less than UINT32_MAX.
+static inline uint64_t calculateCountScale(uint64_t MaxCount) {
+  return MaxCount < UINT32_MAX ? 1 : MaxCount / UINT32_MAX + 1;
+}
+
+/// \brief Scale an individual branch count.
+///
+/// Scale a 64-bit weight down to 32-bits using \c Scale.
+///
+static inline uint32_t scaleBranchCount(uint64_t Count, uint64_t Scale) {
+  uint64_t Scaled = Count / Scale;
+  assert(Scaled <= UINT32_MAX && "overflow 32-bits");
+  return Scaled;
+}
+
 } // End llvm namespace

 #endif
silvas (inline comment, Done): http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly (here and elsewhere)
silvas (inline comment, Done): Use correct capitalization http://llvm.org/docs/CodingStandards.html#commenting
davidxl (inline comment, Done): An interface is available in ProfileData/InstrProf.h: llvm::getPGOFuncName which is also used by cfe. Remove this dup.
davidxl (inline comment, Done): Similarly llvm::createPGOFuncNameVar will do what you need. Remove this duplicate.
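
As a worked illustration of the two scaling helpers added to Instrumentation.h above, the sketch below reproduces them (with an explicit cast on the return) and adds a hypothetical driver, `scaleCounts`, which is not part of the patch. The sample counts in the note that follows are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// From the patch: choose a divisor that brings every count below UINT32_MAX.
static inline uint64_t calculateCountScale(uint64_t MaxCount) {
  return MaxCount < UINT32_MAX ? 1 : MaxCount / UINT32_MAX + 1;
}

// From the patch: scale one 64-bit count down to 32 bits.
static inline uint32_t scaleBranchCount(uint64_t Count, uint64_t Scale) {
  uint64_t Scaled = Count / Scale;
  assert(Scaled <= UINT32_MAX && "overflow 32-bits");
  return static_cast<uint32_t>(Scaled);
}

// Hypothetical helper (not in the patch): scale a whole set of counts by the
// divisor derived from the largest one.
static std::vector<uint32_t> scaleCounts(const std::vector<uint64_t> &Counts) {
  std::vector<uint32_t> Result;
  if (Counts.empty())
    return Result;
  uint64_t Scale =
      calculateCountScale(*std::max_element(Counts.begin(), Counts.end()));
  for (uint64_t C : Counts)
    Result.push_back(scaleBranchCount(C, Scale));
  return Result;
}
```

With counts 10000000000, 5000000000 and 7, the maximum exceeds UINT32_MAX (4294967295), so the divisor is 10000000000 / 4294967295 + 1 = 3 and the scaled weights are 3333333333, 1666666666 and 2. Small counts lose precision under integer division and can scale to 0.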

lib/IR/DiagnosticInfo.cpp

[126 lines hidden; context: if (!FileName.empty()) {]
     DP << getFileName();
     if (LineNum > 0)
       DP << ":" << getLineNum();
     DP << ": ";
   }
   DP << getMsg();
 }

+void DiagnosticInfoPGOProfile::print(DiagnosticPrinter &DP) const {
+  if (getFileName())
+    DP << getFileName() << ": ";
+  DP << getMsg();
+}
+
 bool DiagnosticInfoOptimizationBase::isLocationAvailable() const {
   return getDebugLoc();
 }

 void DiagnosticInfoOptimizationBase::getLocation(StringRef *Filename,
                                                  unsigned *Line,
                                                  unsigned *Column) const {
   DILocation *L = getDebugLoc();
[95 lines hidden]

lib/Transforms/Instrumentation/CFGMST.h

This file was added.

				//===-- CFGMST.h - Minimum Spanning Tree for CFG ----------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements a Union-find algorithm to compute Minimum Spanning Tree
				// for a given CFG.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/Analysis/BlockFrequencyInfo.h"
				#include "llvm/Analysis/BranchProbabilityInfo.h"
				#include "llvm/Analysis/CFG.h"
				#include "llvm/Support/BranchProbability.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/raw_ostream.h"
				#include "llvm/Transforms/Utils/BasicBlockUtils.h"
				#include <string>
				#include <utility>
				#include <vector>

				namespace llvm {

				#define DEBUG_TYPE "cfgmst"

				/// \brief An union-find based Minimum Spanning Tree for CFG
				///
				/// Implements a Union-find algorithm to compute Minimum Spanning Tree
				/// for a given CFG.
				template <class Edge, class BBInfo> class CFGMST {
				public:
				Function &F;

				// Store all the edges in CFG. It may contain some stale edges
				silvas (inline comment, Done): Use a lambda.
				// when Removed is set.
				std::vector<std::unique_ptr<Edge>> AllEdges;

				// This map records the auxiliary information for each BB.
				DenseMap<const BasicBlock *, std::unique_ptr<BBInfo>> BBInfos;

				// Find the root group of the G and compress the path from G to the root.
				BBInfo *findAndCompressGroup(BBInfo *G) {
				if (G->Group != G)
				G->Group = findAndCompressGroup(static_cast<BBInfo *>(G->Group));
				return static_cast<BBInfo *>(G->Group);
				}

				// Union BB1 and BB2 into the same group and return true.
				// Returns false if BB1 and BB2 are already in the same group.
				bool unionGroups(const BasicBlock *BB1, const BasicBlock *BB2) {
				BBInfo *BB1G = findAndCompressGroup(&getBBInfo(BB1));
				BBInfo *BB2G = findAndCompressGroup(&getBBInfo(BB2));

				if (BB1G == BB2G)
				return false;

				// Make the smaller rank tree a direct child or the root of high rank tree.
				if (BB1G->Rank < BB2G->Rank)
				BB1G->Group = BB2G;
				else {
				BB2G->Group = BB1G;
				// If the ranks are the same, increment root of one tree by one.
				if (BB1G->Rank == BB2G->Rank)
				BB1G->Rank++;
				}
				return true;
				}

				// Give BB, return the auxiliary information.
				BBInfo &getBBInfo(const BasicBlock *BB) const {
				auto It = BBInfos.find(BB);
				assert(It->second.get() != nullptr);
				return *It->second.get();
				}

				// Traverse the CFG using a stack. Find all the edges and assign the weight.
				// Edges with large weight will be put into MST first so they are less likely
				// to be instrumented.
				void buildEdges() {
				DEBUG(dbgs() << "Build Edge on " << F.getName() << "\n");

				const BasicBlock *BB = &(F.getEntryBlock());
				uint64_t EntryWeight = (BFI != nullptr ? BFI->getEntryFreq() : 2);
				// Add a fake edge to the entry.
				addEdge(nullptr, BB, EntryWeight);

				// Special handling for single BB functions.
				if (succ_empty(BB)) {
				silvas (inline comment, Done): Use a real variable.
				addEdge(BB, nullptr, EntryWeight);
				return;
				}

				static const uint32_t CriticalEdgeMultiplier = 1000;

				for (Function::iterator BB = F.begin(), E = F.end(); BB != E; ++BB) {
				TerminatorInst *TI = BB->getTerminator();
				uint64_t BBWeight =
				(BFI != nullptr ? BFI->getBlockFreq(&*BB).getFrequency() : 2);
				uint64_t Weight = 2;
				if (int successors = TI->getNumSuccessors()) {
				for (int i = 0; i != successors; ++i) {
				BasicBlock *TargetBB = TI->getSuccessor(i);
				bool Critical = isCriticalEdge(TI, i);
				uint64_t scaleFactor = BBWeight;
				if (Critical) {
				silvas (inline comment, Done): Move the declaration for `bool Critical = false;` down one line and simplify to `bool Critical = isCriticalEdge(TI, i);`
				xur (author, inline comment, Not Done): reorganized the code, also handles overflow.
				if (scaleFactor < UINT64_MAX / CriticalEdgeMultiplier)
				scaleFactor *= CriticalEdgeMultiplier;
				else
				scaleFactor = UINT64_MAX;
				}
				if (BPI != nullptr)
				Weight = BPI->getEdgeProbability(&*BB, TargetBB).scale(scaleFactor);
				addEdge(&*BB, TargetBB, Weight).IsCritical = Critical;
				DEBUG(dbgs() << " Edge: from " << BB->getName() << " to "
				<< TargetBB->getName() << " w=" << Weight << "\n");
				}
				} else {
				addEdge(&*BB, nullptr, BBWeight);
				DEBUG(dbgs() << " Edge: from " << BB->getName() << " to exit"
				<< " w = " << BBWeight << "\n");
				}
				}
				}
				silvas (inline comment, Done): Naming convention.

				// Sort CFG edges based on its weight.
				void sortEdgesByWeight() {
				std::stable_sort(AllEdges.begin(), AllEdges.end(),
				[](const std::unique_ptr<Edge> &Edge1,
				const std::unique_ptr<Edge> &Edge2) {
				return Edge1->Weight > Edge2->Weight;
				});
				}

				silvas (inline comment, Done): No need to mention clang here. Please file a bug report for this if there isn't one open already.
				// Traverse all the edges and compute the Minimum Weight Spanning Tree
				// using union-find algorithm.
				void computeMinimumSpanningTree() {
				// First, put all the critical edge with landing-pad as the Dest to MST.
				// This works around the insufficient support of critical edges split
				// when destination BB is a landing pad.
				for (auto &Ei : AllEdges) {
				if (Ei->Removed)
				continue;
				if (Ei->IsCritical) {
				if (Ei->DestBB && Ei->DestBB->isLandingPad()) {
				if (unionGroups(Ei->SrcBB, Ei->DestBB))
				Ei->InMST = true;
				}
				}
				}

				for (auto &Ei : AllEdges) {
				if (Ei->Removed)
				continue;
				if (unionGroups(Ei->SrcBB, Ei->DestBB))
				Ei->InMST = true;
				}
				silvas (inline comment, Done): Factor the DEBUG macro usage into a single point of truth here. E.g. factor out a function `printEdges(raw_ostream &OS, const StringRef Message)`. Then change this function to be simply `DEBUG(dumpEdges(dbgs(), Message))`. (this is similar to the pattern used by Value, for example). This will also give clang-format an easier time.
				}

				// Dump the Debug information about the instrumentation.
				void dumpEdges(raw_ostream &OS, const Twine &Message) const {
				if (!Message.str().empty())
				OS << Message << "\n";
				OS << " Number of Basic Blocks: " << BBInfos.size() << "\n";
				for (auto &BI : BBInfos) {
				const BasicBlock *BB = BI.first;
				OS << " BB: " << (BB == nullptr ? "FakeNode" : BB->getName()) << " "
				<< BI.second->infoString() << "\n";
				}

				OS << " Number of Edges: " << AllEdges.size()
				<< " (*: Instrument, C: CriticalEdge, -: Removed)\n";
				uint32_t Count = 0;
				for (auto &EI : AllEdges)
				OS << " Edge " << Count++ << ": " << getBBInfo(EI->SrcBB).Index << "-->"
				<< getBBInfo(EI->DestBB).Index << EI->infoString() << "\n";
				}

				// Add an edge to AllEdges with weight W.
				Edge &addEdge(const BasicBlock *Src, const BasicBlock *Dest, uint64_t W) {
				uint32_t Index = BBInfos.size();
				auto Iter = BBInfos.end();
				silvas (inline comment, Done): Use std::tie
				bool Inserted;
				std::tie(Iter, Inserted) = BBInfos.insert(std::make_pair(Src, nullptr));
				if (Inserted) {
				// Newly inserted, update the real info.
				Iter->second = std::move(llvm::make_unique<BBInfo>(Index));
				silvas (inline comment, Done): llvm::make_unique
				Index++;
				}
				silvas (inline comment, Done): `return *AllEdges.back()`
				std::tie(Iter, Inserted) = BBInfos.insert(std::make_pair(Dest, nullptr));
				if (Inserted)
				// Newly inserted, update the real info.
				Iter->second = std::move(llvm::make_unique<BBInfo>(Index));
				AllEdges.emplace_back(new Edge(Src, Dest, W));
				return *AllEdges.back();
				}

				BranchProbabilityInfo *BPI;
				BlockFrequencyInfo *BFI;

				public:
				CFGMST(Function &Func, BranchProbabilityInfo *BPI_ = nullptr,
				silvas (inline comment, Done): Avoid using reserved names.
				BlockFrequencyInfo *BFI_ = nullptr)
				: F(Func), BPI(BPI_), BFI(BFI_) {
				buildEdges();
				sortEdgesByWeight();
				computeMinimumSpanningTree();
				}
				};

				#undef DEBUG_TYPE // "cfgmst"
				} // end namespace llvm
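
The approach taken in CFGMST.h above (sort the edges by descending weight, then grow the tree with union-find) can be sketched on plain integers. This is a simplified standalone illustration rather than the patch's code: `SimpleEdge`, `Groups` and `buildMaxSpanningTree` are hypothetical names, nodes are integer indices instead of BasicBlock pointers, and the landing-pad pre-pass is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the patch's Edge type: nodes are plain indices.
struct SimpleEdge {
  int Src;
  int Dest;
  uint64_t Weight;
  bool InMST;
};

// Union-find with path compression and union by rank.
class Groups {
  std::vector<int> Parent;
  std::vector<int> Rank;

public:
  explicit Groups(int NumNodes) : Parent(NumNodes), Rank(NumNodes, 0) {
    std::iota(Parent.begin(), Parent.end(), 0); // every node is its own root
  }

  int find(int X) {
    assert(X >= 0 && static_cast<std::size_t>(X) < Parent.size());
    if (Parent[X] != X)
      Parent[X] = find(Parent[X]); // compress the path to the root
    return Parent[X];
  }

  // Returns false if A and B were already in the same group.
  bool unite(int A, int B) {
    int RootA = find(A), RootB = find(B);
    if (RootA == RootB)
      return false;
    if (Rank[RootA] < Rank[RootB])
      Parent[RootA] = RootB;
    else {
      Parent[RootB] = RootA;
      if (Rank[RootA] == Rank[RootB])
        ++Rank[RootA];
    }
    return true;
  }
};

// Sorts edges by descending weight, marks the spanning-tree edges, and
// returns how many edges stayed off the tree (the ones needing counters).
static std::size_t buildMaxSpanningTree(std::vector<SimpleEdge> &Edges,
                                        int NumNodes) {
  std::stable_sort(Edges.begin(), Edges.end(),
                   [](const SimpleEdge &A, const SimpleEdge &B) {
                     return A.Weight > B.Weight;
                   });
  Groups G(NumNodes);
  std::size_t OffTree = 0;
  for (SimpleEdge &E : Edges) {
    E.InMST = G.unite(E.Src, E.Dest);
    if (!E.InMST)
      ++OffTree;
  }
  return OffTree;
}
```

For a diamond-shaped CFG with a fake node 0 (edges 0->1, 1->2, 1->3, 2->4, 3->4, 4->0), five nodes and six edges leave two edges off the tree, which matches the usual E - (N - 1) count for a connected graph.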

lib/Transforms/Instrumentation/CMakeLists.txt

 add_llvm_library(LLVMInstrumentation
   AddressSanitizer.cpp
   BoundsChecking.cpp
   DataFlowSanitizer.cpp
   GCOVProfiling.cpp
   MemorySanitizer.cpp
   Instrumentation.cpp
   InstrProfiling.cpp
+  PGOInstrumentation.cpp
   SafeStack.cpp
   SanitizerCoverage.cpp
   ThreadSanitizer.cpp

   ADDITIONAL_HEADER_DIRS
   ${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms
   )

 add_dependencies(LLVMInstrumentation intrinsics_gen)

lib/Transforms/Instrumentation/Instrumentation.cpp

[54 lines hidden]

 /// initializeInstrumentation - Initialize all passes in the TransformUtils
 /// library.
 void llvm::initializeInstrumentation(PassRegistry &Registry) {
   initializeAddressSanitizerPass(Registry);
   initializeAddressSanitizerModulePass(Registry);
   initializeBoundsCheckingPass(Registry);
   initializeGCOVProfilerPass(Registry);
+  initializePGOInstrumentationGenPass(Registry);
+  initializePGOInstrumentationUsePass(Registry);
   initializeInstrProfilingPass(Registry);
   initializeMemorySanitizerPass(Registry);
   initializeThreadSanitizerPass(Registry);
   initializeSanitizerCoverageModulePass(Registry);
   initializeDataFlowSanitizerPass(Registry);
   initializeSafeStackPass(Registry);
 }

 /// LLVMInitializeInstrumentation - C binding for
 /// initializeInstrumentation.
 void LLVMInitializeInstrumentation(LLVMPassRegistryRef R) {
   initializeInstrumentation(*unwrap(R));
 }

lib/Transforms/Instrumentation/LLVMBuild.txt

[13 lines hidden]
 ; http://llvm.org/docs/LLVMBuild.html
 ;
 ;===------------------------------------------------------------------------===;

 [component_0]
 type = Library
 name = Instrumentation
 parent = Transforms
-required_libraries = Analysis Core MC Support TransformUtils
+required_libraries = Analysis Core MC Support TransformUtils ProfileData

lib/Transforms/Instrumentation/PGOInstrumentation.cpp

This file was added.

				//===-- PGOInstrumentation.cpp - MST-based PGO Instrumentation ------------===//
				//
				silvas (inline comment, Done): The length of this line does not match the one below.
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements PGO instrumentation using a minimum spanning tree based
				// on the following paper:
				// [1] Donald E. Knuth, Francis R. Stevenson. Optimal measurement of points
				silvas (inline comment, Done): typo: two spaces after 'the' should be just one space.
				// for program frequency counts. BIT Numerical Mathematics 1973, Volume 13,
				// Issue 3, pp 313-322
				// The idea of the algorithm based on the fact that for each node (except for
				// the entry and exit), the sum of incoming edge counts equals the sum of
				// outgoing edge counts. The count of edge on spanning tree can be derived from
				// those edges not on the spanning tree. Knuth proves this method instruments
				// the minimum number of edges.
				//
				// The minimal spanning tree here is actually a maximum weight tree -- on-tree
				// edges have higher frequencies (more likely to execute). The idea is to
				// instrument those less frequently executed edges to reduce the runtime
				// overhead of instrumented binaries.
				//
				// This file contains two passes:
				// (1) Pass PGOInstrumentationGen which instruments the IR to generate edge
				// count profile, and
				// (2) Pass PGOInstrumentationUse which reads the edge count profile and
				// annotates the branch weights.
				// To get the precise counter information, These two passes need to invoke at
				// the same compilation point (so they see the same IR). For pass
				silvas (inline comment, Done): Give intuition for why it is edges not in the MST rather than edges in the MST? Edges with high counts should have high weight and therefore not be in the MST (which tries for minimum weight); we don't want to instrument edges with high counts, and they are not in the MST, so why would we place counters there?
				silvas (inline comment, Not Done): Why are they mutually exclusive? Weren't you wanting to use the count profile in MST computation eventually?
				// PGOInstrumentationGen, the real work is done in instrumentOneFunc(). For
				// pass PGOInstrumentationUse, the real work in done in class PGOUseFunc and
				// the profile is opened in module level and passed to each PGOUseFunc instance.
				silvas (inline comment, Not Done): typo: "is done"
				// The shared code for PGOInstrumentationGen and PGOInstrumentationUse is put
				// in class FuncPGOInstrumentation.
				//
				// Class PGOEdge represents a CFG edge and some auxiliary information. Class
				// BBInfo contains auxiliary information for each BB. These two classes are used
				// in pass PGOInstrumentationGen. Class PGOUseEdge and UseBBInfo are the derived
				// class of PGOEdge and BBInfo, respectively. They contains extra data structure
				silvas (inline comment, Done): class `PGOGenFunc` is no longer present, please update the comment.
				// used in populating profile counters.
				// The MST implementation is in Class CFGMST (CFGMST.h).
				//
				//===----------------------------------------------------------------------===//
				silvas (inline comment, Not Done): These header comments easily go out of date during patch review. I would recommend reading the comment from top to bottom and verifying that it is up to date.
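
The claim in the header comment above, that counts on spanning-tree edges can be derived from the instrumented edges, can be illustrated with a small standalone solver. This is a sketch under stated assumptions and not the code in this patch: `FlowEdge` and `propagateCounts` are hypothetical names, nodes are integer indices, a fake node closes the flow from exit back to entry, and the measured counts are assumed to be consistent so that the subtractions never underflow.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical edge record: Known is true for edges that carried a counter.
struct FlowEdge {
  int Src;
  int Dest;
  uint64_t Count;
  bool Known;
};

// Repeatedly applies "sum of incoming counts == sum of outgoing counts" at
// every node that has exactly one unknown incident edge. Returns true once
// every edge count is known.
static bool propagateCounts(std::vector<FlowEdge> &Edges, int NumNodes) {
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int N = 0; N < NumNodes; ++N) {
      uint64_t InSum = 0, OutSum = 0;
      int UnknownIn = 0, UnknownOut = 0;
      FlowEdge *Missing = nullptr;
      for (FlowEdge &E : Edges) {
        if (E.Dest == N) {
          if (E.Known)
            InSum += E.Count;
          else {
            ++UnknownIn;
            Missing = &E;
          }
        }
        if (E.Src == N) {
          if (E.Known)
            OutSum += E.Count;
          else {
            ++UnknownOut;
            Missing = &E;
          }
        }
      }
      if (UnknownIn + UnknownOut != 1)
        continue; // nothing to solve here yet, or already solved
      assert(Missing && "exactly one unknown edge expected");
      // The single unknown edge makes up the difference between both sides.
      Missing->Count = UnknownIn ? OutSum - InSum : InSum - OutSum;
      Missing->Known = true;
      Changed = true;
    }
  }
  for (const FlowEdge &E : Edges)
    if (!E.Known)
      return false;
  return true;
}
```

On a diamond CFG (fake node 0, entry 1, two arms 2 and 3, exit 4) with counters only on 2->4 (90) and 3->4 (10), the solver derives 1->2 = 90, 1->3 = 10, 4->0 = 100 and 0->1 = 100. Because the unknown edges form a tree, there is always a node with a single unknown edge to start from.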

				#include "llvm/Transforms/Instrumentation.h"
				#include "CFGMST.h"
				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/BlockFrequencyInfo.h"
				#include "llvm/Analysis/BranchProbabilityInfo.h"
				#include "llvm/Analysis/CFG.h"
				#include "llvm/IR/DiagnosticInfo.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/MDBuilder.h"
				#include "llvm/IR/Module.h"
				#include "llvm/Pass.h"
				#include "llvm/ProfileData/InstrProfReader.h"
				#include "llvm/Support/BranchProbability.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/JamCRC.h"
				#include "llvm/Transforms/Utils/BasicBlockUtils.h"
				#include <string>
				#include <utility>
				#include <vector>

				using namespace llvm;

				#define DEBUG_TYPE "pgo-instrumentation"

				STATISTIC(NumOfPGOInstrument, "Number of edges instrumented.");
				STATISTIC(NumOfPGOEdge, "Number of edges.");
				STATISTIC(NumOfPGOBB, "Number of basic-blocks.");
				STATISTIC(NumOfPGOSplit, "Number of critical edge splits.");
				STATISTIC(NumOfPGOFunc, "Number of functions having valid profile counts.");
				STATISTIC(NumOfPGOMismatch, "Number of functions having mismatch profile.");
				STATISTIC(NumOfPGOMissing, "Number of functions without profile.");

				// Command line option to specify the file to read profile from. This is
				// mainly used for testing.
				static cl::opt<std::string>
				silvas (inline comment, Done): This option is only for testing, so please rename it to make that clear (also update the description).
				PGOTestProfileFile("pgo-test-profile-file", cl::init(""), cl::Hidden,
				cl::value_desc("filename"),
				cl::desc("Specify the path of profile data file. This is"
				"mainly for test purpose."));

				namespace {
				class PGOInstrumentationGen : public ModulePass {
				public:
				static char ID;

				PGOInstrumentationGen() : ModulePass(ID) {
				initializePGOInstrumentationGenPass(*PassRegistry::getPassRegistry());
				}

				const char *getPassName() const override {
				return "PGOInstrumentationGenPass";
				}

				private:
				bool runOnModule(Module &M) override;

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<BlockFrequencyInfoWrapperPass>();
				}
				};

				class PGOInstrumentationUse : public ModulePass {
				public:
				static char ID;

				// Provide the profile filename as the parameter.
				PGOInstrumentationUse(std::string Filename = "")
				: ModulePass(ID), ProfileFileName(Filename) {
				if (!PGOTestProfileFile.empty())
				ProfileFileName = PGOTestProfileFile;
				initializePGOInstrumentationUsePass(*PassRegistry::getPassRegistry());
				}

				const char *getPassName() const override {
				return "PGOInstrumentationUsePass";
				}

				private:
				std::string ProfileFileName;
				std::unique_ptr<IndexedInstrProfReader> PGOReader;
				bool runOnModule(Module &M) override;

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<BlockFrequencyInfoWrapperPass>();
				}
				};
				} // end anonymous namespace

				char PGOInstrumentationGen::ID = 0;
				INITIALIZE_PASS_BEGIN(PGOInstrumentationGen, "pgo-instr-gen",
				"PGO instrumentation.", false, false)
				INITIALIZE_PASS_DEPENDENCY(BlockFrequencyInfoWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(BranchProbabilityInfoWrapperPass)
				INITIALIZE_PASS_END(PGOInstrumentationGen, "pgo-instr-gen",
				"PGO instrumentation.", false, false)

				ModulePass *llvm::createPGOInstrumentationGenPass() {
				silvas (Done): Please cite the Knuth paper. Also explain what we actually do with the MST (and give intuition for why that makes sense (it's fairly simple, should not need a long explanation)).
				return new PGOInstrumentationGen();
				}

				silvas (Done): Other places seem to use uint64_t for weight. Why `unsigned` here?
				xur (author, Not Done): Changed to uint64_t.
				char PGOInstrumentationUse::ID = 0;
				INITIALIZE_PASS_BEGIN(PGOInstrumentationUse, "pgo-instr-use",
				"Read PGO instrumentation profile.", false, false)
				INITIALIZE_PASS_DEPENDENCY(BlockFrequencyInfoWrapperPass)
				INITIALIZE_PASS_DEPENDENCY(BranchProbabilityInfoWrapperPass)
				INITIALIZE_PASS_END(PGOInstrumentationUse, "pgo-instr-use",
				"Read PGO instrumentation profile.", false, false)

				ModulePass *llvm::createPGOInstrumentationUsePass(StringRef Filename) {
				return new PGOInstrumentationUse(Filename.str());
				}

				namespace {
				/// \brief An MST based instrumentation for PGO
				///
				/// Implements a Minimum Spanning Tree (MST) based instrumentation for PGO
				/// in the function level.
				struct PGOEdge {
				// This class implements the CFG edges. Note the CFG can be a multi-graph.
				// So there might be multiple edges with same SrcBB and DestBB.
				const BasicBlock *SrcBB;
				const BasicBlock *DestBB;
				uint64_t Weight;
				bool InMST;
				bool Removed;
				bool IsCritical;
				silvas (Done): Does this actually need to be virtual? Same for all the other virtual methods in this file.
				PGOEdge(const BasicBlock *Src, const BasicBlock *Dest, unsigned W = 1)
				: SrcBB(Src), DestBB(Dest), Weight(W), InMST(false), Removed(false),
				IsCritical(false) {}
				// Return the information string of an edge.
				const std::string infoString() const {
				return (Twine(Removed ? "-" : " ") + (InMST ? " " : "*") +
				(IsCritical ? "c" : " ") + " W=" + Twine(Weight)).str();
				}
				};

				// This class stores the auxiliary information for each BB.
				struct BBInfo {
				BBInfo *Group;
				uint32_t Index;
				uint32_t Rank;

				BBInfo(unsigned IX) : Group(this), Index(IX), Rank(0) {}

				// Return the information string of this object.
				const std::string infoString() const {
				return (Twine("Index=") + Twine(Index)).str();
				}
				};
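The `Group` and `Rank` fields of `BBInfo` look like the parent link and rank of a union-find (disjoint-set) structure, which is the usual way to build a spanning tree edge by edge. CFGMST.h is not shown in this section, so the following is only a minimal sketch under that assumption; `SketchBBInfo`, `findAndCompressGroup` and `unionGroups` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for BBInfo: Group is the union-find parent link and
// Rank is an upper bound on the height of the tree rooted at this node.
struct SketchBBInfo {
  SketchBBInfo *Group;
  uint32_t Index;
  uint32_t Rank;
  explicit SketchBBInfo(uint32_t IX) : Group(this), Index(IX), Rank(0) {}
};

// Return the representative of G's set, compressing the path on the way.
inline SketchBBInfo *findAndCompressGroup(SketchBBInfo *G) {
  assert(G && "expected a valid node");
  if (G->Group != G)
    G->Group = findAndCompressGroup(G->Group);
  return G->Group;
}

// Merge the sets containing A and B (union by rank). Returns false when both
// already belong to the same set, i.e. an edge A-B would close a cycle and
// therefore cannot be part of the spanning tree.
inline bool unionGroups(SketchBBInfo *A, SketchBBInfo *B) {
  SketchBBInfo *RootA = findAndCompressGroup(A);
  SketchBBInfo *RootB = findAndCompressGroup(B);
  if (RootA == RootB)
    return false;
  if (RootA->Rank < RootB->Rank) {
    RootA->Group = RootB;
  } else {
    RootB->Group = RootA;
    if (RootA->Rank == RootB->Rank)
      ++RootA->Rank;
  }
  return true;
}
```

In a Kruskal-style construction, an edge for which `unionGroups` returns true would join the tree; every other edge would be left out and receive a counter. Note that the sketch objects must not be copied or moved after construction, because `Group` initially points at the object itself.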
				davidxl (Done): if there is no BB ..

				// This class implements the CFG edges. Note the CFG can be a multi-graph.
				template <class Edge, class BBInfo> class FuncPGOInstrumentation {
				private:
				Function &F;
				void computeCFGHash();

				public:
				std::string FuncName;
				GlobalVariable *FuncNameVar;
				// CFG hash value for this function.
				uint64_t FunctionHash;

				// The Minimum Spanning Tree of function CFG.
				CFGMST<Edge, BBInfo> MST;

				// Given an edge, find the BB that will be instrumented.
				silvas (Done): "dump" functions are for debugging and should be conditionalized. For example, I suggest refactoring to allow a call DEBUG( std::string Message = "Dump Function " + FuncName + " after CFGMST: \t Hash: " + std::to_string(FunctionHash); MST.dumpEdges(dbgs(), Message); )
				// Return nullptr if there is no BB to be instrumented.
				BasicBlock *getInstrBB(Edge *E);

				// Return the auxiliary BB information.
				BBInfo &getBBInfo(const BasicBlock *BB) const { return MST.getBBInfo(BB); }

				// Dump edges and BB information.
				void dumpInfo(std::string Str = "") const {
				MST.dumpEdges(dbgs(), Twine("Dump Function ") + FuncName + " Hash: " +
				Twine(FunctionHash) + "\t" + Str);
				}

				FuncPGOInstrumentation(Function &Func, bool CreateGlobalVar = false,
				BranchProbabilityInfo *BPI = nullptr,
				BlockFrequencyInfo *BFI = nullptr)
				: F(Func), FunctionHash(0), MST(F, BPI, BFI) {
				FuncName = getPGOFuncName(F);
				computeCFGHash();
				DEBUG(dumpInfo("after CFGMST"));

				NumOfPGOBB += MST.BBInfos.size();
				for (auto &E : MST.AllEdges) {
				if (E->Removed)
				continue;
				NumOfPGOEdge++;
				if (!E->InMST)
				NumOfPGOInstrument++;
				}

				if (CreateGlobalVar)
				FuncNameVar = createPGOFuncNameVar(F, FuncName);
				};
				};

				// Compute Hash value for the CFG: the lower 32 bits are CRC32 of the index
				// value of each BB in the CFG. The higher 32 bits record the number of edges.
				template <class Edge, class BBInfo>
				void FuncPGOInstrumentation<Edge, BBInfo>::computeCFGHash() {
				std::vector<char> Indexes;
				JamCRC JC;
				for (auto &BB : F) {
				const TerminatorInst *TI = BB.getTerminator();
				for (unsigned I = 0, E = TI->getNumSuccessors(); I != E; ++I) {
				BasicBlock *Succ = TI->getSuccessor(I);
				uint32_t Index = getBBInfo(Succ).Index;
				for (int J = 0; J < 4; J++)
				Indexes.push_back((char)(Index >> (J * 8)));
				}
				}
				JC.update(Indexes);
				FunctionHash = (uint64_t)MST.AllEdges.size() << 32 | JC.getCRC();
				}
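The comment on `computeCFGHash` describes a 64-bit value whose upper 32 bits hold the edge count and whose lower 32 bits hold the CRC. A minimal sketch of that packing and its inverse follows; the helper names are hypothetical, and the CRC is taken as an input rather than recomputed.

```cpp
#include <cassert>
#include <cstdint>

// Combine an edge count and a 32-bit CRC the way the comment above describes:
// edge count in the upper half, CRC in the lower half.
inline uint64_t packCFGHash(uint32_t NumEdges, uint32_t CRC) {
  return (static_cast<uint64_t>(NumEdges) << 32) | CRC;
}

// Recover the number of edges from a packed hash.
inline uint32_t edgeCountOf(uint64_t Hash) {
  return static_cast<uint32_t>(Hash >> 32);
}

// Recover the CRC part from a packed hash.
inline uint32_t crcOf(uint64_t Hash) {
  return static_cast<uint32_t>(Hash & 0xFFFFFFFFu);
}
```

Applied to the hash 25571299074 used by the branch1 test, `edgeCountOf` yields 5, which is consistent with three real CFG edges plus the two fake edges through the fake node.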

				// Given a CFG edge E to be instrumented, find the BB in which to place the
				// instrumentation code. The function will split the critical edge if necessary.
				template <class Edge, class BBInfo>
				BasicBlock *FuncPGOInstrumentation<Edge, BBInfo>::getInstrBB(Edge *E) {
				if (E->InMST || E->Removed)
				return nullptr;

				BasicBlock *SrcBB = const_cast<BasicBlock *>(E->SrcBB);
				BasicBlock *DestBB = const_cast<BasicBlock *>(E->DestBB);
				// For a fake edge, instrument the real BB.
				silvas (Not Done): Do you still need these const_cast<>'s after r253733?
				if (SrcBB == nullptr)
				return DestBB;
				if (DestBB == nullptr)
				return SrcBB;

				// Instrument the SrcBB if it has a single successor,
				// otherwise, the DestBB if this is not a critical edge.
				TerminatorInst *TI = SrcBB->getTerminator();
				if (TI->getNumSuccessors() <= 1)
				return SrcBB;
				if (!E->IsCritical)
				return DestBB;

				silvas (Done): Make these two variable just local variables of `instrumentOnFunc` free function.
				// For a critical edge, we have to split. Instrument the newly
				// created BB.
				NumOfPGOSplit++;
				DEBUG(dbgs() << "Split critical edge: " << getBBInfo(SrcBB).Index << " --> "
				<< getBBInfo(DestBB).Index << "\n");
				unsigned SuccNum = GetSuccessorNumber(SrcBB, DestBB);
				BasicBlock *InstrBB = SplitCriticalEdge(TI, SuccNum);
				assert(InstrBB && "Critical edge is not split");

				E->Removed = true;
				return InstrBB;
				}
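The placement rules in `getInstrBB` can be read as a small decision table. The sketch below restates them over plain flags instead of LLVM types; `InstrSite`, `SketchEdgeFacts` and `chooseInstrSite` are hypothetical names, and the actual edge splitting is not modelled.

```cpp
#include <cassert>

// Where the counter for an edge would be placed.
enum class InstrSite { None, SrcBlock, DestBlock, NewSplitBlock };

// Hypothetical summary of the facts getInstrBB consults for one edge.
struct SketchEdgeFacts {
  bool InMST;                // edge is in the spanning tree
  bool Removed;              // edge was replaced by a split
  bool HasSrc;               // false for a fake edge from the fake node
  bool HasDest;              // false for a fake edge to the fake node
  unsigned SrcNumSuccessors; // successors of the source terminator
  bool IsCritical;           // edge is a critical edge
};

inline InstrSite chooseInstrSite(const SketchEdgeFacts &E) {
  if (E.InMST || E.Removed)
    return InstrSite::None;          // tree edges carry no counter
  if (!E.HasSrc)
    return InstrSite::DestBlock;     // fake edge: use the real block
  if (!E.HasDest)
    return InstrSite::SrcBlock;
  if (E.SrcNumSuccessors <= 1)
    return InstrSite::SrcBlock;      // source executes iff the edge is taken
  if (!E.IsCritical)
    return InstrSite::DestBlock;     // destination has a single predecessor
  return InstrSite::NewSplitBlock;   // critical edge: split and use new block
}
```

The ordering matters: the critical-edge case is reached only after the cheaper placements have been ruled out, which keeps the number of splits low.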

				silvas (Done): This is only called in one place, inline the code into `instrumentOnFunc`.
				// Visit all edges and instrument the edges not in the MST.
				// Critical edges will be split.
				static void instrumentOneFunc(Function &F, Module *M,
				davidxl (Done): Remove unused code.
				BranchProbabilityInfo *BPI,
				BlockFrequencyInfo *BFI) {
				unsigned NumCounters = 0;
				FuncPGOInstrumentation<PGOEdge, BBInfo> FuncInfo(F, true, BPI, BFI);
				for (auto &E : FuncInfo.MST.AllEdges) {
				if (!E->InMST && !E->Removed)
				NumCounters++;
				}

				uint32_t I = 0;
				for (auto &E : FuncInfo.MST.AllEdges) {
				BasicBlock *InstrBB = FuncInfo.getInstrBB(E.get());
				if (!InstrBB)
				continue;

				IRBuilder<> Builder(InstrBB, InstrBB->getFirstInsertionPt());
				assert(Builder.GetInsertPoint() != InstrBB->end() &&
				"Cannot get the Instrumentation point");
				Type *I8PtrTy = Type::getInt8PtrTy(M->getContext());
				Builder.CreateCall(
				Intrinsic::getDeclaration(M, Intrinsic::instrprof_increment),
				{llvm::ConstantExpr::getBitCast(FuncInfo.FuncNameVar, I8PtrTy),
				Builder.getInt64(FuncInfo.FunctionHash), Builder.getInt32(NumCounters),
				Builder.getInt32(I++)});
				}
				}
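Since only edges outside the spanning tree get a counter, the value of `NumCounters` can be predicted from the graph size. A spanning tree of a connected graph with V nodes has V - 1 edges, so E - (V - 1) edges remain. The sketch below encodes just that arithmetic; `expectedCounters` is a hypothetical helper and assumes a connected graph with no removed edges.

```cpp
#include <cassert>

// Number of edges left outside a spanning tree of a connected graph.
// NumNodes includes the fake node; NumEdges includes the fake edges.
inline unsigned expectedCounters(unsigned NumNodes, unsigned NumEdges) {
  assert(NumNodes >= 1 && NumEdges + 1 >= NumNodes &&
         "a connected graph needs at least NumNodes - 1 edges");
  return NumEdges - (NumNodes - 1);
}
```

For the branch1 test (three basic blocks plus the fake node, three CFG edges plus two fake edges) this gives 5 - 3 = 2, matching the `i32 2` counter total in its GEN checks.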

				// This class represents a CFG edge in profile use compilation.
				struct PGOUseEdge : public PGOEdge {
				bool CountValid;
				uint64_t CountValue;
				PGOUseEdge(const BasicBlock *Src, const BasicBlock *Dest, unsigned W = 1)
				: PGOEdge(Src, Dest, W), CountValid(false), CountValue(0) {}

				// Set edge count value
				void setEdgeCount(uint64_t Value) {
				CountValue = Value;
				CountValid = true;
				}

				// Return the information string for this object.
				const std::string infoString() const {
				if (!CountValid)
				return PGOEdge::infoString();
				return (Twine(PGOEdge::infoString()) + " Count=" + Twine(CountValue)).str();
				}
				};

				typedef SmallVector<PGOUseEdge *, 2> DirectEdges;

				// This class stores the auxiliary information for each BB.
				struct UseBBInfo : public BBInfo {
				uint64_t CountValue;
				bool CountValid;
				int32_t UnknownCountInEdge;
				int32_t UnknownCountOutEdge;
				DirectEdges InEdges;
				DirectEdges OutEdges;
				UseBBInfo(unsigned IX)
				: BBInfo(IX), CountValue(0), CountValid(false), UnknownCountInEdge(0),
				UnknownCountOutEdge(0) {}
				UseBBInfo(unsigned IX, uint64_t C)
				: BBInfo(IX), CountValue(C), CountValid(true), UnknownCountInEdge(0),
				UnknownCountOutEdge(0) {}

				// Set the profile count value for this BB.
				void setBBInfoCount(uint64_t Value) {
				CountValue = Value;
				CountValid = true;
				}

				// Return the information string of this object.
				const std::string infoString() const {
				if (!CountValid)
				return BBInfo::infoString();
				return (Twine(BBInfo::infoString()) + " Count=" + Twine(CountValue)).str();
				}
				};

				// Sum up the count values for all the edges.
				static uint64_t sumEdgeCount(const ArrayRef<PGOUseEdge *> Edges) {
				uint64_t Total = 0;
				for (auto &E : Edges) {
				if (E->Removed)
				continue;
				Total += E->CountValue;
				}
				return Total;
				}

				class PGOUseFunc {
				private:
				Function &F;
				Module *M;
				// This member stores the shared information with class PGOGenFunc.
				FuncPGOInstrumentation<PGOUseEdge, UseBBInfo> FuncInfo;

				// Return the auxiliary BB information.
				UseBBInfo &getBBInfo(const BasicBlock *BB) const {
				davidxl (Done): It is better to just call it 'setEdgeCount' with same documentation -- or make it just getUnknownCountEdge() and let the client to set the edge directly.
				return FuncInfo.getBBInfo(BB);
				}

				// The maximum count value in the profile. This is only used in PGO use
				// compilation.
				uint64_t ProgramMaxCount;

				// Find the Instrumented BB and set the value.
				void setInstrumentedCounts(const std::vector<uint64_t> &CountFromProfile);

				// Set the edge counter value for the unknown edge -- there should be only
				// one unknown edge.
				void setEdgeCount(DirectEdges &Edges, uint64_t Value);

				// Return FuncName string;
				const std::string getFuncName() const { return FuncInfo.FuncName; }

				// Set the hot/cold inline hints based on the count values.
				// FIXME: This function should be removed once the functionality in
				// the inliner is implemented.
				void applyFunctionAttributes(uint64_t EntryCount, uint64_t MaxCount) {
				if (ProgramMaxCount == 0)
				return;
				// Threshold of the hot functions.
				const BranchProbability HotFunctionThreshold(1, 100);
				// Threshold of the cold functions.
				const BranchProbability ColdFunctionThreshold(2, 10000);
				if (EntryCount >= HotFunctionThreshold.scale(ProgramMaxCount))
				F.addFnAttr(llvm::Attribute::InlineHint);
				else if (MaxCount <= ColdFunctionThreshold.scale(ProgramMaxCount))
				F.addFnAttr(llvm::Attribute::Cold);
				silvas (Done): Avoid having debug code be non-conditional.
				xur (author, Not Done): Now guarded by DEBUG.
				}

				public:
				PGOUseFunc(Function &Func, Module *Modu, BranchProbabilityInfo *BPI = nullptr,
				BlockFrequencyInfo *BFI = nullptr)
				: F(Func), M(Modu), FuncInfo(Func, false, BPI, BFI) {}

				// Read counts for the instrumented BB from profile.
				bool readCounters(IndexedInstrProfReader *PGOReader);

				silvas (Done): Avoid `const std::unique_ptr<PGOUseEdge> &`.
				davidxl (Done): Sean's comment here is not addressed -- use raw PGOUseEdge&
				// Populate the counts for all BBs.
				void populateCounters();

				// Set the branch weights based on the count values.
				void setBranchWeights();
				};
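The hot/cold decision in `applyFunctionAttributes` compares a function's counts against fractions of `ProgramMaxCount`: 1/100 for hot and 2/10000 for cold. The sketch below restates that with plain integer division; `classifyFunction` and `InlineClass` are hypothetical names, and the rounding may differ from what `BranchProbability::scale` produces.

```cpp
#include <cassert>
#include <cstdint>

enum class InlineClass { Hot, Cold, Neither };

// Classify a function from its entry count and its largest block count,
// relative to the largest function count in the whole profile.
inline InlineClass classifyFunction(uint64_t EntryCount, uint64_t MaxCount,
                                    uint64_t ProgramMaxCount) {
  if (ProgramMaxCount == 0)
    return InlineClass::Neither;
  const uint64_t HotCutoff = ProgramMaxCount / 100;   // 1/100
  const uint64_t ColdCutoff = ProgramMaxCount / 5000; // 2/10000
  if (EntryCount >= HotCutoff)
    return InlineClass::Hot;
  if (MaxCount <= ColdCutoff)
    return InlineClass::Cold;
  return InlineClass::Neither;
}
```

One consequence visible in the sketch: when `ProgramMaxCount` is below 100 the hot cutoff rounds down to zero, so every function would be classified as hot under this integer arithmetic.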

				// Visit all the edges and assign the count value for the instrumented
				// edges and the BB.
				void PGOUseFunc::setInstrumentedCounts(
				const std::vector<uint64_t> &CountFromProfile) {
				davidxl (Not Done): In general it is not safe to update the container (AllEdges) while iterating -- it may invalidate the iterator or element reference saved before. The code should create a worklist of edges before the count setting -- the worklist will then be 'frozen' (It is safe to update worklist, but there is no need to push new split edges into the list) and AllEdges can be safely updated.
				xur (author, Not Done): I see your point. But this is exactly the reason I did not use range-based loop like all others in the file. I get the vector size in the loop initialization and use the index to reference the vector elements in the body. So adding element to the vector will not be a problem. I could use a worklist, but the effect should be the same.

				// Use a worklist as we will update the vector during the iteration.
				std::vector<PGOUseEdge *> WorkList;
				for (auto &E : FuncInfo.MST.AllEdges)
				WorkList.push_back(E.get());

				uint32_t I = 0;
				for (auto &E : WorkList) {
				BasicBlock *InstrBB = FuncInfo.getInstrBB(E);
				if (!InstrBB)
				continue;
				uint64_t CountValue = CountFromProfile[I++];
				if (!E->Removed) {
				silvas (Done): Can you use continue to reduce indentation for the "large" side of the `if`? ( http://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code )
				getBBInfo(InstrBB).setBBInfoCount(CountValue);
				E->setEdgeCount(CountValue);
				continue;
				}

				// Need to add two new edges.
				BasicBlock *SrcBB = const_cast<BasicBlock *>(E->SrcBB);
				BasicBlock *DestBB = const_cast<BasicBlock *>(E->DestBB);
				// Add new edge of SrcBB->InstrBB.
				PGOUseEdge &NewEdge = FuncInfo.MST.addEdge(SrcBB, InstrBB, 0);
				NewEdge.setEdgeCount(CountValue);
				// Add new edge of InstrBB->DestBB.
				PGOUseEdge &NewEdge1 = FuncInfo.MST.addEdge(InstrBB, DestBB, 0);
				NewEdge1.setEdgeCount(CountValue);
				NewEdge1.InMST = true;
				getBBInfo(InstrBB).setBBInfoCount(CountValue);
				}
				}
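The worklist in `setInstrumentedCounts` addresses the concern raised in the review: appending to a vector while iterating over it can invalidate iterators and references. A minimal sketch of the pattern follows, with hypothetical types. It relies on the elements being owned through `std::unique_ptr`, so the raw pointers in the snapshot stay valid even when the owning vector reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type standing in for an edge.
struct SketchItem {
  int Value;
  bool NeedsSplit;
};

// Replace every marked item by two new items appended to All, iterating over
// a frozen snapshot so that growth of All cannot disturb the loop.
inline std::size_t
splitMarkedItems(std::vector<std::unique_ptr<SketchItem>> &All) {
  std::vector<SketchItem *> WorkList;
  WorkList.reserve(All.size());
  for (auto &P : All)
    WorkList.push_back(P.get());

  std::size_t Added = 0;
  for (SketchItem *Item : WorkList) {
    if (!Item->NeedsSplit)
      continue;
    Item->NeedsSplit = false;
    // All may reallocate here; Item still points at the heap object.
    All.push_back(std::unique_ptr<SketchItem>(new SketchItem{Item->Value, false}));
    All.push_back(std::unique_ptr<SketchItem>(new SketchItem{Item->Value, false}));
    Added += 2;
  }
  return Added;
}
```

An index-based loop bounded by the initial size, as the author describes, would visit the same elements. The snapshot makes the intent explicit and does not depend on how the loop is written.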

				// Set the count value for the unknown edge. There should be one and only one
				silvas (Done): You say "edges" (plural) here but then say "There should be one and only one".
				// unknown edge in Edges vector.
				void PGOUseFunc::setEdgeCount(DirectEdges &Edges, uint64_t Value) {
				for (auto &E : Edges) {
				davidxl (Done): Use the new getInstrProfRecord method -- eventually we will need to read value profile data too.
				if (E->CountValid)
				continue;
				E->setEdgeCount(Value);

				getBBInfo(E->SrcBB).UnknownCountOutEdge--;
				getBBInfo(E->DestBB).UnknownCountInEdge--;
				return;
				}
				llvm_unreachable("Cannot find the unknown count edge");
				}

				// Read the profile from ProfileFileName and assign the value to the
				// instrumented BB and the edges. This function also updates ProgramMaxCount.
				// Return true if the profile is successfully read, and false on errors.
				bool PGOUseFunc::readCounters(IndexedInstrProfReader *PGOReader) {
				auto &Ctx = M->getContext();
				ErrorOr<InstrProfRecord> Result =
				silvas (Done): Do you mean to do a copy here? Probably `std::vector<uint64_t> &` is intended.
				PGOReader->getInstrProfRecord(FuncInfo.FuncName, FuncInfo.FunctionHash);
				if (std::error_code EC = Result.getError()) {
				if (EC == instrprof_error::unknown_function)
				NumOfPGOMissing++;
				else if (EC == instrprof_error::hash_mismatch ||
				silvas (Done): `e` instead of `E`, as is common in LLVM.
				EC == llvm::instrprof_error::malformed)
				NumOfPGOMismatch++;

				std::string Msg = EC.message() + std::string(" ") + F.getName().str();
				Ctx.diagnose(
				DiagnosticInfoPGOProfile(M->getName().data(), Msg, DS_Warning));
				return false;
				}
				std::vector<uint64_t> &CountFromProfile = Result.get().Counts;

				NumOfPGOFunc++;
				DEBUG(dbgs() << CountFromProfile.size() << " counts\n");
				uint64_t ValueSum = 0;
				for (unsigned I = 0, S = CountFromProfile.size(); I < S; I++) {
				DEBUG(dbgs() << " " << I << ": " << CountFromProfile[I] << "\n");
				ValueSum += CountFromProfile[I];
				}

				DEBUG(dbgs() << "SUM = " << ValueSum << "\n");

				getBBInfo(nullptr).UnknownCountOutEdge = 2;
				getBBInfo(nullptr).UnknownCountInEdge = 2;

				setInstrumentedCounts(CountFromProfile);
				ProgramMaxCount = PGOReader->getMaximumFunctionCount();
				silvas (Done): `getBBInfo(Ei->SrcBB)`. You shouldn't need a cast here (if you do, then please fix whatever code is requiring this to cast away constness).
				return true;
				}

				// Populate the counters from instrumented BBs to all BBs.
				// In the end of this operation, all BBs should have a valid count value.
				void PGOUseFunc::populateCounters() {
				// First set up Count variable for all BBs.
				for (auto &E : FuncInfo.MST.AllEdges) {
				if (E->Removed)
				continue;

				const BasicBlock *SrcBB = E->SrcBB;
				const BasicBlock *DestBB = E->DestBB;
				UseBBInfo &SrcInfo = getBBInfo(SrcBB);
				UseBBInfo &DestInfo = getBBInfo(DestBB);
				SrcInfo.OutEdges.push_back(E.get());
				DestInfo.InEdges.push_back(E.get());
				SrcInfo.UnknownCountOutEdge++;
				DestInfo.UnknownCountInEdge++;

				if (!E->CountValid)
				continue;
				DestInfo.UnknownCountInEdge--;
				silvas (Done): Just make the iteration variable `BB` and use `getBBInfo(&BB)`, like you do below in the loop `// Assert every BB has a valid counter.`.
				SrcInfo.UnknownCountOutEdge--;
				}

				bool Changes = true;
				unsigned NumPasses = 0;
				while (Changes) {
				NumPasses++;
				Changes = false;

				// For efficient traversal, it's better to start from the end as most
				// of the instrumented edges are at the end.
				for (auto &BB : reverse(F)) {
				UseBBInfo &Count = getBBInfo(&BB);
				if (!Count.CountValid) {
				if (Count.UnknownCountOutEdge == 0) {
				Count.CountValue = sumEdgeCount(Count.OutEdges);
				Count.CountValid = true;
				Changes = true;
				} else if (Count.UnknownCountInEdge == 0) {
				Count.CountValue = sumEdgeCount(Count.InEdges);
				Count.CountValid = true;
				Changes = true;
				}
				}
				if (Count.CountValid) {
				if (Count.UnknownCountOutEdge == 1) {
				uint64_t Total = Count.CountValue - sumEdgeCount(Count.OutEdges);
				setEdgeCount(Count.OutEdges, Total);
				Changes = true;
				}
				if (Count.UnknownCountInEdge == 1) {
				uint64_t Total = Count.CountValue - sumEdgeCount(Count.InEdges);
				setEdgeCount(Count.InEdges, Total);
				Changes = true;
				}
				}
				}
				}

				DEBUG(dbgs() << "Populate counts in " << NumPasses << " passes.\n");
				// Assert every BB has a valid counter.
				uint64_t FuncEntryCount = getBBInfo(&*F.begin()).CountValue;
				uint64_t FuncMaxCount = FuncEntryCount;
				for (auto &BB : F) {
				assert(getBBInfo(&BB).CountValid && "BB count is not valid");
				uint64_t Count = getBBInfo(&BB).CountValue;
				if (Count > FuncMaxCount)
				FuncMaxCount = Count;
				}
				applyFunctionAttributes(FuncEntryCount, FuncMaxCount);

				DEBUG(FuncInfo.dumpInfo("after reading profile."));
				}
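`populateCounters` rests on flow conservation: a block's count equals the sum of its incoming edge counts and also the sum of its outgoing edge counts. So a block with all edges known on one side gets its count, and a block with a known count and exactly one unknown edge on a side fixes that edge. The sketch below replays this on plain indices. The types and `propagateCounts` are hypothetical, and it recomputes sums on every pass instead of maintaining the unknown-edge counters the pass keeps.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for PGOUseEdge and UseBBInfo.
struct SketchEdge {
  int Src;     // index of the source node
  int Dest;    // index of the destination node
  bool Known;  // true once Count is valid
  uint64_t Count;
};

struct SketchNode {
  bool Known;
  uint64_t Count;
};

// Apply flow conservation until nothing changes. One node should play the
// role of the fake node so that every node has an in-edge and an out-edge.
inline std::vector<SketchNode> propagateCounts(int NumNodes,
                                               std::vector<SketchEdge> &Edges) {
  std::vector<SketchNode> Nodes(NumNodes, SketchNode{false, 0});
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int N = 0; N < NumNodes; ++N) {
      uint64_t InSum = 0, OutSum = 0;
      int NumUnknownIn = 0, NumUnknownOut = 0;
      SketchEdge *UnknownIn = nullptr, *UnknownOut = nullptr;
      for (SketchEdge &E : Edges) {
        if (E.Dest == N) {
          if (E.Known) InSum += E.Count;
          else { ++NumUnknownIn; UnknownIn = &E; }
        }
        if (E.Src == N) {
          if (E.Known) OutSum += E.Count;
          else { ++NumUnknownOut; UnknownOut = &E; }
        }
      }
      SketchNode &Node = Nodes[N];
      if (!Node.Known && (NumUnknownOut == 0 || NumUnknownIn == 0)) {
        Node.Count = NumUnknownOut == 0 ? OutSum : InSum;
        Node.Known = true;
        Changed = true;
      }
      if (!Node.Known)
        continue;
      if (NumUnknownOut == 1) {
        assert(Node.Count >= OutSum && "inconsistent counts");
        UnknownOut->Count = Node.Count - OutSum;
        UnknownOut->Known = true;
        Changed = true;
      }
      if (NumUnknownIn == 1 && !UnknownIn->Known) {
        assert(Node.Count >= InSum && "inconsistent counts");
        UnknownIn->Count = Node.Count - InSum;
        UnknownIn->Known = true;
        Changed = true;
      }
    }
  }
  return Nodes;
}
```

With the branch1 profile (counters 3 and 2), take node 0 as the fake node and assume the counters belong to the exit edge of `if.end` and the edge into `if.then`. The sketch then derives a count of 1 for the `entry` to `if.end` edge, in line with the `branch_weights` of 2 and 1 expected by that test.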

				// Assign the scaled count values to the BB with multiple out edges.
				void PGOUseFunc::setBranchWeights() {
				// Generate MD_prof metadata for every branch instruction.
				DEBUG(dbgs() << "\nSetting branch weights.\n");
				MDBuilder MDB(M->getContext());
				for (auto &BB : F) {
				TerminatorInst *TI = BB.getTerminator();
				if (TI->getNumSuccessors() < 2)
				continue;
				if (!isa<BranchInst>(TI) && !isa<SwitchInst>(TI))
				continue;
				if (getBBInfo(&BB).CountValue == 0)
				continue;
				silvas (Done): Remove the const_cast (I think this will require making GetSuccessorNumber take `const BasicBlock *`, which you can do as a separate patch (no need for pre-commit review))

				// We have a non-zero Branch BB.
				const UseBBInfo &BBCountInfo = getBBInfo(&BB);
				unsigned Size = BBCountInfo.OutEdges.size();
				// Keep the full 64-bit counts here; they are scaled down to 32 bits below.
				SmallVector<uint64_t, 2> EdgeCounts(Size, 0);
				uint64_t MaxCount = 0;
				for (unsigned s = 0; s < Size; s++) {
				const PGOUseEdge *E = BBCountInfo.OutEdges[s];
				const BasicBlock *SrcBB = E->SrcBB;
				const BasicBlock *DestBB = E->DestBB;
				if (DestBB == 0)
				continue;
				unsigned SuccNum = GetSuccessorNumber(SrcBB, DestBB);
				uint64_t EdgeCount = E->CountValue;
				if (EdgeCount > MaxCount)
				MaxCount = EdgeCount;
				EdgeCounts[SuccNum] = EdgeCount;
				}
				assert(MaxCount > 0 && "Bad max count");
				silvas (Done): Do you mean `instrumentOneFunc`? `instrumentOnFunc` doesn't make sense (weird use of "on").
				uint64_t Scale = calculateCountScale(MaxCount);
				SmallVector<unsigned, 4> Weights;
				silvas (Done): Use a free function `instrumentOneFunc`
				for (const auto &ECI : EdgeCounts)
				Weights.push_back(scaleBranchCount(ECI, Scale));

				TI->setMetadata(llvm::LLVMContext::MD_prof,
				MDB.createBranchWeights(Weights));
				DEBUG(dbgs() << "Weight is: ";
				for (const auto &W : Weights) { dbgs() << W << " "; }
				dbgs() << "\n";);
				}
				}
				} // end anonymous namespace
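`setBranchWeights` turns 64-bit edge counts into 32-bit branch weights through `calculateCountScale` and `scaleBranchCount`, whose definitions are not part of this section. The sketch below shows one plausible scheme, dividing all counts by a common factor chosen so that the largest fits in 32 bits. The `sketch*` helpers are hypothetical and may not match the real helpers in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Pick a divisor so that MaxCount / Scale fits in 32 bits (assumed scheme).
inline uint64_t sketchCountScale(uint64_t MaxCount) {
  const uint64_t Max32 = std::numeric_limits<uint32_t>::max();
  return MaxCount < Max32 ? 1 : MaxCount / Max32 + 1;
}

// Scale one count down; the result is expected to fit in 32 bits.
inline uint32_t sketchScaleCount(uint64_t Count, uint64_t Scale) {
  const uint64_t Scaled = Count / Scale;
  assert(Scaled <= std::numeric_limits<uint32_t>::max() && "overflow 32 bits");
  return static_cast<uint32_t>(Scaled);
}

// Scale all edge counts of one terminator by the same factor, which keeps
// their ratios (up to rounding).
inline std::vector<uint32_t>
sketchBranchWeights(const std::vector<uint64_t> &Counts) {
  uint64_t MaxCount = 0;
  for (uint64_t C : Counts)
    if (C > MaxCount)
      MaxCount = C;
  const uint64_t Scale = sketchCountScale(MaxCount);
  std::vector<uint32_t> Weights;
  Weights.reserve(Counts.size());
  for (uint64_t C : Counts)
    Weights.push_back(sketchScaleCount(C, Scale));
  return Weights;
}
```

Small counts pass through unchanged under this scheme, so the counts 2 and 1 from the branch1 profile stay 2 and 1, as its USE check expects.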

				bool PGOInstrumentationGen::runOnModule(Module &M) {
				for (auto &F : M) {
				if (F.isDeclaration())
				continue;
				BranchProbabilityInfo *BPI =
				&(getAnalysis<BranchProbabilityInfoWrapperPass>(F).getBPI());
				BlockFrequencyInfo *BFI =
				&(getAnalysis<BlockFrequencyInfoWrapperPass>(F).getBFI());
				instrumentOneFunc(F, &M, BPI, BFI);
				}
				return true;
				}

				static void setPGOCountOnFunc(PGOUseFunc &Func,
				IndexedInstrProfReader *PGOReader) {
				if (Func.readCounters(PGOReader)) {
				Func.populateCounters();
				Func.setBranchWeights();
				}
				silvas (Done): Use a free function `setPGOCountOnFunc`
				}

				bool PGOInstrumentationUse::runOnModule(Module &M) {
				DEBUG(dbgs() << "Read in profile counters: ");
				auto &Ctx = M.getContext();
				// Read the counter array from file.
				auto ReaderOrErr = IndexedInstrProfReader::create(ProfileFileName);
				if (std::error_code EC = ReaderOrErr.getError()) {
				Ctx.diagnose(
				DiagnosticInfoPGOProfile(ProfileFileName.data(), EC.message()));
				return false;
				}

				PGOReader = std::move(ReaderOrErr.get());
				if (!PGOReader) {
				Ctx.diagnose(DiagnosticInfoPGOProfile(ProfileFileName.data(),
				"Cannot get PGOReader"));
				return false;
				}

				for (auto &F : M) {
				if (F.isDeclaration())
				continue;
				BranchProbabilityInfo *BPI =
				&(getAnalysis<BranchProbabilityInfoWrapperPass>(F).getBPI());
				BlockFrequencyInfo *BFI =
				&(getAnalysis<BlockFrequencyInfoWrapperPass>(F).getBFI());
				PGOUseFunc Func(F, &M, BPI, BFI);
				setPGOCountOnFunc(Func, PGOReader.get());
				}
				return true;
				}

test/Transforms/PGOProfile/Inputs/branch1.proftext

This file was added.

				test_br_1
				25571299074
				2
				3
				2
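Each record in these .proftext inputs consists of the function name, the CFG hash, the number of counters, and then one counter value per line. The sketch below reads a single record of that shape; `SketchProfRecord` and `parseTextRecord` are hypothetical names, and anything beyond this plain layout (comments, multiple records, other sections) is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one text profile record.
struct SketchProfRecord {
  std::string Name;
  uint64_t Hash = 0;
  std::vector<uint64_t> Counts;
};

// Parse "name, hash, counter count, counters..." from whitespace-separated
// text. Returns false if the record is truncated or malformed.
inline bool parseTextRecord(const std::string &Text, SketchProfRecord &Out) {
  std::istringstream In(Text);
  uint64_t NumCounters = 0;
  if (!(In >> Out.Name >> Out.Hash >> NumCounters))
    return false;
  Out.Counts.clear();
  for (uint64_t I = 0; I < NumCounters; ++I) {
    uint64_t Count = 0;
    if (!(In >> Count))
      return false;
    Out.Counts.push_back(Count);
  }
  return true;
}
```

For the record above this yields the name `test_br_1`, the hash 25571299074 and the two counter values 3 and 2.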

test/Transforms/PGOProfile/Inputs/branch2.proftext

This file was added.

				test_br_2
				29667547796
				2
				1
				1

test/Transforms/PGOProfile/Inputs/criticaledge.proftext

This file was added.

				test_criticalEdge
				82323253069
				8
				2
				1
				2
				2
				0
				1
				2
				1

				<stdin>:bar
				12884901887
				1
				7

test/Transforms/PGOProfile/Inputs/diag.proftext

This file was added.

				foo
				12884999999
				1
				1

test/Transforms/PGOProfile/Inputs/landingpad.proftext

This file was added.

				foo
				59130013419
				4
				3
				1
				2
				0

				bar
				24868915205
				2
				1
				2

test/Transforms/PGOProfile/Inputs/loop1.proftext

This file was added.

				test_simple_for
				34137660316
				2
				96
				4

test/Transforms/PGOProfile/Inputs/loop2.proftext

This file was added.

				test_nested_for
				53929068288
				3
				33
				10
				6

test/Transforms/PGOProfile/Inputs/switch.proftext

This file was added.

				test_switch
				46200943743
				4
				0
				5
				2
				3

test/Transforms/PGOProfile/branch1.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				; RUN: llvm-profdata merge %S/Inputs/branch1.proftext -o %T/branch1.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/branch1.profdata -S | FileCheck %s --check-prefix=USE
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; GEN: @__llvm_profile_name_test_br_1 = private constant [9 x i8] c"test_br_1"

				define i32 @test_br_1(i32 %i) {
				entry:
				; GEN: entry:
				; GEN-NOT: llvm.instrprof.increment
				%cmp = icmp sgt i32 %i, 0
				br i1 %cmp, label %if.then, label %if.end
				; USE: br i1 %cmp, label %if.then, label %if.end
				; USE-SAME: !prof ![[BW_ENTRY:[0-9]+]]
				; USE: ![[BW_ENTRY]] = !{!"branch_weights", i32 2, i32 1}

				if.then:
				; GEN: if.then:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([9 x i8], [9 x i8]* @__llvm_profile_name_test_br_1, i32 0, i32 0), i64 25571299074, i32 2, i32 1)
				%add = add nsw i32 %i, 2
				br label %if.end

				if.end:
				; GEN: if.end:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([9 x i8], [9 x i8]* @__llvm_profile_name_test_br_1, i32 0, i32 0), i64 25571299074, i32 2, i32 0)
				%retv = phi i32 [ %add, %if.then ], [ %i, %entry ]
				ret i32 %retv
				}
				silvas (Not Done): Can you use CHECK-DAG directive to move these next to the place where the `BW_*` capture occurs? That would improve locality and clarify what is being checked. (same in the other files)

test/Transforms/PGOProfile/branch2.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				; RUN: llvm-profdata merge %S/Inputs/branch2.proftext -o %T/branch2.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/branch2.profdata -S | FileCheck %s --check-prefix=USE
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; GEN: @__llvm_profile_name_test_br_2 = private constant [9 x i8] c"test_br_2"

				define i32 @test_br_2(i32 %i) {
				entry:
				; GEN: entry:
				; GEN-NOT: llvm.instrprof.increment
				%cmp = icmp sgt i32 %i, 0
				br i1 %cmp, label %if.then, label %if.else
				; USE: br i1 %cmp, label %if.then, label %if.else
				; USE-SAME: !prof ![[BW_ENTRY:[0-9]+]]
				; USE: ![[BW_ENTRY]] = !{!"branch_weights", i32 1, i32 1}

				if.then:
				; GEN: if.then:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([9 x i8], [9 x i8]* @__llvm_profile_name_test_br_2, i32 0, i32 0), i64 29667547796, i32 2, i32 0)
				%add = add nsw i32 %i, 2
				br label %if.end

				if.else:
				; GEN: if.else:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([9 x i8], [9 x i8]* @__llvm_profile_name_test_br_2, i32 0, i32 0), i64 29667547796, i32 2, i32 1)
				%sub = sub nsw i32 %i, 2
				br label %if.end

				if.end:
				; GEN: if.end:
				; GEN-NOT: llvm.instrprof.increment
				%retv = phi i32 [ %add, %if.then ], [ %sub, %if.else ]
				ret i32 %retv
				; GEN: ret
				}

test/Transforms/PGOProfile/criticaledge.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				; RUN: llvm-profdata merge %S/Inputs/criticaledge.proftext -o %T/criticaledge.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/criticaledge.profdata -S | FileCheck %s --check-prefix=USE
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; GEN: @__llvm_profile_name_test_criticalEdge = private constant [17 x i8] c"test_criticalEdge"
				; GEN: @"__llvm_profile_name_<stdin>:bar" = private constant [11 x i8] c"<stdin>:bar"

				define i32 @test_criticalEdge(i32 %i, i32 %j) {
				entry:
				; CHECK: entry:
				; GEN-NOT: call void @llvm.instrprof.increment
				switch i32 %i, label %sw.default [
				i32 1, label %sw.bb
				i32 2, label %sw.bb1
				i32 3, label %sw.bb2
				i32 4, label %sw.bb2
				; CHECK: i32 3, label %entry.sw.bb2_crit_edge
				; CHECK: i32 4, label %entry.sw.bb2_crit_edge1
				i32 5, label %sw.bb2
				]
				; USE: ]
				; USE-SAME: !prof ![[BW_SWITCH:[0-9]+]]

				; CHECK: entry.sw.bb2_crit_edge1:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 1)
				; CHECK: br label %sw.bb2

				; CHECK: entry.sw.bb2_crit_edge:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 0)
				; CHECK: br label %sw.bb2

				sw.bb:
				; GEN: sw.bb:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 5)
				%call = call i32 @bar(i32 2)
				br label %sw.epilog

				sw.bb1:
				; GEN: sw.bb1:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 4)
				%call2 = call i32 @bar(i32 1024)
				br label %sw.epilog

				sw.bb2:
				; GEN: sw.bb2:
				; GEN-NOT: call void @llvm.instrprof.increment
				%cmp = icmp eq i32 %j, 2
				br i1 %cmp, label %if.then, label %if.end
				; USE: br i1 %cmp, label %if.then, label %if.end
				; USE-SAME: !prof ![[BW_SW_BB2:[0-9]+]]

				if.then:
				; GEN: if.then:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 2)
				%call4 = call i32 @bar(i32 4)
				br label %return

				if.end:
				; GEN: if.end:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 3)
				%call5 = call i32 @bar(i32 8)
				br label %sw.epilog

				sw.default:
				; GEN: sw.default:
				; GEN-NOT: call void @llvm.instrprof.increment
				%call6 = call i32 @bar(i32 32)
				%cmp7 = icmp sgt i32 %j, 10
				br i1 %cmp7, label %if.then8, label %if.end9
				; USE: br i1 %cmp7, label %if.then8, label %if.end9
				; USE-SAME: !prof ![[BW_SW_DEFAULT:[0-9]+]]

				if.then8:
				; GEN: if.then8:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 7)
				%add = add nsw i32 %call6, 10
				br label %if.end9

				if.end9:
				; GEN: if.end9:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @__llvm_profile_name_test_criticalEdge, i32 0, i32 0), i64 82323253069, i32 8, i32 6)
				%res.0 = phi i32 [ %add, %if.then8 ], [ %call6, %sw.default ]
				br label %sw.epilog

				sw.epilog:
				; GEN: sw.epilog:
				; GEN-NOT: call void @llvm.instrprof.increment
				%res.1 = phi i32 [ %res.0, %if.end9 ], [ %call5, %if.end ], [ %call2, %sw.bb1 ], [ %call, %sw.bb ]
				br label %return

				return:
				; GEN: return:
				; GEN-NOT: call void @llvm.instrprof.increment
				%retval = phi i32 [ %res.1, %sw.epilog ], [ %call4, %if.then ]
				ret i32 %retval
				}

				define internal i32 @bar(i32 %i) {
				entry:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @"__llvm_profile_name_<stdin>:bar", i32 0, i32 0), i64 12884901887, i32 1, i32 0)
				ret i32 %i
				}

				; USE: ![[BW_SWITCH]] = !{!"branch_weights", i32 2, i32 1, i32 0, i32 2, i32 1, i32 1}
				; USE: ![[BW_SW_BB2]] = !{!"branch_weights", i32 2, i32 2}
				; USE: ![[BW_SW_DEFAULT]] = !{!"branch_weights", i32 1, i32 1}

test/Transforms/PGOProfile/diag_mismatch.ll

This file was added.

				; RUN: llvm-profdata merge %S/Inputs/diag.proftext -o %T/diag.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/diag.profdata -S 2>&1 | FileCheck %s

				; CHECK: Function control flow change detected (hash mismatch) foo

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				define i32 @foo() {
				entry:
				ret i32 0
				}

test/Transforms/PGOProfile/diag_no_funcprofdata.ll

This file was added.

				; RUN: llvm-profdata merge %S/Inputs/diag.proftext -o %T/diag.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/diag.profdata -S 2>&1 | FileCheck %s

				; CHECK: No profile data available for function bar

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				define i32 @bar() {
				entry:
				ret i32 0
				}

test/Transforms/PGOProfile/diag_no_profile.ll

This file was added.

				; RUN: not opt < %s -pgo-instr-use -pgo-test-profile-file=%T/notexisting.profdata -S 2>&1

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				define i32 @foo() {
				entry:
				ret i32 0
				}

test/Transforms/PGOProfile/landingpad.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				; RUN: llvm-profdata merge %S/Inputs/landingpad.proftext -o %T/landingpad.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/landingpad.profdata -S | FileCheck %s --check-prefix=USE
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				@val = global i32 0, align 4
				@_ZTIi = external constant i8*
				; GEN: @__llvm_profile_name_bar = private constant [3 x i8] c"bar"
				; GEN: @__llvm_profile_name_foo = private constant [3 x i8] c"foo"

				define i32 @bar(i32 %i) {
				entry:
				; GEN: entry:
				; GEN-NOT: call void @llvm.instrprof.increment
				%rem = srem i32 %i, 3
				%tobool = icmp ne i32 %rem, 0
				br i1 %tobool, label %if.then, label %if.end
				; USE: br i1 %tobool, label %if.then, label %if.end
				; USE-SAME: !prof ![[BW_BAR_ENTRY:[0-9]+]]

				if.then:
				; GEN: if.then:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__llvm_profile_name_bar, i32 0, i32 0), i64 24868915205, i32 2, i32 1)
				%exception = call i8* @__cxa_allocate_exception(i64 4)
				%tmp = bitcast i8* %exception to i32*
				store i32 %i, i32* %tmp, align 16
				call void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null)
				unreachable

				if.end:
				; GEN: if.end:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__llvm_profile_name_bar, i32 0, i32 0), i64 24868915205, i32 2, i32 0)
				ret i32 0
				}

				declare i8* @__cxa_allocate_exception(i64)

				declare void @__cxa_throw(i8*, i8*, i8*)

				define i32 @foo(i32 %i) personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
				entry:
				; GEN: entry:
				; GEN-NOT: call void @llvm.instrprof.increment
				%rem = srem i32 %i, 2
				%tobool = icmp ne i32 %rem, 0
				br i1 %tobool, label %if.then, label %if.end
				; USE: br i1 %tobool, label %if.then, label %if.end
				; USE-SAME: !prof ![[BW_FOO_ENTRY:[0-9]+]]

				if.then:
				; GEN: if.then:
				; GEN-NOT: call void @llvm.instrprof.increment
				%mul = mul nsw i32 %i, 7
				%call = invoke i32 @bar(i32 %mul)
				to label %invoke.cont unwind label %lpad

				invoke.cont:
				; GEN: invoke.cont:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__llvm_profile_name_foo, i32 0, i32 0), i64 59130013419, i32 4, i32 1)
				br label %if.end

				lpad:
				; GEN: lpad:
				; GEN-NOT: call void @llvm.instrprof.increment
				%tmp = landingpad { i8*, i32 }
				catch i8* bitcast (i8** @_ZTIi to i8*)
				%tmp1 = extractvalue { i8*, i32 } %tmp, 0
				%tmp2 = extractvalue { i8*, i32 } %tmp, 1
				br label %catch.dispatch

				catch.dispatch:
				; GEN: catch.dispatch:
				; GEN-NOT: call void @llvm.instrprof.increment
				%tmp3 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*))
				%matches = icmp eq i32 %tmp2, %tmp3
				br i1 %matches, label %catch, label %eh.resume
				; USE: br i1 %matches, label %catch, label %eh.resume
				; USE-SAME: !prof ![[BW_CATCH_DISPATCH:[0-9]+]]

				catch:
				; GEN: catch:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__llvm_profile_name_foo, i32 0, i32 0), i64 59130013419, i32 4, i32 2)
				%tmp4 = call i8* @__cxa_begin_catch(i8* %tmp1)
				%tmp5 = bitcast i8* %tmp4 to i32*
				%tmp6 = load i32, i32* %tmp5, align 4
				%tmp7 = load i32, i32* @val, align 4
				%sub = sub nsw i32 %tmp7, %tmp6
				store i32 %sub, i32* @val, align 4
				call void @__cxa_end_catch()
				br label %try.cont

				try.cont:
				; GEN: try.cont:
				; GEN-NOT: call void @llvm.instrprof.increment
				ret i32 -1

				if.end:
				; GEN: if.end:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__llvm_profile_name_foo, i32 0, i32 0), i64 59130013419, i32 4, i32 0)
				%tmp8 = load i32, i32* @val, align 4
				%add = add nsw i32 %tmp8, %i
				store i32 %add, i32* @val, align 4
				br label %try.cont

				eh.resume:
				; GEN: eh.resume:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__llvm_profile_name_foo, i32 0, i32 0), i64 59130013419, i32 4, i32 3)
				%lpad.val = insertvalue { i8*, i32 } undef, i8* %tmp1, 0
				%lpad.val3 = insertvalue { i8*, i32 } %lpad.val, i32 %tmp2, 1
				resume { i8*, i32 } %lpad.val3
				}

				declare i32 @__gxx_personality_v0(...)

				declare i32 @llvm.eh.typeid.for(i8*)

				declare i8* @__cxa_begin_catch(i8*)

				declare void @__cxa_end_catch()

				; USE: ![[BW_BAR_ENTRY]] = !{!"branch_weights", i32 2, i32 1}
				; USE: ![[BW_FOO_ENTRY]] = !{!"branch_weights", i32 3, i32 2}
				; USE: ![[BW_CATCH_DISPATCH]] = !{!"branch_weights", i32 2, i32 0}

test/Transforms/PGOProfile/loop1.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				; RUN: llvm-profdata merge %S/Inputs/loop1.proftext -o %T/loop1.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/loop1.profdata -S | FileCheck %s --check-prefix=USE
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; GEN: @__llvm_profile_name_test_simple_for = private constant [15 x i8] c"test_simple_for"

				define i32 @test_simple_for(i32 %n) {
				entry:
				; GEN: entry:
				; GEN-NOT: call void @llvm.instrprof.increment
				br label %for.cond

				for.cond:
				; GEN: for.cond:
				; GEN-NOT: call void @llvm.instrprof.increment
				%i = phi i32 [ 0, %entry ], [ %inc1, %for.inc ]
				%sum = phi i32 [ 1, %entry ], [ %inc, %for.inc ]
				%cmp = icmp slt i32 %i, %n
				br i1 %cmp, label %for.body, label %for.end
				; USE: br i1 %cmp, label %for.body, label %for.end
				; USE-SAME: !prof ![[BW_FOR_COND:[0-9]+]]
				; USE: ![[BW_FOR_COND]] = !{!"branch_weights", i32 96, i32 4}

				for.body:
				; GEN: for.body:
				; GEN-NOT: call void @llvm.instrprof.increment
				%inc = add nsw i32 %sum, 1
				br label %for.inc

				for.inc:
				; GEN: for.inc:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([15 x i8], [15 x i8]* @__llvm_profile_name_test_simple_for, i32 0, i32 0), i64 34137660316, i32 2, i32 0)
				%inc1 = add nsw i32 %i, 1
				br label %for.cond

				for.end:
				; GEN: for.end:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([15 x i8], [15 x i8]* @__llvm_profile_name_test_simple_for, i32 0, i32 0), i64 34137660316, i32 2, i32 1)
				ret i32 %sum
				}
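
The expected weights in loop1.ll can be reproduced by hand with flow conservation: at every block, the incoming edge counts sum to the outgoing edge counts. The sketch below is an illustration only, not the code from PGOInstrumentation.cpp. It assumes the profile supplies 96 for counter 0 (placed in for.inc) and 4 for counter 1 (placed in for.end); these values are inferred from the expected `branch_weights`, not read from loop1.proftext. It also assumes those counters measure the edges for.inc -> for.cond and for.cond -> for.end. `Edge`, `propagateCounts`, and `loop1Edges` are hypothetical names, and node 0 stands in for the fake node described in the summary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One CFG edge. `known` is false until the count is measured or inferred.
struct Edge {
  int src;
  int dst;
  bool known;
  uint64_t count;
};

// Infers unknown edge counts by flow conservation: at every node the sum of
// incoming counts equals the sum of outgoing counts. Returns true if every
// edge has a count once no further inference is possible.
bool propagateCounts(int numNodes, std::vector<Edge> &edges) {
  for (const Edge &e : edges)
    assert(e.src >= 0 && e.src < numNodes && e.dst >= 0 && e.dst < numNodes);

  bool changed = true;
  while (changed) {
    changed = false;
    for (int n = 0; n < numNodes; ++n) {
      uint64_t sumIn = 0, sumOut = 0;
      int unknownIn = 0, unknownOut = 0;
      Edge *lastIn = nullptr, *lastOut = nullptr;
      for (Edge &e : edges) {
        if (e.dst == n) {
          if (e.known) sumIn += e.count;
          else { ++unknownIn; lastIn = &e; }
        }
        if (e.src == n) {
          if (e.known) sumOut += e.count;
          else { ++unknownOut; lastOut = &e; }
        }
      }
      // Exactly one unknown edge on one side, none on the other: solve it.
      if (unknownIn == 0 && unknownOut == 1) {
        lastOut->count = sumIn >= sumOut ? sumIn - sumOut : 0;
        lastOut->known = true;
        changed = true;
      } else if (unknownOut == 0 && unknownIn == 1) {
        lastIn->count = sumOut >= sumIn ? sumOut - sumIn : 0;
        lastIn->known = true;
        changed = true;
      }
    }
  }
  for (const Edge &e : edges)
    if (!e.known) return false;
  return true;
}

// Node ids: 0 fake, 1 entry, 2 for.cond, 3 for.body, 4 for.inc, 5 for.end.
std::vector<Edge> loop1Edges(uint64_t backEdgeCount, uint64_t exitCount) {
  return {
      {0, 1, false, 0},            // fake -> entry
      {1, 2, false, 0},            // entry -> for.cond
      {2, 3, false, 0},            // for.cond -> for.body
      {2, 5, true, exitCount},     // for.cond -> for.end (assumed counter 1)
      {3, 4, false, 0},            // for.body -> for.inc
      {4, 2, true, backEdgeCount}, // for.inc -> for.cond (assumed counter 0)
      {5, 0, false, 0},            // for.end -> fake
  };
}
```

With 96 and 4 as inputs, the for.cond -> for.body edge resolves to 96 and the for.cond -> for.end edge stays at 4, which matches the expected `!{!"branch_weights", i32 96, i32 4}` on the for.cond branch.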

test/Transforms/PGOProfile/loop2.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				; RUN: llvm-profdata merge %S/Inputs/loop2.proftext -o %T/loop2.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/loop2.profdata -S | FileCheck %s --check-prefix=USE
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; GEN: @__llvm_profile_name_test_nested_for = private constant [15 x i8] c"test_nested_for"

				define i32 @test_nested_for(i32 %r, i32 %s) {
				entry:
				Inline comment from silvas: What is the importance of testing a nested for loop? What part of the code are you trying to exercise that isn't covered by loop1.ll?
				Inline comment from davidxl: Having a loop nest in the test for better coverage is fine, but it looks like we don't actually need a 3-deep nest; a 2-deep loop nest is good enough and will be easier to read. loop1.ll has a loop with a top test. How about a loop test with bottom testing? Also, a loop with more control flow inside the body and a loop with an early exit might be nice to have, but those can be added later as follow-ups if needed.
				; GEN: entry:
				; GEN-NOT: call void @llvm.instrprof.increment
				br label %for.cond.outer

				for.cond.outer:
				; GEN: for.cond.outer:
				; GEN-NOT: call void @llvm.instrprof.increment
				%i.0 = phi i32 [ 0, %entry ], [ %inc.2, %for.inc.outer ]
				%sum.0 = phi i32 [ 1, %entry ], [ %sum.1, %for.inc.outer ]
				%cmp = icmp slt i32 %i.0, %r
				br i1 %cmp, label %for.body.outer, label %for.end.outer
				; USE: br i1 %cmp, label %for.body.outer, label %for.end.outer
				; USE-SAME: !prof ![[BW_FOR_COND_OUTER:[0-9]+]]

				for.body.outer:
				; GEN: for.body.outer:
				; GEN-NOT: call void @llvm.instrprof.increment
				br label %for.cond.inner

				for.cond.inner:
				; GEN: for.cond.inner:
				; GEN-NOT: call void @llvm.instrprof.increment
				%j.0 = phi i32 [ 0, %for.body.outer ], [ %inc.1, %for.inc.inner ]
				%sum.1 = phi i32 [ %sum.0, %for.body.outer ], [ %inc, %for.inc.inner ]
				%cmp2 = icmp slt i32 %j.0, %s
				br i1 %cmp2, label %for.body.inner, label %for.end.inner
				; USE: br i1 %cmp2, label %for.body.inner, label %for.end.inner
				; USE-SAME: !prof ![[BW_FOR_COND_INNER:[0-9]+]]

				for.body.inner:
				; GEN: for.body.inner:
				; GEN-NOT: call void @llvm.instrprof.increment
				%inc = add nsw i32 %sum.1, 1
				br label %for.inc.inner

				for.inc.inner:
				; GEN: for.inc.inner:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([15 x i8], [15 x i8]* @__llvm_profile_name_test_nested_for, i32 0, i32 0), i64 53929068288, i32 3, i32 0)
				%inc.1 = add nsw i32 %j.0, 1
				br label %for.cond.inner

				for.end.inner:
				; GEN: for.end.inner:
				br label %for.inc.outer

				for.inc.outer:
				; GEN: for.inc.outer:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([15 x i8], [15 x i8]* @__llvm_profile_name_test_nested_for, i32 0, i32 0), i64 53929068288, i32 3, i32 1)
				%inc.2 = add nsw i32 %i.0, 1
				br label %for.cond.outer

				for.end.outer:
				; GEN: for.end.outer:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([15 x i8], [15 x i8]* @__llvm_profile_name_test_nested_for, i32 0, i32 0), i64 53929068288, i32 3, i32 2)
				ret i32 %sum.0
				}

				; USE-DAG: ![[BW_FOR_COND_OUTER]] = !{!"branch_weights", i32 10, i32 6}
				; USE-DAG: ![[BW_FOR_COND_INNER]] = !{!"branch_weights", i32 33, i32 10}

test/Transforms/PGOProfile/single_bb.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; GEN: @__llvm_profile_name_single_bb = private constant [9 x i8] c"single_bb"

				define i32 @single_bb() {
				entry:
				; GEN: entry:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([9 x i8], [9 x i8]* @__llvm_profile_name_single_bb, i32 0, i32 0), i64 12884901887, i32 1, i32 0)
				ret i32 0
				}

test/Transforms/PGOProfile/switch.ll

This file was added.

				; RUN: opt < %s -pgo-instr-gen -S | FileCheck %s --check-prefix=GEN
				; RUN: llvm-profdata merge %S/Inputs/switch.proftext -o %T/switch.profdata
				; RUN: opt < %s -pgo-instr-use -pgo-test-profile-file=%T/switch.profdata -S | FileCheck %s --check-prefix=USE
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; GEN: @__llvm_profile_name_test_switch = private constant [11 x i8] c"test_switch"

				define void @test_switch(i32 %i) {
				entry:
				; GEN: entry:
				; GEN-NOT: call void @llvm.instrprof.increment
				switch i32 %i, label %sw.default [
				i32 1, label %sw.bb
				i32 2, label %sw.bb1
				i32 3, label %sw.bb2
				]
				; USE: ]
				; USE-SAME: !prof ![[BW_SWITCH:[0-9]+]]
				; USE: ![[BW_SWITCH]] = !{!"branch_weights", i32 3, i32 2, i32 0, i32 5}

				sw.bb:
				; GEN: sw.bb:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @__llvm_profile_name_test_switch, i32 0, i32 0), i64 46200943743, i32 4, i32 2)
				br label %sw.epilog

				sw.bb1:
				; GEN: sw.bb1:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @__llvm_profile_name_test_switch, i32 0, i32 0), i64 46200943743, i32 4, i32 0)
				br label %sw.epilog

				sw.bb2:
				; GEN: sw.bb2:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @__llvm_profile_name_test_switch, i32 0, i32 0), i64 46200943743, i32 4, i32 1)
				br label %sw.epilog

				sw.default:
				; GEN: sw.default:
				; GEN: call void @llvm.instrprof.increment(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @__llvm_profile_name_test_switch, i32 0, i32 0), i64 46200943743, i32 4, i32 3)
				br label %sw.epilog

				sw.epilog:
				; GEN: sw.epilog:
				; GEN-NOT: call void @llvm.instrprof.increment
				ret void
				; GEN: ret void
				}
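
Note that the counter indices in switch.ll do not follow the switch's successor order: sw.bb1 holds counter 0, sw.bb2 counter 1, sw.bb counter 2, and sw.default counter 3, while `branch_weights` lists the default destination first and then the cases in order. The sketch below shows that reordering for this one test. It rests on assumptions: the counter values 0, 5, 2, 3 are inferred from the expected weights rather than read from switch.proftext, `weightsInSuccessorOrder` is a hypothetical helper that is not part of the patch, and the direct lookup only works here because every successor block carries its own counter.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reorders raw counter values into successor order for one terminator.
// counterIndexForSuccessor[i] is the index of the counter found in the block
// reached through successor i (successor 0 is the switch default).
std::vector<uint64_t> weightsInSuccessorOrder(
    const std::vector<uint64_t> &counters,
    const std::vector<unsigned> &counterIndexForSuccessor) {
  std::vector<uint64_t> weights;
  weights.reserve(counterIndexForSuccessor.size());
  for (unsigned idx : counterIndexForSuccessor) {
    assert(idx < counters.size() && "counter index out of range");
    weights.push_back(counters[idx]);
  }
  return weights;
}
```

For switch.ll the successor order is sw.default, sw.bb, sw.bb1, sw.bb2, so the index map is {3, 2, 0, 1}; applied to the assumed counters {0, 5, 2, 3} it gives 3, 2, 0, 5, the weights the USE check expects.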

Revision Contents

Diff 41686
