This is an archive of the discontinued LLVM Phabricator instance.

Differential D17212

[ThinLTO] Support for reference graph in per-module and combined summary.
ClosedPublic

Authored by tejohnson on Feb 12 2016, 1:21 PM.

Details

Reviewers

davidxl
mehdi_amini

Commits

rG76a1c1d0ba05: [ThinLTO] Support for reference graph in per-module and combined summary.
rL263275: [ThinLTO] Support for reference graph in per-module and combined summary.

Summary

This patch adds support for including a full reference graph including
call graph edges and other GV references in the summary.

The reference graph edges can be used to make importing decisions
without materializing any source modules, can be used in the plugin
to make file staging decisions for distributed build systems, and is
expected to have other uses.

The call graph edges are recorded in each function summary in the
bitcode via a list of <CalleeValueId, StaticCount> pairs when no PGO
data exists, or <CalleeValueId, StaticCount, ProfileCount> triples when
there is PGO, where the ValueId can be mapped to the function GUID via
the ValueSymbolTable. In the function index in memory, the call graph
edges reference the target via the CalleeGUID instead of the
CalleeValueId.

The reference graph edges are recorded in each summary record with a
list of referenced value IDs, which can be mapped to value GUID via the
ValueSymbolTable.
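
As a rough illustration of the value id to GUID translation described above, the sketch below resolves recorded call edges into GUID-keyed edges. It is not the patch's code: the struct names, the `ValueIdToGUIDMap` alias (a stand-in for the association derived from the ValueSymbolTable) and the choice to skip ids without an entry are all assumptions made for the example.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical shape of an edge as recorded in a per-module summary record.
struct RecordedCallEdge {
  unsigned CalleeValueId;
  uint64_t StaticCount;
  std::optional<uint64_t> ProfileCount; // set only when PGO data exists
};

// Hypothetical shape of the same edge in the in-memory index.
struct IndexCallEdge {
  uint64_t CalleeGUID;
  uint64_t StaticCount;
  std::optional<uint64_t> ProfileCount;
};

// Stand-in for the value id -> GUID association obtained from the VST.
using ValueIdToGUIDMap = std::unordered_map<unsigned, uint64_t>;

std::vector<IndexCallEdge>
resolveCallEdges(const std::vector<RecordedCallEdge> &Edges,
                 const ValueIdToGUIDMap &Map) {
  std::vector<IndexCallEdge> Out;
  Out.reserve(Edges.size());
  for (const RecordedCallEdge &E : Edges) {
    auto It = Map.find(E.CalleeValueId);
    if (It == Map.end())
      continue; // assumption: ignore value ids that have no VST entry
    Out.push_back({It->second, E.StaticCount, E.ProfileCount});
  }
  return Out;
}
```

Reference edges could be resolved the same way, with a plain list of value ids in place of the count-carrying tuples.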

Additionally, a new summary record type is added to record references
from global variable initializers. A number of bitcode records and data
structures have been renamed to reflect the newly expanded scope of the
summary beyond functions. More cleanup will follow.

Measurements on 483.xalancbmk show the following increases in object and
combined index sizes:

Overall .o bitcode increase is 1.34% without PGO and 1.54% with PGO
The combined index increase is 116% without PGO and 120% with PGO
The aggregate size of the .o bitcode and combined index together increased 3.22% without PGO and 3.39% with PGO (the combined index size increase is large taken alone, but its size is still on the same order of magnitude as the larger .o files).

Event Timeline

tejohnson updated this revision to Diff 47847. Feb 12 2016, 1:21 PM

tejohnson retitled this revision to [ThinLTO] Support for call graph in per-module and combined summary.

tejohnson updated this object.

tejohnson added reviewers: mehdi_amini, davidxl.

tejohnson added a subscriber: llvm-commits.

Herald added a subscriber: mehdi_amini. Feb 12 2016, 1:22 PM

davidxl added inline comments. Feb 16 2016, 1:03 PM

include/llvm/Bitcode/LLVMBitCodes.h
190–202	A side note -- it might be useful to reserve some bits for general function attributes such as 'address-taken' etc.
include/llvm/IR/FunctionInfo.h
97	Without profile data, it might be useful to record the number of static callsites from a function to the callee. This can help compiler backend to make better global decisions later (e.g. better inlining and enable more function GC).
286	Is this member really needed? The helper is not intended to own the summary.
lib/Bitcode/Reader/BitcodeReader.cpp
5708	Is it better to introduce a symbolic name for the start index instead of hardcoded 3?
lib/Bitcode/Writer/BitcodeWriter.cpp
2520	Should this be called 'getBlockProfileCount'? This utility method belongs in include/llvm/ProfileData/ProfileCommon.h.
2864–2868	Should the type be SmallVector<uint64_t, 64> ? GUID is 64 bit type.

+eraman This patch defines scaledCount() method which should probably be in ProfileCommon.h. There is a similar method in your inliner patch which can be moved there too.

tejohnson added inline comments. Feb 23 2016, 2:23 PM

include/llvm/Bitcode/LLVMBitCodes.h
190–202	What is the advantage of saving bits now? The format is still highly in flux anyway, and once it is nailed down eventually we would need to use some kind of version id to specify the change regardless of whether it was using saved bits or not.
include/llvm/IR/FunctionInfo.h
97	I could do that by overloading the count field to hold the static number of callsites. Alternatively, for statically-generated profile information is it useful to record the block frequency sums or something like that?
286	The helper does own the function summary temporarily. We create it when the function summary section is read, then later transfer ownership to the index when the VST is read. The reason why I need to keep a non-owning pointer to the summary is that later on after all VST entries are read from the combined index I need to set up the call graph edges in the combined index, and for that I need to keep the association here between the function summary and the CallGraphEdgeValueIdList.
lib/Bitcode/Reader/BitcodeReader.cpp
5708	I can do that. There's a lot of hardcoded start indices just like this in the file already, but I agree a symbolic name would be clearer.
lib/Bitcode/Writer/BitcodeWriter.cpp
2520	Good idea. Will rename and move.
2864–2868	Good point, it should be uint64_t, but for a different reason. We aren't writing the GUID, but rather a value id which is unsigned. But we are writing the profile count when we have profile data, which is uint64_t. Fixing it here and when writing the combined summary.

davidxl added inline comments. Feb 23 2016, 5:23 PM

include/llvm/Bitcode/LLVMBitCodes.h
190–202	ok.
include/llvm/IR/FunctionInfo.h
97	Regarding static # of callsites -- another way is to build a multi-graph -- basically do not collapse all edges from one caller to the same callee. How much space do we save there? If we want to save space, there is a better way to do this -- we can build a graph with edges from module to the callee. Such a module->function edge contains the following information : { Aggregate call count, max call count, static # of sites } It is probably not useful to record static block frequency.
286	owns it temporarily in what sense? Does it have a chance to de-allocate it?

davidxl added inline comments. Feb 23 2016, 7:58 PM

include/llvm/IR/FunctionInfo.h
97	My suggestion regarding using module->function edge can not replace the aggregate function->function call edge here -- but it can be emitted as additional information to track static callsite info -- but that is independent of what this patch does.

tejohnson added inline comments. Feb 24 2016, 7:46 AM

include/llvm/IR/FunctionInfo.h
97	Right, we still want the function->function edges for better precision. Will consider module->function edges at a later point. I will need to do some measurements to see what the overhead is of not collapsing the edges within each function. But if we simply add a static callsite count (possibly both with and without PGO), I think that may be sufficient.
286	It passes ownership to the function index in memory when the corresponding VST entry is created. After all the VST entries are created, it uses the saved (non-owned) pointer to create the call graph edges in the function index in memory (need to wait until all the VST entries are parsed as we need to know all the value id and GUIDs). See FunctionIndexBitcodeReader::parseValueSymbolTable (note the function return is in the EndBlock case in the switch statement, not at the bottom of the routine). The other alternative would be to keep a mapping there from the FunctionSummaryIOHelper object to the corresponding FunctionSummary pointer (created when we parse the corresponding VST entry and transfer the ownership of the unique_ptr to the function index). I think that is probably cleaner and more straightforward. Will make that change. As an aside, I just noticed that the ValueIdToCallGraphGUIDMap is only used in this routine, I will move it to function local instead of being on the FunctionIndexBitcodeReader class.

davidxl added inline comments. Feb 24 2016, 9:19 AM

include/llvm/IR/FunctionInfo.h
286	Right -- if the IO helper class only conditionally passes the ownership to functionIndex in some cases, the unique pointer is needed. However, if it is intended to be always passed to another object, the helper just needs to keep a naked pointer to the summary as it does not really own it.

tejohnson added inline comments. Feb 24 2016, 9:34 AM

include/llvm/IR/FunctionInfo.h
286	In the BitcodeWriter it stays owned by the helper. In the BitcodeReader ownership is always transferred to the in-memory function index. Even in the latter case, I think creating a unique_ptr immediately, rather than creating a naked pointer and only creating a unique_ptr when the VST is read, is cleaner as it makes the current ownership clear and avoids malloc.
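
The ownership flow discussed in this thread (the helper creates the summary as a unique_ptr, hands it to the in-memory index, and keeps a non-owning pointer so it can fill in call graph edges once all GUIDs are known) could be sketched roughly as follows. All types here are simplified, hypothetical stand-ins rather than the classes in the patch, and the sketch performs the hand-off and the edge setup in one function, whereas the reader does them at different points while parsing.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified summary: just the resolved callee GUIDs.
struct Summary {
  std::vector<uint64_t> CalleeGUIDs;
};

// Hypothetical index that owns all summaries, keyed by GUID.
struct Index {
  std::map<uint64_t, std::unique_ptr<Summary>> Summaries;
};

class SummaryIOHelper {
  std::unique_ptr<Summary> Owned; // owns the summary until it is handed off
  Summary *Observed = nullptr;    // non-owning; still usable after hand-off
public:
  std::vector<unsigned> CalleeValueIds; // edges as read, keyed by value id

  SummaryIOHelper()
      : Owned(std::make_unique<Summary>()), Observed(Owned.get()) {}

  // Named take* rather than get* because ownership moves to the caller.
  std::unique_ptr<Summary> takeSummary() { return std::move(Owned); }
  Summary *summary() const { return Observed; }
  bool ownsSummary() const { return Owned != nullptr; }
};

void finishReading(SummaryIOHelper &Helper, Index &TheIndex, uint64_t GUID,
                   const std::map<unsigned, uint64_t> &ValueIdToGUID) {
  // Step 1: the index becomes the owner.
  TheIndex.Summaries[GUID] = Helper.takeSummary();
  // Step 2: once every value id has a GUID, set up the edges through the
  // non-owning pointer. at() throws if an id is unknown.
  for (unsigned Id : Helper.CalleeValueIds)
    Helper.summary()->CalleeGUIDs.push_back(ValueIdToGUID.at(Id));
}
```

The non-owning pointer stays valid only as long as the index keeps the summary alive, which is the main cost of this design.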

Address most of davidxl's comments (except keeping track of total static
count of call edges, that is still TODO).

I have a concern about the "Only inter-module calls are recorded" part. This prevents making accurate, purely summary-based importing decisions (which I'm planning on doing).
Have you measured the impact of representing the full call-graph in the summary?

In D17212#360805, @joker.eph wrote:

I have a concern about the "Only inter-module calls are recorded" part. This prevents making accurate, purely summary-based importing decisions (which I'm planning on doing).
Have you measured the impact of representing the full call-graph in the summary?

Good point, I forgot that would be needed if we want to make summary-only importing decisions. Let me add the intra-module edges and measure the added overhead.

In D17212#360841, @tejohnson wrote:

In D17212#360805, @joker.eph wrote:

I have a concern about the "Only inter-module calls are recorded" part. This prevents making accurate, purely summary-based importing decisions (which I'm planning on doing).
Have you measured the impact of representing the full call-graph in the summary?

Good point, I forgot that would be needed if we want to make summary-only importing decisions. Let me add the intra-module edges and measure the added overhead.

For 483.xalancbmk, adding in the intra-module calls adds ~4% to the combined index size (so 26-27% over no callgraph) with and without PGO. This seems reasonable. Will update the patch to include those.

Include intra-module calls.

eraman added inline comments. Feb 24 2016, 2:32 PM

include/llvm/ProfileData/ProfileCommon.h
98	Why not just do ScaledCount = EntryCount * BlockFreq / EntryFreq using APInt<128>, take the lower 64 bits if the result fits within uint64_t and use UINT64_MAX otherwise?
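
To make the suggested computation concrete, here is a minimal sketch of a saturating scaled count. It is not the code from the patch: it uses the `unsigned __int128` extension available in GCC and Clang as a stand-in for a 128-bit APInt, and both the name `scaledCountSketch` and the decision to return 0 when EntryFreq is 0 are assumptions.

```cpp
#include <cstdint>
#include <limits>

// Computes EntryCount * BlockFreq / EntryFreq with a 128-bit intermediate
// and clamps the result to UINT64_MAX when it does not fit in 64 bits.
uint64_t scaledCountSketch(uint64_t EntryCount, uint64_t BlockFreq,
                           uint64_t EntryFreq) {
  if (EntryFreq == 0)
    return 0; // assumption: no entry frequency means no usable count
  // The product of two 64-bit values always fits in 128 bits.
  unsigned __int128 Wide =
      static_cast<unsigned __int128>(EntryCount) * BlockFreq / EntryFreq;
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  return Wide > Max ? Max : static_cast<uint64_t>(Wide);
}
```

Doing the multiplication before the division keeps precision, which is why the wider intermediate type is needed in the first place.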

I skimmed through and it sounds good to me, please see a few comments below.

mehdi_amini added inline comments. Feb 25 2016, 12:45 AM

include/llvm/Bitcode/LLVMBitCodes.h
190–203	Side note: we will need to nail the format in the coming weeks/months somehow, or have a plan for backward compatibility (auto upgrade...).
195	It is a bit annoying to have to manually overload for every "optional" data. It may also lead to a combinatory explosion if there were many. Unfortunately the only alternative in the current bitcode format is to have function summary stored as block instead of record...
include/llvm/IR/FunctionInfo.h
97	"Without profile data, it might be useful to record the number of static callsites from a function to the callee. This can help compiler backend to make better global decisions later (e.g. better inlining and enable more function GC)." Can you elaborate on how you would use this information and how it would help inlining and GC?
298	I think we usually prefix `takeXXX` and not `getXXX` when ownership is transferred.

Responses to Mehdi's comments.

include/llvm/Bitcode/LLVMBitCodes.h
190–203	Agreed, although I think we are still in the heavy churn tuning phase.
195	Actually my first implementation of this used a single record id. What I did was add a bit to indicate whether there was any profile data, so that the reader knew how to interpret the array data at the end (as either a list of just callee ids, or <callee id, count> pairs). I think I used the size of the record to deduce whether there was any call information. I decided to switch to explicitly encoding this information in the record id because it seemed cleaner and more consistent, and also removed a bit from each record (the profile flag). This reduces the size for large .o bitcode files and for the combined index, although the size for small .o bitcode files is a bit larger due to the 2 extra abbrev ids. I could probably go back to using the size to deduce the presence or absence of call information, and reduce this to 2 per summary type. WDYT?
include/llvm/IR/FunctionInfo.h
97	I think what David was alluding to was similar to what I was describing on IRC yesterday - if you know you have a single static callsite via the summary, you can give the same inline cost benefit currently given for internal(ized) single-callsite functions. Then if it is inlined, hopefully the original out-of-line function can be eliminated via linker GC. To do this, the combined index creation step would need to aggregate the # static callsites to each function and put that into the callee function's combined summary.
298	Ok, will fix.
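
The aggregation step mentioned in the reply on line 97 (summing the static callsites that target each function while building the combined index, so that single-callsite functions can be recognized) might look roughly like this. The types and function names are hypothetical and only model the idea; they do not reflect the summary layout in the patch.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical call edge: callee GUID plus static callsite count.
struct CallEdge {
  uint64_t CalleeGUID;
  unsigned StaticCallsites;
};

// Hypothetical per-function summary reduced to its outgoing call edges.
struct FunctionEdges {
  uint64_t CallerGUID;
  std::vector<CallEdge> Calls;
};

// Sums, over all callers, the static callsites that target each callee.
std::unordered_map<uint64_t, unsigned>
totalStaticCallsites(const std::vector<FunctionEdges> &AllFunctions) {
  std::unordered_map<uint64_t, unsigned> Totals;
  for (const FunctionEdges &F : AllFunctions)
    for (const CallEdge &E : F.Calls)
      Totals[E.CalleeGUID] += E.StaticCallsites;
  return Totals;
}

// True only if exactly one static callsite to GUID was seen program-wide.
bool hasSingleStaticCallsite(
    const std::unordered_map<uint64_t, unsigned> &Totals, uint64_t GUID) {
  auto It = Totals.find(GUID);
  return It != Totals.end() && It->second == 1;
}
```

A function that passes this check is a candidate for the same inline bonus given to internal single-callsite functions, provided no other kind of reference keeps it alive.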

Response to eraman's comment.

include/llvm/ProfileData/ProfileCommon.h
98	I looked at APInt briefly when I was thinking about how to do this, but was concerned that it was going to be inefficient in the common case. However, it looks like that isn't the case as it has a fast path. Will change this to use it. Note I actually stole this code from RAGreedy::initializeCSRCost() (which could probably be refactored to use this new method once it goes in).

mehdi_amini added inline comments. Feb 25 2016, 8:49 AM

include/llvm/IR/FunctionInfo.h
291	Why this? If needed, replace with `FunctionSummaryIOHelper() = default;`

tejohnson added inline comments. Feb 25 2016, 9:31 AM

include/llvm/IR/FunctionInfo.h
291	This was leftover from when I had a member variable that was being passed in and initialized in the constructor. Removing this now.

Address comments by eraman and jokereph.

Add static callsite count to summary callgraph edges.
For 483.xalancbmk this increases the size of the combined index
by another 3.5%.

mehdi_amini added inline comments. Feb 25 2016, 11:54 PM

include/llvm/IR/FunctionInfo.h
105	Missing const variant.

tejohnson added inline comments. Feb 26 2016, 6:31 AM

include/llvm/IR/FunctionInfo.h
105	Adding that now.

Add const variant of FunctionSummary::edges

How could we integrate accesses to global variable as part of this?
It turns out that to be able to benefit from the linker information on what symbol is exported during the import, this is a must.

In D17212#362864, @joker.eph wrote:

How could we integrate accesses to global variable as part of this?
It turns out that to be able to benefit from the linker information on what symbol is exported during the import, this is a must.

Well without it you still can see which function symbols will be exported, just not the variables, so you are running with less info and I guess need to assume that all static variables will be exposed and promote them. To refine that behavior for variables, yes, we'd need additional info in the summary.

(For davidxl or anyone else who didn't see the IRC conversation, Mehdi is looking at doing pure summary-based importing decisions in the linker step, then giving this info to the ThinLTO backends to avoid promotion of local values that aren't exported. For a distributed build if we wanted to do it this way the importing decisions would all be made in the plugin step, then would need to be serialized out for the distributed backends to check.)

Two possibilities depending on the fidelity of the info you think you need:

1) One possibility is to just put a flag in the function summary if it accesses *any* local variables, and adjust the importing threshold accordingly. Then in the ThinLTO backend for the exporting module you need to check which of your own functions are on the import list, and which local variables they access, and promote accordingly.

2) If it will be really beneficial to note exactly which local variables are accessed by which function, we'll need to broaden the edges list to include accesses to variables (I assume you only care about local variables here). E.g. the per-module summary edge list for a function would need to include value ids for any local variables referenced by that function (not sure that the other parts of the triple, the static and profile counts, are needed for that). Then in the combined VST we need to include entries for GUIDs of things that don't have a function summary, but are referenced by these edges. When accessing a function summary edge list for a candidate function to import, you could then see the GUID of any local variables accessed. You wouldn't know them by name, but that would be enough for a heuristic like "if >=N hot import candidate functions from module M access a local variable with GUID G, go ahead and import those and let G be promoted by the backend" (which, like in approach #1, needs to check which local variables are accessed by any functions on an import list).

Obviously 1) is easier and cheaper space-wise. What are your thoughts?

Is there a need to mark it? Does the following work?

1. Use the summary call graph to make importing decisions.
2. For any function that is imported by another module, mark it as 'exported'.
3. When processing the exporting module, check any exported function and promote any statics it references.

David

In D17212#362994, @davidxl wrote:

Is there a need to mark it? Does the following work?

1. Use the summary call graph to make importing decisions.
2. For any function that is imported by another module, mark it as 'exported'.
3. When processing the exporting module, check any exported function and promote any statics it references.

David

That does work for correctness, but Mehdi was proposing that functions accessing local values need to meet a higher importing threshold so that we don't cause too many promotions due to over-eager importing.

In terms of serializing the info out for doing any of this with a distributed build, we could do it via an "exported" bit in the combined summary though.

tejohnson mentioned this in D17656: [ThinLTO] (WIP) Include total static callsite counts in function summaries. Feb 26 2016, 1:50 PM

mehdi_amini added inline comments. Feb 28 2016, 7:53 PM

lib/Bitcode/Reader/BitcodeReader.cpp
5528	Shouldn't this be `ValueIdToCallGraphGUIDMap[ValueID] = Function::getGUID(FunctionGlobalId);` ?

tejohnson added inline comments. Feb 29 2016, 6:42 AM

lib/Bitcode/Reader/BitcodeReader.cpp
5528	Good catch! Yes, it should be. The VST_CODE_ENTRY case does not need to invoke getGlobalIdentifier since those are externally-defined functions (and so not possibly local). But looking at the other places here where we set up this map I realized that the VST_CODE_COMBINED_FNENTRY case was wrong - we already have the GUID and there is no ValueName in that record. I have a fix for both coming up shortly.

Fix ValueIdToCallGraphGUIDMap entries.

Update patch to include full reference graph.

Some notable changes from prior patch:

include non-call reference edges in summary records
introduce summary entries for global variable initializations
add the VSTOFFSET record to the combined index bitcode file, and use it when reading both the per-module and combined indices to parse the VST before the summary section (analogous to how it is used when parsing normal bitcode to parse the VST before the function blocks), which reduces the amount of bookkeeping required to associate the VST entry with the value ids used in the summary section.

Bitcode name changes:

add new FS_*_GLOBALVAR_INIT_REFS records to hold ref graph edges from global variable inits to referenced global vars.
changed the VST_CODE_COMBINED_FNENTRY name to VST_CODE_COMBINED_GVDEFENTRY to reflect the broadened scope (can correspond to either a function or variable def and associated summary).
changed FUNCTION_SUMMARY_BLOCK_ID to GLOBALVAL_SUMMARY_BLOCK_ID to reflect additional scope

Data structure and related naming changes:

Added a new GlobalValueSummary type which holds summary info common to both the function and globalvar summaries.
Made the existing FunctionSummary record a derived class of the new GlobalValueSummary.
Added a new derived GlobalVarSummary class of GlobalValueSummary.
Renamed FunctionInfo* -> GlobalValueInfo*
Renamed FunctionSummary -> GlobalValueSummary or just Summary as appropriate
Misc other variable and method renamings to reflect above changes.
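
For readers unfamiliar with the resulting layout, the class relationships listed above can be pictured with the following heavily simplified sketch. It reuses the class names from the list, but the members, the kind enum and the method names are assumptions made for illustration and should not be read as the actual definitions in FunctionInfo.h.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Base class: state common to function and global variable summaries.
class GlobalValueSummary {
public:
  enum SummaryKind { FunctionKind, GlobalVarKind };
  virtual ~GlobalValueSummary() = default;

  SummaryKind getSummaryKind() const { return Kind; }
  void addRefEdge(uint64_t GUID) { RefEdges.push_back(GUID); }
  const std::vector<uint64_t> &refs() const { return RefEdges; }

protected:
  // Protected so that only the derived summary types can be created.
  explicit GlobalValueSummary(SummaryKind K) : Kind(K) {}

private:
  SummaryKind Kind;
  std::vector<uint64_t> RefEdges; // non-call references, by GUID
};

// Function summaries add call graph edges on top of the reference edges.
class FunctionSummary : public GlobalValueSummary {
public:
  FunctionSummary() : GlobalValueSummary(FunctionKind) {}
  void addCallEdge(uint64_t CalleeGUID, unsigned StaticCount) {
    CallGraphEdges.emplace_back(CalleeGUID, StaticCount);
  }
  const std::vector<std::pair<uint64_t, unsigned>> &edges() const {
    return CallGraphEdges;
  }

private:
  std::vector<std::pair<uint64_t, unsigned>> CallGraphEdges;
};

// Global variable summaries only carry references from their initializer.
class GlobalVarSummary : public GlobalValueSummary {
public:
  GlobalVarSummary() : GlobalValueSummary(GlobalVarKind) {}
};
```

Keeping the reference edges in the base class is what lets code that walks the reference graph treat both summary kinds uniformly.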

TODO:

Rename FunctionInfoIndex to something more broad (ModuleIndex?) (ditto for related classes like FunctionInfoIndexBitcodeReader).
Move getGUID and getGlobalIdentifier from Function to GlobalValue class.
Some lingering FunctionSummary references that need to be updated to GlobalValueSummary or just Summary.

I will work on the TODO items now but wanted to get this patch out for review,
testing by Mehdi, and feedback.

Note that some of these changes (many of the renames) can be committed first
as NFC changes.

LGTM overall. Still some comments/questions below.

Thanks for the good work!

include/llvm/IR/FunctionInfo.h
73	Can't/shouldn't be protected?
91	why `to be written to...`?
99	I think the usual terminology in LLVM is "PGO" (I had to google FDO) (same below)
104	`F`? Also again the "to be written to" seems off to me.
146	Can't you have a single ctor? `GlobalValueInfo(uint64_t Offset = 0, std::unique_ptr<GlobalValueSummary> Summary = nullptr)`
lib/Bitcode/Reader/BitcodeReader.cpp
416	I guess more renaming could have been done here right? (I really don't mind, but since you cared to rename many places below...)
454	doc
5749	Mmmmm....
5759	Mmmmm (bis)....
lib/Bitcode/Writer/BitcodeWriter.cpp
920	I'd write it: `if (M->getValueSymbolTable().empty()) return 0; return WriteValueSymbolTableForwardDecl(Stream);`
2346	Why is it no longer valid?
2517–2518	Mind adding a one line comment?
2530	Guarantee on the recursion depth?
2582	`if(CS)` could be hoisted out of the loop.
2594	You depend on the order of the calls. If an instruction first adds a call to a function, but later on references it, you won't remove it from RefEdges. This may be intended: you may want an accurate count of refs + calls, and here you're trying to filter calls out of RefEdges. However, a call instruction could legitimately refer to the function (a call passing a function pointer as an argument, for instance).
2596	?
2617	?
2957	Isn't the opposite that the code is doing?

Another question: you reported measurements in the description, is it up to date with the new patch?

In D17212#369588, @joker.eph wrote:

Another question: you reported measurements in the description, is it up to date with the new patch?

Not yet, need to update that. The size of the thinlto.bc combined index went up quite a bit percentage wise with the ref graph, especially once I corrected it to recursively look for references, and added in the global variable init refs. Here are the new stats, only have non-PGO so far, will collect PGO as well and update the summary (once I make a fix described below to add linkage types to variables):

For 483.xalancbmk (non-PGO):
Total .o (bitcode) size: +1.24%
Combined .thinlto.bc size: +120%
Aggregate .o + .thinlto.bc size: +3.16%

I gave the last number because the size of the combined index .thinlto.bc file is on the same order of magnitude as the largest individual .o bitcode files, so the increase should be taken in that context.

include/llvm/IR/FunctionInfo.h
73	Good point, will make protected.
91	Good catch, this was leftover from the prior patch when it was part of a bitcode writer helper, didn't update it after moving here. Will fix.
99	Will fix...bummer that every compiler I have worked on seems to use a different term! PGO/FDO/PBO...bah.
104	Same issue as above, stale comment, will update.
146	Yes, will fix.
lib/Bitcode/Reader/BitcodeReader.cpp
416	Yep, this was part of the TODO list in this patch update. Working on that now. Specifically, I'm changing FunctionIndex* and FunctionInfoIndex* to ModuleSummaryIndex*. I left this for a follow-on update because already the renaming was pretty extensive, and this particular change bleeds into some of the interfaces and variable/field names used in other parts of the compiler (clang etc).
454	Will do.
5749	Oops. In my haste to get the renaming done and send this out, forgot that I planned to add a linkage type to the global var init records. Will add now.
5759	Ditto.
lib/Bitcode/Writer/BitcodeWriter.cpp
920	Good idea, will fix.
2346	Good catch. I think this got removed when I was using a different data structure here for awhile, and I missed it when restoring the old approach. Will add back.
2517–2518	Will do.
2530	Beyond the Visited set check earlier? What would you like to see?
2582	Yes, missed this when I cleaned things up from when I was initially looking for non-call references inside this loop.
2594	I don't think the order of calls matters? The reference list is populated once outside of this loop. On your second point, that is true that this will cause a function both called and referenced in another way to only be recorded as called by this function. Is it important to list both types of references? I was thinking that it was important for importing needs to distinguish the functions being called from the other non-call references, but that essentially the combination of the two are the full reference set of the function. If I should put it in both places, I will need to figure out how to distinguish the two types of function references in findRefEdges.
2596	More cruft left from when I was looking for non-call refs in this loop, will remove
2617	Incomplete cleanup of debugging output, will remove.
2957	Yes, good catch. I think what I initially wanted was to record them this way, but the issue is that the alias doesn't have a separate summary, which is why we are grabbing the aliasee summary here. Will update the comment.

mehdi_amini added inline comments. Mar 8 2016, 11:42 AM

lib/Bitcode/Reader/BitcodeReader.cpp
416	Fine with me.
lib/Bitcode/Writer/BitcodeWriter.cpp
920	Re-reading it, the variable had the nice property of having a name, making it very clear what is returned. If you adopt the above suggested change, I would add a one line comment along the line of `// return the VSTOffsetPlaceholder if we have a VST`
2530	I was not worried about non-termination or correctness, but about a depth that would explode the stack. It was late and I couldn't think clearly enough to find the answer myself. I just got a coffee so I should be able to answer myself now ;) -> For every instruction you will pull transitively all the operands until you reach a global (or something already seen), i.e. this is a DFS search on the SSA graph. If I'm correct, a worklist would probably be appropriate here.
2582	Re-reading, I'm not even sure what this loop on the operands is doing at all! Wouldn't the code do exactly the same if you just remove the loop and turn the test into `if (CS)`?
2594	I pointed out what I saw as an inconsistency, but I may totally misunderstand what you're trying to do here. So I will assume in the following that:
RefEdges contains the global values referenced by the current function
CallGraphEdges contains the list of Functions called by the current function
Do we agree that we should have either of these properties:
1. A function that is part of CallGraphEdges should not be present in RefEdges even if it is referenced in another way than a call
2. A function that is part of CallGraphEdges should also be present in RefEdges if it is referenced in another way than a call
What I read from your code right now is:
3. A function that is part of CallGraphEdges may or may not be present in RefEdges if it is referenced in another way than a call.

tejohnson added inline comments. Mar 8 2016, 1:21 PM

lib/Bitcode/Writer/BitcodeWriter.cpp
920	Will do.
2530	I wouldn't have thought that an instruction would be large enough to cause stack explosion. But I have no issue with changing this to a worklist iteration instead.
2582	Right, this loop is bogus!
2594	Yep, I was distracted by the bogus loop above, we are collecting this per instruction so there is an ordering issue, and we are getting #3 which is undesirable. I think the way to fix this is to check for the callsite first, then pass in some info from the callsite to findRefEdges so that the callsite reference is itself ignored. I.e. pass in either the callee GV and skip it to get #1, or pass in the Use to get #2. I saw David's follow-on that he thinks #2 is best. I can try that.

mehdi_amini added inline comments. Mar 8 2016, 4:13 PM

lib/Bitcode/Writer/BitcodeWriter.cpp
2530	I may misunderstand, but it seems to me that the depth is not the width of a single instruction, but potentially on the order of the number of instructions in the Function. If I'm wrong and you're bounded somehow by the number of operands, then recursion is fine with me.

tejohnson added inline comments. Mar 9 2016, 7:21 AM

lib/Bitcode/Writer/BitcodeWriter.cpp
2530	If you are right then this needs to change beyond moving to a worklist, as I only intended to capture the references within a single instruction or variable def. The two entry points to this routine are in WriteFunction, where we pass in an Instruction as the User, and in WriteModuleLevelReferences, where we pass in a GlobalVariable (and I had confirmed that the operands of a variable are its initializer). We recursively walk the given User's operands(). I wouldn't have thought that we could jump from a given Instruction to other instructions in the function this way?

tejohnson added inline comments. Mar 9 2016, 11:57 AM

lib/Bitcode/Writer/BitcodeWriter.cpp
2594	Actually it turns out to be pretty simple, we only need to pass in the callee GV to findRefEdges to exclude it (because you can't have both a call and a non-call reference to a function in the same instruction, at least I couldn't find a way!). For xalancbmk this fix actually reduced the combined index and .o sizes a bit. It turns out that with the old flow we were inadvertently including intrinsics (which were ignored in the call graph edge list) in the reference list. In the new version of the patch I will upload shortly I've included a new test that checks for both issues (ensuring a function is in both the call list and ref list if it is accessed both ways by the function, and ignoring intrinsics).

Address Mehdi's review comments.

Fixes to findRefEdges discussed in review and IRC

Two main changes to findRefEdges:

1) Use worklist iteration instead of recursion

The reason is that when we invoke with an Instruction, it will iterate
into other instructions in the function when it encounters a use of
a local variable defined in another instruction. Ultimately we are
analyzing all instructions in the function, and we prevent duplicate
work by sharing the Visited set across all instructions in the function.
However, using a worklist approach is more efficient for large functions
where with recursion we may end up recursing many times (using much
stack).
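
The worklist traversal described above can be sketched on a toy value graph as follows. This is only an illustration of the technique: `Node` is a hypothetical stand-in for IR users and their operands, and `findRefsSketch` is not the findRefEdges from the patch.

```cpp
#include <set>
#include <vector>

// Toy value: either a global (a traversal leaf) or a local user with operands.
struct Node {
  bool IsGlobal = false;
  std::vector<const Node *> Operands;
};

// Collects every global reachable from Root through non-global operands.
// Visited is shared by the caller across all roots of one function, so a
// value reached from several instructions is only expanded once.
void findRefsSketch(const Node *Root, std::set<const Node *> &Visited,
                    std::set<const Node *> &Refs) {
  std::vector<const Node *> Worklist{Root};
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // already expanded from an earlier root or path
    for (const Node *Op : N->Operands) {
      if (Op->IsGlobal)
        Refs.insert(Op); // record the reference, do not look through it
      else
        Worklist.push_back(Op); // keep walking through local values
    }
  }
}
```

Because the pending work lives in a heap-allocated vector, the traversal depth is no longer bounded by the call stack, which is the concern raised in the review.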

2) More accurate detection and skipping of callsite references

The prior solution would miss the non-call reference to foo in the
following case:
foo((void *)foo);
since we were skipping all GVs that matched the callee GV.

Additionally, as noted in 1) we may traverse through local variables
defined on other instructions. Those other instructions may be calls
which return a value. We want to make sure we don't count calls in other
instructions encountered this way as non-call refs. Depending on the
order of traversal (e.g. a use on a phi encountered before the call
instruction), we can't count on the call being in the visited set.

Finally, doing the checking for callsites here avoids the need for
WriteFunction to special case call instructions, which has its own set
of issues. For example, one solution considered was for call
instructions to invoke findRefEdges on each of its data_ops(). This
means findRefEdges would need to distinguish between a GV user that was
passed in being either a reference (if it was a call instruction
operand) or a global variable for which we want to analyze initializers
(i.e. called from WriteModuleLevelReferences).

Finally, enhanced the new test quite a bit to check for the various
cases I encountered when fixing findRefEdges.
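
The two findRefEdges changes described above can be illustrated with a small standalone sketch. This is not the code from the patch: `Node`, its `Kind` enum, and the separate `callee` field are hypothetical stand-ins for LLVM's Value/User classes, chosen so the example compiles without LLVM. It shows a worklist replacing recursion, a Visited set shared across all instructions of a function, and the callee slot of a call being skipped while its data operands are still walked.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for an LLVM Value/User (assumption, not the real API).
struct Node {
  enum Kind { Instruction, Call, GlobalValue };
  Kind kind;
  std::vector<const Node *> operands; // data operands (or a global's initializer)
  const Node *callee;                 // called function, only meaningful for Call
};

// Collect the global values referenced through Root's operands.
// The callee slot of a Call is never walked, so a call alone does not
// produce a non-call reference. Visited is shared by the caller across
// every instruction of one function to avoid duplicate work.
void findRefEdges(const Node *Root, std::set<const Node *> &Refs,
                  std::set<const Node *> &Visited) {
  assert(Root && "Root must be non-null");
  std::vector<const Node *> Worklist{Root};
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // already analyzed via another instruction
    for (const Node *Op : N->operands) {
      if (Op->kind == Node::GlobalValue)
        Refs.insert(Op); // non-call reference; do not descend into it
      else
        Worklist.push_back(Op); // local value defined by another instruction
    }
  }
}
```

With this model, `foo((void *)foo)` is a Call whose `callee` is `foo` and whose single data operand is a cast of `foo`, so `foo` still ends up in the reference set. A call to `bar()` reached through a phi contributes nothing, regardless of traversal order.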

LGTM.
Thanks! I think we reached the point of "good enough". I feel this review was productive :)

This revision is now accepted and ready to land. Mar 10 2016, 10:20 AM

In D17212#372070, @joker.eph wrote:

LGTM.
Thanks! I think we reached the point of "good enough". I feel this review was productive :)

Great, thanks for the reviews! Agreed, the final state is much more complete and accurate.

Will update the title and summary along with updated stats before I commit.

tejohnson retitled this revision from "[ThinLTO] Support for call graph in per-module and combined summary." to "[ThinLTO] Support for reference graph in per-module and combined summary." Mar 11 2016, 6:57 AM

tejohnson updated this object.

tejohnson edited edge metadata.

tejohnson updated this object. Mar 11 2016, 7:00 AM

Closed by commit rL263275: [ThinLTO] Support for reference graph in per-module and combined summary. (authored by tejohnson). Mar 11 2016, 10:57 AM

This revision was automatically updated to reflect the committed changes.

This is the same as what was reviewed, except for 2 small changes that can be post-commit reviewed:

Mehdi: A bad interaction between my changes and your libLTO interfaces committed this week (which merges into the first index when creating the combined index, instead of creating a new combined index and merging everything into it). See removeEmptySummaryEntries() and its callsite for the detailed explanation of what I needed to do since I am now parsing the VST first and need to do some eager creation of index entries that was being cleaned up as we merged the indexes.

Easwaran: Updated getBlockProfileCount as per our conversation this morning to do the multiplication before the integer divide in order to avoid inaccuracies due to truncation.
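
To make the truncation issue concrete, here is a minimal sketch contrasting the two orders of operation. The function names are hypothetical and this is not the committed getBlockProfileCount. It also assumes the compiler provides the `unsigned __int128` extension (GCC and Clang on 64-bit targets), which stands in for the wide arithmetic suggested in review.

```cpp
#include <cassert>
#include <cstdint>

// Divide first: the integer quotient BlockFreq / EntryFreq is truncated
// before scaling, so any fractional ratio is lost.
inline uint64_t scaleDivideFirst(uint64_t BlockFreq, uint64_t EntryFreq,
                                 uint64_t EntryCount) {
  assert(EntryFreq != 0 && "entry frequency must be non-zero");
  return EntryCount * (BlockFreq / EntryFreq);
}

// Multiply first in 128 bits, then divide, saturating at UINT64_MAX.
inline uint64_t scaleMultiplyFirst(uint64_t BlockFreq, uint64_t EntryFreq,
                                   uint64_t EntryCount) {
  assert(EntryFreq != 0 && "entry frequency must be non-zero");
  unsigned __int128 Product =
      static_cast<unsigned __int128>(EntryCount) * BlockFreq;
  unsigned __int128 Scaled = Product / EntryFreq;
  return Scaled > UINT64_MAX ? UINT64_MAX : static_cast<uint64_t>(Scaled);
}
```

For a block executed half as often as the entry block (BlockFreq 1, EntryFreq 2) with an entry count of 100, dividing first yields 0 while multiplying first yields 50.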

Revision Contents

Path                                                          Size
include/llvm/Bitcode/LLVMBitCodes.h                           17 lines
include/llvm/IR/FunctionInfo.h                                69 lines
include/llvm/ProfileData/ProfileCommon.h                      20 lines
lib/Bitcode/Reader/BitcodeReader.cpp                          126 lines
lib/Bitcode/Writer/BitcodeWriter.cpp                          337 lines
lib/Bitcode/Writer/LLVMBuild.txt                              2 lines
test/Bitcode/Inputs/thinlto-function-summary-callgraph-pgo.ll 11 lines
test/Bitcode/Inputs/thinlto-function-summary-callgraph.ll     10 lines
test/Bitcode/thinlto-function-summary-callgraph-pgo.ll        43 lines
test/Bitcode/thinlto-function-summary-callgraph.ll            39 lines
test/Bitcode/thinlto-function-summary.ll                      6 lines
test/Bitcode/thinlto-summary-linkage-types.ll                 32 lines
test/tools/gold/X86/thinlto.ll                                12 lines
test/tools/llvm-lto/thinlto.ll                                12 lines
tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp                     8 lines

Diff 48980

include/llvm/Bitcode/LLVMBitCodes.h

[169 unchanged lines hidden, in enum TypeSymtabCodes]
TST_CODE_ENTRY = 1 // TST_ENTRY: [typeid, namechar x N]		TST_CODE_ENTRY = 1 // TST_ENTRY: [typeid, namechar x N]
};		};

// Value symbol table codes.		// Value symbol table codes.
enum ValueSymtabCodes {		enum ValueSymtabCodes {
VST_CODE_ENTRY = 1, // VST_ENTRY: [valueid, namechar x N]		VST_CODE_ENTRY = 1, // VST_ENTRY: [valueid, namechar x N]
VST_CODE_BBENTRY = 2, // VST_BBENTRY: [bbid, namechar x N]		VST_CODE_BBENTRY = 2, // VST_BBENTRY: [bbid, namechar x N]
VST_CODE_FNENTRY = 3, // VST_FNENTRY: [valueid, offset, namechar x N]		VST_CODE_FNENTRY = 3, // VST_FNENTRY: [valueid, offset, namechar x N]
// VST_COMBINED_FNENTRY: [funcsumoffset, funcguid]		// VST_COMBINED_FNENTRY: [valueid, funcsumoffset, funcguid]
VST_CODE_COMBINED_FNENTRY = 4		VST_CODE_COMBINED_FNENTRY = 4
};		};

// The module path symbol table only has one code (MST_CODE_ENTRY).		// The module path symbol table only has one code (MST_CODE_ENTRY).
enum ModulePathSymtabCodes {		enum ModulePathSymtabCodes {
MST_CODE_ENTRY = 1, // MST_ENTRY: [modid, namechar x N]		MST_CODE_ENTRY = 1, // MST_ENTRY: [modid, namechar x N]
};		};

// The function summary section uses different codes in the per-module		// The function summary section uses different codes in the per-module
// and combined index cases.		// and combined index cases.
enum FunctionSummarySymtabCodes {		enum FunctionSummarySymtabCodes {
FS_CODE_PERMODULE_ENTRY = 1, // FS_ENTRY: [valueid, linkage, instcount]		// PERMODULE_NOCALLS: [valueid, linkage, instcount]
FS_CODE_COMBINED_ENTRY = 2, // FS_ENTRY: [modid, linkage, instcount]		FS_PERMODULE_NOCALLS = 1,
		// PERMODULE_CALLS: [valueid, linkage, instcount, n x valueid]
		FS_PERMODULE_CALLS = 2,
		// PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
		// n x (valueid, count)]
		mehdi_amini: It is a bit annoying to have to manually overload for every "optional" data. It may also lead to a combinatorial explosion if there were many. Unfortunately the only alternative in the current bitcode format is to have the function summary stored as a block instead of a record...
		tejohnson: Actually my first implementation of this used a single record id. What I did was add a bit to indicate whether there was any profile data, so that the reader knew how to interpret the array data at the end (as either a list of just callee ids, or <callee id, count> pairs). I think I used the size of the record to deduce whether there was any call information. I decided to switch to explicitly encoding this information in the record id because it seemed cleaner and more consistent, and also removed a bit from each record (the profile flag). This reduces the size for large .o bitcode files and for the combined index, although the size for small .o bitcode files is a bit larger due to the 2 extra abbrev ids. I could probably go back to using the size to deduce the presence or absence of call information, and reduce this to 2 per summary type. WDYT?
		FS_PERMODULE_CALLS_PROFILE = 3,
		// COMBINED_NOCALLS: [modid, linkage, instcount]
		FS_COMBINED_NOCALLS = 4,
		// COMBINED_CALLS: [modid, linkage, instcount, n x valueid]
		FS_COMBINED_CALLS = 5,
		// COMBINED_CALLS_PROFILE: [modid, linkage, instcount, n x (valueid, count)]
		FS_COMBINED_CALLS_PROFILE = 6,
		davidxl: A side note -- it might be useful to reserve some bits for general function attributes such as 'address-taken' etc.
		tejohnson: What is the advantage of saving bits now? The format is still highly in flux anyway, and once it is nailed down eventually we would need to use some kind of version id to specify the change regardless of whether it was using saved bits or not.
		davidxl: ok.
};		};
		mehdi_amini: Side note: we will need to nail the format in the coming weeks/months somehow, or have a plan for backward compatibility (auto upgrade...).
		tejohnson: Agreed, although I think we are still in the heavy churn tuning phase.
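
As a rough illustration of the "encode the shape in the record id" choice discussed in the comments above, the sketch below picks one of the three per-module codes from the listing and lays out the record as `[valueid, linkage, instcount, ...]`. The enum values and layouts come from the comments in the diff. The helper names `encodePerModule` and `decodeEdges` are hypothetical and are not functions from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Codes as shown in this diff's FunctionSummarySymtabCodes (per-module only).
enum PerModuleCode {
  FS_PERMODULE_NOCALLS = 1,
  FS_PERMODULE_CALLS = 2,
  FS_PERMODULE_CALLS_PROFILE = 3
};

using Edge = std::pair<unsigned, uint64_t>; // <callee value id, profile count>

// Hypothetical writer-side helper: fills Record and returns the record code.
unsigned encodePerModule(unsigned ValueId, unsigned Linkage, unsigned InstCount,
                         const std::vector<Edge> &Edges, bool HasProfile,
                         std::vector<uint64_t> &Record) {
  Record = {ValueId, Linkage, InstCount};
  if (Edges.empty())
    return FS_PERMODULE_NOCALLS;
  for (const Edge &E : Edges) {
    Record.push_back(E.first);
    if (HasProfile)
      Record.push_back(E.second); // the count is only emitted with profile data
  }
  return HasProfile ? FS_PERMODULE_CALLS_PROFILE : FS_PERMODULE_CALLS;
}

// Hypothetical reader-side helper: the code alone tells the reader the stride.
std::vector<Edge> decodeEdges(unsigned Code, const std::vector<uint64_t> &Record) {
  std::vector<Edge> Edges;
  if (Code == FS_PERMODULE_NOCALLS)
    return Edges;
  const std::size_t Stride = (Code == FS_PERMODULE_CALLS_PROFILE) ? 2 : 1;
  for (std::size_t I = 3; I + Stride <= Record.size(); I += Stride)
    Edges.emplace_back(static_cast<unsigned>(Record[I]),
                       Stride == 2 ? Record[I + 1] : 0);
  return Edges;
}
```

The trade-off matches the one described in the thread: no per-record profile flag is needed, at the cost of more record codes (and abbreviation ids) per summary type.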

enum MetadataCodes {		enum MetadataCodes {
METADATA_STRING = 1, // MDSTRING: [values]		METADATA_STRING = 1, // MDSTRING: [values]
METADATA_VALUE = 2, // VALUE: [type num, value num]		METADATA_VALUE = 2, // VALUE: [type num, value num]
METADATA_NODE = 3, // NODE: [n x md num]		METADATA_NODE = 3, // NODE: [n x md num]
METADATA_NAME = 4, // STRING: [values]		METADATA_NAME = 4, // STRING: [values]
METADATA_DISTINCT_NODE = 5, // DISTINCT_NODE: [n x md num]		METADATA_DISTINCT_NODE = 5, // DISTINCT_NODE: [n x md num]
METADATA_KIND = 6, // [n x [id, name]]		METADATA_KIND = 6, // [n x [id, name]]
[305 unchanged lines hidden]

include/llvm/IR/FunctionInfo.h

[25 unchanged lines hidden]
namespace llvm {		namespace llvm {

/// \brief Function summary information to aid decisions and implementation of		/// \brief Function summary information to aid decisions and implementation of
/// importing.		/// importing.
///		///
/// This is a separate class from FunctionInfo to enable lazy reading of this		/// This is a separate class from FunctionInfo to enable lazy reading of this
/// function summary information from the combined index file during imporing.		/// function summary information from the combined index file during imporing.
class FunctionSummary {		class FunctionSummary {
		public:
		/// <CalleeGUID, ProfileCount> call edge pair.
		typedef std::pair<uint64_t, uint64_t> EdgeTy;

private:		private:
/// \brief Path of module containing function IR, used to locate module when		/// \brief Path of module containing function IR, used to locate module when
/// importing this function.		/// importing this function.
///		///
/// This is only used during parsing of the combined function index, or when		/// This is only used during parsing of the combined function index, or when
/// parsing the per-module index for creation of the combined function index,		/// parsing the per-module index for creation of the combined function index,
/// not during writing of the per-module index which doesn't contain a		/// not during writing of the per-module index which doesn't contain a
/// module path string table.		/// module path string table.
[12 unchanged lines hidden, in private: section]
// is likely to be profitable.		// is likely to be profitable.
// Other information will be added as the importing is tuned, such		// Other information will be added as the importing is tuned, such
// as hotness (when profile available), and other function characteristics.		// as hotness (when profile available), and other function characteristics.

/// Number of instructions (ignoring debug instructions, e.g.) computed		/// Number of instructions (ignoring debug instructions, e.g.) computed
/// during the initial compile step when the function index is first built.		/// during the initial compile step when the function index is first built.
unsigned InstCount;		unsigned InstCount;

		/// List of <CalleeGUID, ProfileCount> call edge pairs from this function.
		std::vector<EdgeTy> CallGraphEdgeList;

public:		public:
/// Construct a summary object from summary data expected for all		/// Summary constructors.
/// summary records.		FunctionSummary() : InstCount(0) {}
FunctionSummary(unsigned NumInsts) : InstCount(NumInsts) {}		FunctionSummary(unsigned NumInsts) : InstCount(NumInsts) {}

		mehdi_amini: Can't/shouldn't be protected?
		tejohnson: Good point, will make protected.
/// Set the path to the module containing this function, for use in		/// Set the path to the module containing this function, for use in
/// the combined index.		/// the combined index.
void setModulePath(StringRef ModPath) { ModulePath = ModPath; }		void setModulePath(StringRef ModPath) { ModulePath = ModPath; }

/// Get the path to the module containing this function.		/// Get the path to the module containing this function.
StringRef modulePath() const { return ModulePath; }		StringRef modulePath() const { return ModulePath; }

/// Record linkage type.		/// Record linkage type.
void setFunctionLinkage(GlobalValue::LinkageTypes Linkage) {		void setFunctionLinkage(GlobalValue::LinkageTypes Linkage) {
FunctionLinkage = Linkage;		FunctionLinkage = Linkage;
}		}

/// Return linkage type recorded for this function.		/// Return linkage type recorded for this function.
GlobalValue::LinkageTypes getFunctionLinkage() const {		GlobalValue::LinkageTypes getFunctionLinkage() const {
return FunctionLinkage;		return FunctionLinkage;
}		}

		/// Set the instruction count for this function.
		mehdi_amini: why `to be written to...`?
		tejohnson: Good catch, this was left over from the prior patch when it was part of a bitcode writer helper; I didn't update it after moving it here. Will fix.
		void setInstCount(unsigned NumInsts) { InstCount = NumInsts; }

/// Get the instruction count recorded for this function.		/// Get the instruction count recorded for this function.
unsigned instCount() const { return InstCount; }		unsigned instCount() const { return InstCount; }

		/// Record a call graph edge from this function to the function identified
		davidxl: Without profile data, it might be useful to record the number of static callsites from a function to the callee. This can help the compiler backend make better global decisions later (e.g. better inlining and enabling more function GC).
		tejohnson: I could do that by overloading the count field to hold the static number of callsites. Alternatively, for statically-generated profile information is it useful to record the block frequency sums or something like that?
		davidxl: Regarding static # of callsites -- another way is to build a multi-graph -- basically do not collapse all edges from one caller to the same callee. How much space do we save there? If we want to save space, there is a better way to do this -- we can build a graph with edges from module to the callee. Such a module->function edge contains the following information: { Aggregate call count, max call count, static # of sites }. It is probably not useful to record static block frequency.
		davidxl: My suggestion regarding using a module->function edge cannot replace the aggregate function->function call edge here -- but it can be emitted as additional information to track static callsite info -- but that is independent of what this patch does.
		tejohnson: Right, we still want the function->function edges for better precision. Will consider module->function edges at a later point. I will need to do some measurements to see what the overhead is of not collapsing the edges within each function. But if we simply add a static callsite count (possibly both with and without PGO), I think that may be sufficient.
		mehdi_amini: (quoting davidxl: "Without profile data, it might be useful to record the number of static callsites from a function to the callee. This can help compiler backend to make better global decisions later (e.g. better inlining and enable more function GC).") Can you elaborate on how you would use this information and how it would help inlining and GC?
		tejohnson: I think what David was alluding to was similar to what I was describing on IRC yesterday - if you know you have a single static callsite via the summary, you can give the same inline cost benefit currently given for internal(ized) single-callsite functions. Then if it is inlined, hopefully the original out-of-line function can be eliminated via linker GC. To do this, the combined index creation step would need to aggregate the # static callsites to each function and put that into the callee function's combined summary.
		/// by \p CalleeGUID, with cumulative profile count (across all calls from
		/// this function) \p CalleeCount, or 0 if no FDO.
		mehdi_amini: I think the usual terminology in LLVM is "PGO" (I had to google FDO) (same below)
		tejohnson: Will fix...bummer that every compiler I have worked on seems to use a different term! PGO/FDO/PBO...bah.
		void addCallGraphEdge(uint64_t CalleeGUID, uint64_t CalleeCount) {
		CallGraphEdgeList.push_back(std::make_pair(CalleeGUID, CalleeCount));
		}

		/// Return the list of <CalleeGUID, ProfileCount> pairs.
		mehdi_amini: `F`? Also again the "to be written to" seems off to me.
		tejohnson: Same issue as above, stale comment, will update.
		std::vector<EdgeTy> &edges() { return CallGraphEdgeList; }
		mehdi_amini: Missing const variant.
		tejohnson: Adding that now.
};		};
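
The static-callsite idea raised in the inline comments on addCallGraphEdge (aggregate the number of static callsites per callee while building the combined index, then treat single-callsite functions favorably) could look roughly like the sketch below. Everything here is hypothetical: `CallEdge`, `aggregateStaticCallsites`, and `hasSingleStaticCallsite` are illustrative names, and the patch as shown does not implement this.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical per-function call edge carrying a static callsite count.
struct CallEdge {
  uint64_t CalleeGUID;
  unsigned StaticCallsites;
};

// Sum static callsites per callee across the edge lists of all functions
// (for example, across all summaries merged into a combined index).
std::map<uint64_t, unsigned> aggregateStaticCallsites(
    const std::vector<std::vector<CallEdge>> &PerFunctionEdges) {
  std::map<uint64_t, unsigned> Total;
  for (const auto &Edges : PerFunctionEdges)
    for (const CallEdge &E : Edges)
      Total[E.CalleeGUID] += E.StaticCallsites;
  return Total;
}

// A callee with exactly one static callsite program-wide is a candidate for
// the same inline bonus that internal single-callsite functions receive.
bool hasSingleStaticCallsite(const std::map<uint64_t, unsigned> &Total,
                             uint64_t GUID) {
  auto It = Total.find(GUID);
  return It != Total.end() && It->second == 1;
}
```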

/// \brief Class to hold pointer to function summary and information required		/// \brief Class to hold pointer to function summary and information required
/// for parsing it.		/// for parsing it.
///		///
/// For the per-module index, this holds the bitcode offset		/// For the per-module index, this holds the bitcode offset
/// of the corresponding function block. For the combined index,		/// of the corresponding function block. For the combined index,
/// after parsing of the \a ValueSymbolTable, this initially		/// after parsing of the \a ValueSymbolTable, this initially
[24 unchanged lines hidden, in public: section]
/// Constructor used during parsing of VST entries.		/// Constructor used during parsing of VST entries.
FunctionInfo(uint64_t FuncOffset)		FunctionInfo(uint64_t FuncOffset)
: Summary(nullptr), BitcodeIndex(FuncOffset) {}		: Summary(nullptr), BitcodeIndex(FuncOffset) {}

/// Constructor used for per-module index bitcode writing.		/// Constructor used for per-module index bitcode writing.
FunctionInfo(uint64_t FuncOffset,		FunctionInfo(uint64_t FuncOffset,
std::unique_ptr<FunctionSummary> FuncSummary)		std::unique_ptr<FunctionSummary> FuncSummary)
: Summary(std::move(FuncSummary)), BitcodeIndex(FuncOffset) {}		: Summary(std::move(FuncSummary)), BitcodeIndex(FuncOffset) {}

		mehdi_amini: Can't you have a single ctor? `GlobalValueInfo(uint64_t Offset = 0, std::unique_ptr<GlobalValueSummary> Summary = nullptr)`
		tejohnson: Yes, will fix.
/// Record the function summary information parsed out of the function		/// Record the function summary information parsed out of the function
/// summary block during parsing or combined index creation.		/// summary block during parsing or combined index creation.
void setFunctionSummary(std::unique_ptr<FunctionSummary> FuncSummary) {		void setFunctionSummary(std::unique_ptr<FunctionSummary> FuncSummary) {
Summary = std::move(FuncSummary);		Summary = std::move(FuncSummary);
}		}

/// Get the function summary recorded for this function.		/// Get the function summary recorded for this function.
FunctionSummary *functionSummary() const { return Summary.get(); }		FunctionSummary *functionSummary() const { return Summary.get(); }
[109 unchanged lines hidden, in public: section]
/// Check if the given Module has any functions available for exporting		/// Check if the given Module has any functions available for exporting
/// in the index. We consider any module present in the ModulePathStringTable		/// in the index. We consider any module present in the ModulePathStringTable
/// to have exported functions.		/// to have exported functions.
bool hasExportedFunctions(const Module &M) const {		bool hasExportedFunctions(const Module &M) const {
return ModulePathStringTable.count(M.getModuleIdentifier());		return ModulePathStringTable.count(M.getModuleIdentifier());
}		}
};		};

		/// Helper class for reading and writing the function summary in bitcode.
		/// Specifically, the call graph edges in the summary bitcode section
		/// reference the callees by ValueId. However, in the function index in
		/// memory we want these referenced by GUID. Rather than bloat the
		/// FunctionSummary class with an additional map, we use this class
		/// to temporarily hold the ValueId representation.
		class FunctionSummaryIOHelper {
		public:
		/// <CalleeValueId, ProfileCount> call edge pair.
		typedef std::pair<unsigned, uint64_t> EdgeTy;

		private:
		/// The FunctionSummary object being built during bitcode reading,
		/// or built during the per-module bitcode write process.
		std::unique_ptr<FunctionSummary> Summary;
		davidxl: Is this member really needed? The helper is not intended to own the summary.
		tejohnson: The helper does own the function summary temporarily. We create it when the function summary section is read, then later transfer ownership to the index when the VST is read. The reason why I need to keep a non-owning pointer to the summary is that later on after all VST entries are read from the combined index I need to set up the call graph edges in the combined index, and for that I need to keep the association here between the function summary and the CallGraphEdgeValueIdList.
		davidxl: owns it temporarily in what sense? Does it have a chance to de-allocate it?
		tejohnson: It passes ownership to the function index in memory when the corresponding VST entry is created. After all the VST entries are created, it uses the saved (non-owned) pointer to create the call graph edges in the function index in memory (need to wait until all the VST entries are parsed as we need to know all the value ids and GUIDs). See FunctionIndexBitcodeReader::parseValueSymbolTable (note the function return is in the EndBlock case in the switch statement, not at the bottom of the routine). The other alternative would be to keep a mapping there from the FunctionSummaryIOHelper object to the corresponding FunctionSummary pointer (created when we parse the corresponding VST entry and transfer the ownership of the unique_ptr to the function index). I think that is probably cleaner and more straightforward. Will make that change. As an aside, I just noticed that the ValueIdToCallGraphGUIDMap is only used in this routine; I will move it to function local instead of being on the FunctionIndexBitcodeReader class.
		davidxl: Right -- if the IO helper class only conditionally passes the ownership to functionIndex in some cases, the unique pointer is needed. However if it is intended to be always passed to another object, the helper just needs to keep a naked pointer to the summary as it does not really own it.
		tejohnson: In the BitcodeWriter it stays owned by the helper. In the BitcodeReader ownership is always transferred to the in-memory function index. Even in the latter case, I think creating a unique_ptr immediately, rather than creating a naked pointer and only creating a unique_ptr when the VST is read, is cleaner as it makes the current ownership clear and avoids malloc.
		/// List of <CalleeValueId, ProfileCount> call edge pairs for the function.
		std::vector<EdgeTy> CallGraphEdgeValueIdList;

		public:
		FunctionSummaryIOHelper() {}
		mehdi_amini: Why this? If needed, replace with `FunctionSummaryIOHelper() = default;`
		tejohnson: This was left over from when I had a member variable that was being passed in and initialized in the constructor. Removing this now.

		/// Save the new function summary.
		void setFunctionSummary(std::unique_ptr<FunctionSummary> FuncSummary) {
		Summary = std::move(FuncSummary);
		}
		/// Get ownership of the function summary unique_ptr.
		std::unique_ptr<FunctionSummary> getFunctionSummary() {
		mehdi_amini: I think we usually prefix `takeXXX` and not `getXXX` when ownership is transferred.
		tejohnson: Ok, will fix.
		return std::move(Summary);
		}

		/// Access the function summary pointer.
		FunctionSummary *functionSummary() { return Summary.get(); }

		/// Record a call graph edge from this function to the function identified
		/// by \p CalleeValueId, with cumulative profile count (across all calls from
		/// this function) \p CalleeCount, or 0 if no FDO.
		void addCallGraphEdge(unsigned CalleeValueId, uint64_t CalleeCount) {
		CallGraphEdgeValueIdList.push_back(
		std::make_pair(CalleeValueId, CalleeCount));
		}

		/// Return the list of <CalleeValueId, ProfileCount> pairs.
		std::vector<EdgeTy> &edges() { return CallGraphEdgeValueIdList; }
		};
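
The ownership flow debated in the inline comments on this helper (own the summary until the VST entry is seen, hand it to the index, then resolve ValueId edges to GUIDs once every VST entry is known) can be sketched as follows. The types `Summary` and `PendingSummary` and the function `resolveEdges` are simplified, hypothetical stand-ins for FunctionSummary, FunctionSummaryIOHelper, and the reader logic; they are not the classes from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for FunctionSummary: edges keyed by callee GUID.
struct Summary {
  std::vector<std::pair<uint64_t, uint64_t>> Edges; // <CalleeGUID, count>
};

// Stand-in for the IO helper's role in the bitcode reader.
struct PendingSummary {
  std::unique_ptr<Summary> Owned; // owned until the VST entry is parsed
  Summary *Raw = nullptr;         // non-owning, remains usable after take()
  std::vector<std::pair<unsigned, uint64_t>> ValueIdEdges; // <ValueId, count>

  // Transfer ownership (e.g. to the in-memory index), remembering the pointer.
  std::unique_ptr<Summary> take() {
    Raw = Owned.get();
    return std::move(Owned);
  }
};

// Once all VST entries are parsed, rewrite ValueId edges as GUID edges.
void resolveEdges(PendingSummary &P,
                  const std::map<unsigned, uint64_t> &ValueIdToGUID) {
  assert(P.Raw && "summary must be handed to the index before resolving");
  for (const auto &E : P.ValueIdEdges) {
    auto It = ValueIdToGUID.find(E.first);
    if (It == ValueIdToGUID.end())
      continue; // unknown value id: skipped in this sketch (an assumption)
    P.Raw->Edges.emplace_back(It->second, E.second);
  }
}
```

The non-owning pointer is only safe while the new owner keeps the summary alive, which in the reader corresponds to the in-memory index outliving the VST parse.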

} // End llvm namespace		} // End llvm namespace

#endif		#endif

include/llvm/ProfileData/ProfileCommon.h

	//===-- ProfileCommon.h - Common profiling APIs. ----------------- C++ --===//			//===-- ProfileCommon.h - Common profiling APIs. ----------------- C++ --===//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// This file contains data structures and functions common to both instrumented			// This file contains data structures and functions common to both instrumented
	// and sample profiling.			// and sample profiling.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

				#include "llvm/Support/BlockFrequency.h"
				#include "llvm/Support/BranchProbability.h"
	#include <cstdint>			#include <cstdint>
	#include <functional>			#include <functional>
	#include <map>			#include <map>
	#include <vector>			#include <vector>

	#ifndef LLVM_PROFILEDATA_PROFILE_COMMON_H			#ifndef LLVM_PROFILEDATA_PROFILE_COMMON_H
	#define LLVM_PROFILEDATA_PROFILE_COMMON_H			#define LLVM_PROFILEDATA_PROFILE_COMMON_H

	[59 unchanged lines hidden]
	}			}

	std::vector<ProfileSummaryEntry> &ProfileSummary::getDetailedSummary() {			std::vector<ProfileSummaryEntry> &ProfileSummary::getDetailedSummary() {
	if (!DetailedSummaryCutoffs.empty() && DetailedSummary.empty())			if (!DetailedSummaryCutoffs.empty() && DetailedSummary.empty())
	computeDetailedSummary();			computeDetailedSummary();
	return DetailedSummary;			return DetailedSummary;
	}			}

				/// Helper to compute the profile count for a block, based on the
				/// ratio of its frequency to the entry block frequency, multiplied
				/// by the entry block count.
				inline uint64_t getBlockProfileCount(uint64_t BlockFreq, uint64_t EntryFreq,
				uint64_t EntryCount) {
				auto ScaledCount = BlockFrequency(EntryCount);
				if (EntryFreq > UINT32_MAX || BlockFreq > UINT32_MAX)
				eraman: Why not just do ScaledCount = EntryCount * BlockFreq / EntryFreq using APInt<128>, take the lower 64 bits if the result fits within uint64_t and use UINT64_MAX otherwise?
				tejohnson: I looked at APInt briefly when I was thinking about how to do this, but was concerned that it was going to be inefficient in the common case. However, it looks like that isn't the case as it has a fast path. Will change this to use it. Note I actually stole this code from RAGreedy::initializeCSRCost() (which could probably be refactored to use this new method once it goes in).
				// Can't use BranchProbability in general, since it takes 32-bit numbers.
				ScaledCount = ScaledCount.getFrequency() * (BlockFreq / EntryFreq);
				else if (BlockFreq < EntryFreq)
				ScaledCount *= BranchProbability(BlockFreq, EntryFreq);
				else
				// Invert the fraction and divide.
				ScaledCount /= BranchProbability(EntryFreq, BlockFreq);

				return ScaledCount.getFrequency();
				}

	} // end namespace llvm			} // end namespace llvm
	#endif			#endif

lib/Bitcode/Reader/BitcodeReader.cpp

[407 unchanged lines hidden, in private: section]
std::error_code initLazyStream(std::unique_ptr<DataStreamer> Streamer);		std::error_code initLazyStream(std::unique_ptr<DataStreamer> Streamer);
std::error_code findFunctionInStream(		std::error_code findFunctionInStream(
Function *F,		Function *F,
DenseMap<Function *, uint64_t>::iterator DeferredFunctionInfoIterator);		DenseMap<Function *, uint64_t>::iterator DeferredFunctionInfoIterator);
};		};

/// Class to manage reading and parsing function summary index bitcode		/// Class to manage reading and parsing function summary index bitcode
/// files/sections.		/// files/sections.
class FunctionIndexBitcodeReader {		class FunctionIndexBitcodeReader {
		mehdi_amini: I guess more renaming could have been done here, right? (I really don't mind, but since you cared to rename many places below...)
		tejohnson (author): Yep, this was part of the TODO list in this patch update. Working on that now. Specifically, I'm changing FunctionIndex* and FunctionInfoIndex* to ModuleSummaryIndex. I left this for a follow-on update because the renaming was already pretty extensive, and this particular change bleeds into some of the interfaces and variable/field names used in other parts of the compiler (clang etc).
		mehdi_amini: Fine with me.
DiagnosticHandlerFunction DiagnosticHandler;		DiagnosticHandlerFunction DiagnosticHandler;

/// Eventually points to the function index built during parsing.		/// Eventually points to the function index built during parsing.
FunctionInfoIndex *TheIndex = nullptr;		FunctionInfoIndex *TheIndex = nullptr;

std::unique_ptr<MemoryBuffer> Buffer;		std::unique_ptr<MemoryBuffer> Buffer;
std::unique_ptr<BitstreamReader> StreamFile;		std::unique_ptr<BitstreamReader> StreamFile;
BitstreamCursor Stream;		BitstreamCursor Stream;
Show All 20 Lines	class FunctionIndexBitcodeReader {
/// consumed during ValueSymbolTable parsing.		/// consumed during ValueSymbolTable parsing.
///		///
/// Used to correlate summary records with VST entries. For the per-module		/// Used to correlate summary records with VST entries. For the per-module
/// index this maps the ValueID to the parsed function summary, and		/// index this maps the ValueID to the parsed function summary, and
/// for the combined index this maps the summary record's bitcode		/// for the combined index this maps the summary record's bitcode
/// offset to the function summary (since in the combined index the		/// offset to the function summary (since in the combined index the
/// VST records do not hold value IDs but rather hold the function		/// VST records do not hold value IDs but rather hold the function
/// summary record offset).		/// summary record offset).
DenseMap<uint64_t, std::unique_ptr<FunctionSummary>> SummaryMap;		std::map<uint64_t, FunctionSummaryIOHelper> SummaryMap;

		mehdi_amini: doc
		tejohnson (author): Will do.
/// Map populated during module path string table parsing, from the		/// Map populated during module path string table parsing, from the
/// module ID to a string reference owned by the index's module		/// module ID to a string reference owned by the index's module
/// path string table, used to correlate with combined index function		/// path string table, used to correlate with combined index function
/// summary records.		/// summary records.
DenseMap<uint64_t, StringRef> ModuleIdMap;		DenseMap<uint64_t, StringRef> ModuleIdMap;

/// Original source file name recorded in a bitcode record.		/// Original source file name recorded in a bitcode record.
std::string SourceFileName;		std::string SourceFileName;
▲ Show 20 Lines • Show All 4,969 Lines • ▼ Show 20 Lines
// the function block's bitcode offset as well as the offset into the		// the function block's bitcode offset as well as the offset into the
// function summary section.		// function summary section.
std::error_code FunctionIndexBitcodeReader::parseValueSymbolTable() {		std::error_code FunctionIndexBitcodeReader::parseValueSymbolTable() {
if (Stream.EnterSubBlock(bitc::VALUE_SYMTAB_BLOCK_ID))		if (Stream.EnterSubBlock(bitc::VALUE_SYMTAB_BLOCK_ID))
return error("Invalid record");		return error("Invalid record");

SmallVector<uint64_t, 64> Record;		SmallVector<uint64_t, 64> Record;

		// Map to save ValueId to GUID association that was recorded in the
		// ValueSymbolTable. It is used after the VST is parsed to convert
		// call graph edges read from the function summary from referencing
		// callees by their ValueId to using the GUID instead, which is how
		// they are recorded in the function index being built.
		DenseMap<unsigned, uint64_t> ValueIdToCallGraphGUIDMap;

		// Map to keep track of which helper object was associated with which
		// function summary, used when we transfer ownership of summary to
		// the index, because we need this information for later creation
		// of call graph edges in the index.
		DenseMap<FunctionSummaryIOHelper *, FunctionSummary *> HelperToFuncSummaryMap;

// Read all the records for this value table.		// Read all the records for this value table.
SmallString<128> ValueName;		SmallString<128> ValueName;
while (1) {		while (1) {
BitstreamEntry Entry = Stream.advanceSkippingSubblocks();		BitstreamEntry Entry = Stream.advanceSkippingSubblocks();

switch (Entry.Kind) {		switch (Entry.Kind) {
case BitstreamEntry::SubBlock: // Handled for us already.		case BitstreamEntry::SubBlock: // Handled for us already.
case BitstreamEntry::Error:		case BitstreamEntry::Error:
return error("Malformed block");		return error("Malformed block");
case BitstreamEntry::EndBlock:		case BitstreamEntry::EndBlock: {
		// We now have saved the ValueId to GUID mapping for all functions in the
		// VST. For the non-lazy function summary parsing case, we can convert the
		// call graph edges from using the Callee ValueId to instead using the
		// GUID and record them in the index. In the lazy summary parsing combined
		// function case, we can perform this mapping as the summaries are parsed.
		if (foundFuncSummary() && !IsLazy) {
		for (auto &SMI : SummaryMap) {
		// Walk over the call edges in this entry and add them to the
		// function summary.
		for (auto &EI : SMI.second.edges()) {
		auto CGI = ValueIdToCallGraphGUIDMap.find(EI.first);
		// For the per-module case all edges should have a corresponding
		// declaration in the VST, so we should always find an entry.
		// Similarly, when writing the combined index bitcode we filtered
		// out any edges to functions that weren't in the index, and all
		// combined VST entries were added to the map.
		assert(CGI != ValueIdToCallGraphGUIDMap.end());
		// Add the edge to the summary now, using the GUID.
		auto FSI = HelperToFuncSummaryMap.find(&SMI.second);
		assert(FSI != HelperToFuncSummaryMap.end());
		FSI->second->addCallGraphEdge(CGI->second, EI.second);
		}
		}
		}
return std::error_code();		return std::error_code();
		}
case BitstreamEntry::Record:		case BitstreamEntry::Record:
// The interesting case.		// The interesting case.
break;		break;
}		}

// Read a record.		// Read a record.
Record.clear();		Record.clear();
switch (Stream.readRecord(Entry.ID, Record)) {		switch (Stream.readRecord(Entry.ID, Record)) {
default: // Default behavior: ignore (e.g. VST_CODE_BBENTRY records).		default: // Default behavior: ignore (e.g. VST_CODE_BBENTRY records).
break;		break;
		case bitc::VST_CODE_ENTRY: { // VST_CODE_ENTRY: [valueid, namechar x N]
		if (convertToString(Record, 1, ValueName))
		return error("Invalid record");
		unsigned ValueID = Record[0];
		ValueIdToCallGraphGUIDMap[ValueID] = Function::getGUID(ValueName);
		ValueName.clear();
		break;
		}
case bitc::VST_CODE_FNENTRY: {		case bitc::VST_CODE_FNENTRY: {
// VST_CODE_FNENTRY: [valueid, offset, namechar x N]		// VST_CODE_FNENTRY: [valueid, offset, namechar x N]
if (convertToString(Record, 2, ValueName))		if (convertToString(Record, 2, ValueName))
return error("Invalid record");		return error("Invalid record");
unsigned ValueID = Record[0];		unsigned ValueID = Record[0];
uint64_t FuncOffset = Record[1];		uint64_t FuncOffset = Record[1];
assert(!IsLazy && "Lazy summary read only supported for combined index");		assert(!IsLazy && "Lazy summary read only supported for combined index");
// Gracefully handle bitcode without a function summary section,		// Gracefully handle bitcode without a function summary section,
// which will simply not populate the index.		// which will simply not populate the index.
if (foundFuncSummary()) {		if (foundFuncSummary()) {
DenseMap<uint64_t, std::unique_ptr<FunctionSummary>>::iterator SMI =		auto SMI = SummaryMap.find(ValueID);
SummaryMap.find(ValueID);
assert(SMI != SummaryMap.end() && "Summary info not found");		assert(SMI != SummaryMap.end() && "Summary info not found");
std::unique_ptr<FunctionInfo> FuncInfo =		std::unique_ptr<FunctionInfo> FuncInfo =
llvm::make_unique<FunctionInfo>(FuncOffset);		llvm::make_unique<FunctionInfo>(FuncOffset);
FuncInfo->setFunctionSummary(std::move(SMI->second));		FuncInfo->setFunctionSummary(SMI->second.getFunctionSummary());
		HelperToFuncSummaryMap[&SMI->second] = FuncInfo->functionSummary();
assert(!SourceFileName.empty());		assert(!SourceFileName.empty());
std::string FunctionGlobalId = Function::getGlobalIdentifier(		std::string FunctionGlobalId = Function::getGlobalIdentifier(
ValueName, FuncInfo->functionSummary()->getFunctionLinkage(),		ValueName, FuncInfo->functionSummary()->getFunctionLinkage(),
SourceFileName);		SourceFileName);
TheIndex->addFunctionInfo(FunctionGlobalId, std::move(FuncInfo));		TheIndex->addFunctionInfo(FunctionGlobalId, std::move(FuncInfo));
		ValueIdToCallGraphGUIDMap[ValueID] = Function::getGUID(ValueName);
		mehdi_amini: Shouldn't this be `ValueIdToCallGraphGUIDMap[ValueID] = Function::getGUID(FunctionGlobalId);`?
		tejohnson (author): Good catch! Yes, it should be. The VST_CODE_ENTRY case does not need to invoke getGlobalIdentifier since those are externally-defined functions (and so not possibly local). But looking at the other places here where we set up this map, I realized that the VST_CODE_COMBINED_FNENTRY case was wrong: we already have the GUID and there is no ValueName in that record. I have a fix for both coming up shortly.
}		}

ValueName.clear();		ValueName.clear();
break;		break;
}		}
case bitc::VST_CODE_COMBINED_FNENTRY: {		case bitc::VST_CODE_COMBINED_FNENTRY: {
// VST_CODE_COMBINED_FNENTRY: [offset, funcguid]		// VST_CODE_COMBINED_FNENTRY: [valueid, offset, funcguid]
uint64_t FuncSummaryOffset = Record[0];		unsigned ValueID = Record[0];
uint64_t FuncGUID = Record[1];		uint64_t FuncSummaryOffset = Record[1];
		uint64_t FuncGUID = Record[2];
std::unique_ptr<FunctionInfo> FuncInfo =		std::unique_ptr<FunctionInfo> FuncInfo =
llvm::make_unique<FunctionInfo>(FuncSummaryOffset);		llvm::make_unique<FunctionInfo>(FuncSummaryOffset);
if (foundFuncSummary() && !IsLazy) {		if (foundFuncSummary() && !IsLazy) {
DenseMap<uint64_t, std::unique_ptr<FunctionSummary>>::iterator SMI =		auto SMI = SummaryMap.find(FuncSummaryOffset);
SummaryMap.find(FuncSummaryOffset);
assert(SMI != SummaryMap.end() && "Summary info not found");		assert(SMI != SummaryMap.end() && "Summary info not found");
FuncInfo->setFunctionSummary(std::move(SMI->second));		FuncInfo->setFunctionSummary(SMI->second.getFunctionSummary());
		HelperToFuncSummaryMap[&SMI->second] = FuncInfo->functionSummary();
}		}
TheIndex->addFunctionInfo(FuncGUID, std::move(FuncInfo));		TheIndex->addFunctionInfo(FuncGUID, std::move(FuncInfo));
		ValueIdToCallGraphGUIDMap[ValueID] = Function::getGUID(ValueName);

ValueName.clear();		ValueName.clear();
break;		break;
}		}
}		}
}		}
}		}
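
The two-phase scheme used by parseValueSymbolTable above (save call edges keyed by callee ValueId while reading summary records, then rewrite them to GUIDs once the whole VST has been seen) can be sketched in isolation. The types and names below (`PendingSummary`, `resolveEdges`) are hypothetical stand-ins built on the standard library, not the `FunctionSummaryIOHelper` or `FunctionSummary` classes from the patch, and the sketch returns `false` where the patch asserts.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a summary whose edges are not yet resolved.
struct PendingSummary {
  // Edges as read from the summary record: <callee ValueId, profile count>.
  std::vector<std::pair<unsigned, uint64_t>> EdgesByValueId;
  // Edges after resolution: <callee GUID, profile count>.
  std::vector<std::pair<uint64_t, uint64_t>> EdgesByGUID;
};

// Called once the entire VST has been read, so every ValueId that can be
// the target of an edge should already be present in ValueIdToGUID.
// Returns false as soon as an edge refers to an unknown ValueId; summaries
// visited before that point have already been converted.
inline bool resolveEdges(std::map<uint64_t, PendingSummary> &Summaries,
                         const std::map<unsigned, uint64_t> &ValueIdToGUID) {
  for (auto &Entry : Summaries) {
    PendingSummary &S = Entry.second;
    for (const auto &Edge : S.EdgesByValueId) {
      auto It = ValueIdToGUID.find(Edge.first);
      if (It == ValueIdToGUID.end())
        return false;
      S.EdgesByGUID.emplace_back(It->second, Edge.second);
    }
    S.EdgesByValueId.clear();
  }
  return true;
}
```

Deferring the conversion matters because a summary record may name a callee whose VST entry has not been parsed yet, so the ValueId-to-GUID map is only complete at the end of the block.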

▲ Show 20 Lines • Show All 111 Lines • ▼ Show 20 Lines	while (1) {
// is a per-module index or a combined index file. In the per-module		// is a per-module index or a combined index file. In the per-module
// case the records contain the associated value's ID for correlation		// case the records contain the associated value's ID for correlation
// with VST entries. In the combined index the correlation is done		// with VST entries. In the combined index the correlation is done
// via the bitcode offset of the summary records (which were saved		// via the bitcode offset of the summary records (which were saved
// in the combined index VST entries). The records also contain		// in the combined index VST entries). The records also contain
// information used for ThinLTO renaming and importing.		// information used for ThinLTO renaming and importing.
Record.clear();		Record.clear();
uint64_t CurRecordBit = Stream.GetCurrentBitNo();		uint64_t CurRecordBit = Stream.GetCurrentBitNo();
switch (Stream.readRecord(Entry.ID, Record)) {		auto BitCode = Stream.readRecord(Entry.ID, Record);
		switch (BitCode) {
default: // Default behavior: ignore.		default: // Default behavior: ignore.
break;		break;
// FS_PERMODULE_ENTRY: [valueid, linkage, instcount]		// FS_PERMODULE_NOCALLS: [valueid, linkage, instcount]
case bitc::FS_CODE_PERMODULE_ENTRY: {		// FS_PERMODULE_CALLS: [valueid, linkage, instcount, n x valueid]
		// FS_PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
		// n x (valueid, count)]
		case bitc::FS_PERMODULE_NOCALLS:
		case bitc::FS_PERMODULE_CALLS:
		case bitc::FS_PERMODULE_CALLS_PROFILE: {
unsigned ValueID = Record[0];		unsigned ValueID = Record[0];
uint64_t RawLinkage = Record[1];		uint64_t RawLinkage = Record[1];
unsigned InstCount = Record[2];		unsigned InstCount = Record[2];
std::unique_ptr<FunctionSummary> FS =		std::unique_ptr<FunctionSummary> FS =
llvm::make_unique<FunctionSummary>(InstCount);		llvm::make_unique<FunctionSummary>(InstCount);
FS->setFunctionLinkage(getDecodedLinkage(RawLinkage));		FS->setFunctionLinkage(getDecodedLinkage(RawLinkage));
// The module path string ref set in the summary must be owned by the		// The module path string ref set in the summary must be owned by the
// index's module string table. Since we don't have a module path		// index's module string table. Since we don't have a module path
// string table section in the per-module index, we create a single		// string table section in the per-module index, we create a single
// module path string table entry with an empty (0) ID to take		// module path string table entry with an empty (0) ID to take
// ownership.		// ownership.
FS->setModulePath(		FS->setModulePath(
TheIndex->addModulePath(Buffer->getBufferIdentifier(), 0));		TheIndex->addModulePath(Buffer->getBufferIdentifier(), 0));
SummaryMap[ValueID] = std::move(FS);		FunctionSummaryIOHelper &BitcodeSummary = SummaryMap[ValueID];
		BitcodeSummary.setFunctionSummary(std::move(FS));
		bool HasProfile = (BitCode == bitc::FS_PERMODULE_CALLS_PROFILE);
		assert(BitCode != bitc::FS_PERMODULE_NOCALLS || Record.size() < 4);
		// For now save the call graph edges using the recorded ValueId,
		// on the summary helper object. After reading the VST this will
		// be transferred into the function summary in the index, using the
		// callee GUID instead.
		for (unsigned I = 3, E = Record.size(); I != E; ++I) {
		davidxl: Is it better to introduce a symbolic name for the start index instead of the hardcoded 3?
		tejohnson (author): I can do that. There are a lot of hardcoded start indices just like this in the file already, but I agree a symbolic name would be clearer.
		BitcodeSummary.addCallGraphEdge(Record[I],
		HasProfile ? Record[++I] : 0);
		}
		break;
}		}
// FS_COMBINED_ENTRY: [modid, linkage, instcount]		// FS_COMBINED_NOCALLS: [modid, linkage, instcount]
case bitc::FS_CODE_COMBINED_ENTRY: {		// FS_COMBINED_CALLS: [modid, linkage, instcount, n x valueid]
		// FS_COMBINED_CALLS_PROFILE: [modid, linkage, instcount,
		// n x (valueid, count)]
		case bitc::FS_COMBINED_NOCALLS:
		case bitc::FS_COMBINED_CALLS:
		case bitc::FS_COMBINED_CALLS_PROFILE: {
uint64_t ModuleId = Record[0];		uint64_t ModuleId = Record[0];
uint64_t RawLinkage = Record[1];		uint64_t RawLinkage = Record[1];
unsigned InstCount = Record[2];		unsigned InstCount = Record[2];
std::unique_ptr<FunctionSummary> FS =		std::unique_ptr<FunctionSummary> FS =
llvm::make_unique<FunctionSummary>(InstCount);		llvm::make_unique<FunctionSummary>(InstCount);
FS->setFunctionLinkage(getDecodedLinkage(RawLinkage));		FS->setFunctionLinkage(getDecodedLinkage(RawLinkage));
FS->setModulePath(ModuleIdMap[ModuleId]);		FS->setModulePath(ModuleIdMap[ModuleId]);
SummaryMap[CurRecordBit] = std::move(FS);		FunctionSummaryIOHelper &BitcodeSummary = SummaryMap[CurRecordBit];
		BitcodeSummary.setFunctionSummary(std::move(FS));
		bool HasProfile = (BitCode == bitc::FS_COMBINED_CALLS_PROFILE);
		assert(BitCode != bitc::FS_COMBINED_NOCALLS || Record.size() < 4);
		// For now save the call graph edges using the recorded ValueId,
		// on the summary helper object. After reading the VST this will
		// be transferred into the function summary in the index, using the
		// callee GUID instead.
		static int CallGraphEdgeStartIndex = 3;
		for (unsigned I = CallGraphEdgeStartIndex, E = Record.size(); I != E;
		++I) {
		BitcodeSummary.addCallGraphEdge(Record[I],
		HasProfile ? Record[++I] : 0);
		}
		break;
}		}
}		}
}		}
llvm_unreachable("Exit infinite loop");		llvm_unreachable("Exit infinite loop");
}		}
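
The record layouts listed in the comments above (`[valueid, linkage, instcount, n x valueid]` without PGO, `[valueid, linkage, instcount, n x (valueid, count)]` with it) can be decoded with a stride-based loop. The following is a standalone sketch over `std::vector`, not the reader code from the patch: `decodeCallEdges` is a hypothetical name, and the bounds check that drops a trailing unpaired field is an assumption of this sketch rather than behavior of the loop in the diff.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Index of the first call-edge field, after [id, linkage, instcount].
constexpr std::size_t CallGraphEdgeStartIndex = 3;

// Returns <callee ValueId, profile count> pairs. The count is 0 when the
// record carries no profile data. A trailing field without its partner
// (in the profile layout) is ignored instead of being read out of bounds.
inline std::vector<std::pair<uint64_t, uint64_t>>
decodeCallEdges(const std::vector<uint64_t> &Record, bool HasProfile) {
  std::vector<std::pair<uint64_t, uint64_t>> Edges;
  const std::size_t Stride = HasProfile ? 2 : 1;
  for (std::size_t I = CallGraphEdgeStartIndex; I + Stride <= Record.size();
       I += Stride)
    Edges.emplace_back(Record[I], HasProfile ? Record[I + 1] : 0);
  return Edges;
}
```

For example, a profile record `{5, 0, 12, 8, 100, 9, 200}` yields the edges `(8, 100)` and `(9, 200)`, while the same fields read without profile data would be treated as four separate callee ids.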

// Parse the module string table block into the Index.		// Parse the module string table block into the Index.
		mehdi_amini: Mmmmm....
		tejohnson (author): Oops. In my haste to get the renaming done and send this out, I forgot that I planned to add a linkage type to the global var init records. Will add now.
// This populates the ModulePathStringTable map in the index.		// This populates the ModulePathStringTable map in the index.
std::error_code FunctionIndexBitcodeReader::parseModuleStringTable() {		std::error_code FunctionIndexBitcodeReader::parseModuleStringTable() {
if (Stream.EnterSubBlock(bitc::MODULE_STRTAB_BLOCK_ID))		if (Stream.EnterSubBlock(bitc::MODULE_STRTAB_BLOCK_ID))
return error("Invalid record");		return error("Invalid record");

SmallVector<uint64_t, 64> Record;		SmallVector<uint64_t, 64> Record;

SmallString<128> ModulePath;		SmallString<128> ModulePath;
while (1) {		while (1) {
BitstreamEntry Entry = Stream.advanceSkippingSubblocks();		BitstreamEntry Entry = Stream.advanceSkippingSubblocks();
		mehdi_amini: Mmmmm (bis)....
		tejohnson (author): Ditto.

switch (Entry.Kind) {		switch (Entry.Kind) {
case BitstreamEntry::SubBlock: // Handled for us already.		case BitstreamEntry::SubBlock: // Handled for us already.
case BitstreamEntry::Error:		case BitstreamEntry::Error:
return error("Malformed block");		return error("Malformed block");
case BitstreamEntry::EndBlock:		case BitstreamEntry::EndBlock:
return std::error_code();		return std::error_code();
case BitstreamEntry::Record:		case BitstreamEntry::Record:
▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	case BitstreamEntry::Record:
// The expected case.		// The expected case.
break;		break;
}		}

// TODO: Read a record. This interface will be completed when ThinLTO		// TODO: Read a record. This interface will be completed when ThinLTO
// importing is added so that it can be tested.		// importing is added so that it can be tested.
SmallVector<uint64_t, 64> Record;		SmallVector<uint64_t, 64> Record;
switch (Stream.readRecord(Entry.ID, Record)) {		switch (Stream.readRecord(Entry.ID, Record)) {
case bitc::FS_CODE_COMBINED_ENTRY:		case bitc::FS_COMBINED_NOCALLS:
		case bitc::FS_COMBINED_CALLS:
		case bitc::FS_COMBINED_CALLS_PROFILE:
default:		default:
return error("Invalid record");		return error("Invalid record");
}		}

return std::error_code();		return std::error_code();
}		}

std::error_code		std::error_code
▲ Show 20 Lines • Show All 254 Lines • Show Last 20 Lines

lib/Bitcode/Writer/BitcodeWriter.cpp

//===--- Bitcode/Writer/BitcodeWriter.cpp - Bitcode Writer ----------------===//		//===--- Bitcode/Writer/BitcodeWriter.cpp - Bitcode Writer ----------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// Bitcode writer implementation.		// Bitcode writer implementation.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Bitcode/ReaderWriter.h"
#include "ValueEnumerator.h"		#include "ValueEnumerator.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
		#include "llvm/Analysis/BlockFrequencyInfo.h"
		#include "llvm/Analysis/BlockFrequencyInfoImpl.h"
		#include "llvm/Analysis/BranchProbabilityInfo.h"
		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Bitcode/BitstreamWriter.h"		#include "llvm/Bitcode/BitstreamWriter.h"
#include "llvm/Bitcode/LLVMBitCodes.h"		#include "llvm/Bitcode/LLVMBitCodes.h"
		#include "llvm/Bitcode/ReaderWriter.h"
#include "llvm/IR/CallSite.h"		#include "llvm/IR/CallSite.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DebugInfoMetadata.h"		#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
		#include "llvm/IR/Dominators.h"
#include "llvm/IR/InlineAsm.h"		#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/UseListOrder.h"		#include "llvm/IR/UseListOrder.h"
#include "llvm/IR/ValueSymbolTable.h"		#include "llvm/IR/ValueSymbolTable.h"
		#include "llvm/ProfileData/ProfileCommon.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
#include "llvm/Support/Program.h"		#include "llvm/Support/Program.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <cctype>		#include <cctype>
#include <map>		#include <map>
using namespace llvm;		using namespace llvm;

		namespace {
		/// Helper class for writing function summary section. Used to hold information
		/// built while writing the function to bitcode, then later accessed when
		/// writing the function summary and the ValueSymbolTable.
		class FunctionWriteInfo {
		private:
		// Pair holding the bitcode index of the corresponding function block
		// (written to the VST) and the helper object holding other summary
		// information.
		typedef std::pair<uint64_t, std::unique_ptr<FunctionSummaryIOHelper>> ValueTy;
		std::map<const Function *, ValueTy> FunctionMap;
		bool EmitFunctionSummary;

		public:
		FunctionWriteInfo(bool EmitFunctionSummary)
		: EmitFunctionSummary(EmitFunctionSummary) {}

		/// Create an entry in the map for the given function, which is
		/// written to the bitcode index \p FuncOffset.
		void recordFunction(const Function &F, uint64_t FuncOffset) {
		std::unique_ptr<FunctionSummaryIOHelper> BitcodeSummaryInfo;
		// If writing function summary sections, create the helper object and
		// start populating the function summary.
		if (EmitFunctionSummary) {
		BitcodeSummaryInfo = llvm::make_unique<FunctionSummaryIOHelper>();
		std::unique_ptr<FunctionSummary> FuncSummary =
		llvm::make_unique<FunctionSummary>();
		FuncSummary->setFunctionLinkage(F.getLinkage());
		BitcodeSummaryInfo->setFunctionSummary(std::move(FuncSummary));
		}
		FunctionMap[&F] = std::make_pair(FuncOffset, std::move(BitcodeSummaryInfo));
		}

		/// Save the instruction count to be written to the summary section for \p F.
		void setInstCount(const Function &F, unsigned NumInsts) {
		assert(EmitFunctionSummary);
		FunctionMap[&F].second->functionSummary()->setInstCount(NumInsts);
		}

		/// Save the call graph edges to be written to the summary section for \p F.
		/// The edges are pairs of <CalleeValueId, ProfileCount>, where ProfileCount
		/// is 0 when there is no PGO.
		void addCallGraphEdges(const Function &F,
		DenseMap<unsigned, uint64_t> &CallGraphEdges) {
		assert(EmitFunctionSummary);
		for (auto &EI : CallGraphEdges)
		FunctionMap[&F].second->addCallGraphEdge(EI.first, EI.second);
		}

		/// Return the bitcode index where the function block for \p F was written.
		uint64_t getBitcodeIndex(const Function &F) {
		auto FMI = FunctionMap.find(&F);
		assert(FMI != FunctionMap.end());
		return FMI->second.first;
		}

		/// Return the summary helper object for writing the summary section for \p F.
		FunctionSummaryIOHelper &getBitcodeSummary(const Function &F) {
		auto FMI = FunctionMap.find(&F);
		assert(FMI != FunctionMap.end());
		return *FMI->second.second;
		}
		};
		}

/// These are manifest constants used by the bitcode writer. They do not need to		/// These are manifest constants used by the bitcode writer. They do not need to
/// be kept in sync with the reader, but need to be consistent within this file.		/// be kept in sync with the reader, but need to be consistent within this file.
enum {		enum {
// VALUE_SYMTAB_BLOCK abbrev id's.		// VALUE_SYMTAB_BLOCK abbrev id's.
VST_ENTRY_8_ABBREV = bitc::FIRST_APPLICATION_ABBREV,		VST_ENTRY_8_ABBREV = bitc::FIRST_APPLICATION_ABBREV,
VST_ENTRY_7_ABBREV,		VST_ENTRY_7_ABBREV,
VST_ENTRY_6_ABBREV,		VST_ENTRY_6_ABBREV,
VST_BBENTRY_6_ABBREV,		VST_BBENTRY_6_ABBREV,
▲ Show 20 Lines • Show All 792 Lines • ▼ Show 20 Lines	// Emit the module's source file name.

// Emit the finished record.		// Emit the finished record.
Stream.EmitRecord(bitc::MODULE_CODE_SOURCE_FILENAME, Vals, FilenameAbbrev);		Stream.EmitRecord(bitc::MODULE_CODE_SOURCE_FILENAME, Vals, FilenameAbbrev);
Vals.clear();		Vals.clear();
}		}

uint64_t VSTOffsetPlaceholder =		uint64_t VSTOffsetPlaceholder =
WriteValueSymbolTableForwardDecl(M->getValueSymbolTable(), Stream);		WriteValueSymbolTableForwardDecl(M->getValueSymbolTable(), Stream);
return VSTOffsetPlaceholder;		return VSTOffsetPlaceholder;
		mehdi_amini: I'd write it: `if (M->getValueSymbolTable().empty()) return 0; return WriteValueSymbolTableForwardDecl(Stream);`
		tejohnson (author): Good idea, will fix.
		mehdi_amini: Re-reading it, the variable had the nice property of having a name, making it very clear what is returned. If you adopt the above suggested change, I would add a one-line comment along the lines of `// return the VSTOffsetPlaceholder if we have a VST`
		tejohnson (author): Will do.
}		}

static uint64_t GetOptimizationFlags(const Value *V) {		static uint64_t GetOptimizationFlags(const Value *V) {
uint64_t Flags = 0;		uint64_t Flags = 0;

if (const auto *OBO = dyn_cast<OverflowingBinaryOperator>(V)) {		if (const auto *OBO = dyn_cast<OverflowingBinaryOperator>(V)) {
if (OBO->hasNoSignedWrap())		if (OBO->hasNoSignedWrap())
Flags |= 1 << bitc::OBO_NO_SIGNED_WRAP;		Flags |= 1 << bitc::OBO_NO_SIGNED_WRAP;
▲ Show 20 Lines • Show All 1,378 Lines • ▼ Show 20 Lines	case Instruction::VAArg:
break;		break;
}		}

Stream.EmitRecord(Code, Vals, AbbrevToUse);		Stream.EmitRecord(Code, Vals, AbbrevToUse);
Vals.clear();		Vals.clear();
}		}

/// Emit names for globals/functions etc. The VSTOffsetPlaceholder,		/// Emit names for globals/functions etc. The VSTOffsetPlaceholder,
/// BitcodeStartBit and FunctionIndex are only passed for the module-level		/// BitcodeStartBit and FunctionInfo are only passed for the module-level
/// VST, where we are including a function bitcode index and need to		/// VST, where we are including a function bitcode index and need to
/// backpatch the VST forward declaration record.		/// backpatch the VST forward declaration record.
static void WriteValueSymbolTable(		static void WriteValueSymbolTable(const ValueSymbolTable &VST,
const ValueSymbolTable &VST, const ValueEnumerator &VE,		const ValueEnumerator &VE,
BitstreamWriter &Stream, uint64_t VSTOffsetPlaceholder = 0,		BitstreamWriter &Stream,
		uint64_t VSTOffsetPlaceholder = 0,
uint64_t BitcodeStartBit = 0,		uint64_t BitcodeStartBit = 0,
DenseMap<const Function *, std::unique_ptr<FunctionInfo>> *FunctionIndex =		FunctionWriteInfo *FunctionInfo = nullptr) {
nullptr) {
if (VST.empty()) {		if (VST.empty()) {
// WriteValueSymbolTableForwardDecl should have returned early as		// WriteValueSymbolTableForwardDecl should have returned early as
// well. Ensure this handling remains in sync by asserting that		// well. Ensure this handling remains in sync by asserting that
// the placeholder offset is not set.		// the placeholder offset is not set.
assert(VSTOffsetPlaceholder == 0);		assert(VSTOffsetPlaceholder == 0);
return;		return;
}		}

▲ Show 20 Lines • Show All 72 Lines • ▼ Show 20 Lines	for (const ValueName &Name : VST) {
if (isa<BasicBlock>(Name.getValue())) {		if (isa<BasicBlock>(Name.getValue())) {
Code = bitc::VST_CODE_BBENTRY;		Code = bitc::VST_CODE_BBENTRY;
if (Bits == SE_Char6)		if (Bits == SE_Char6)
AbbrevToUse = VST_BBENTRY_6_ABBREV;		AbbrevToUse = VST_BBENTRY_6_ABBREV;
} else if (F && !F->isDeclaration()) {		} else if (F && !F->isDeclaration()) {
// Must be the module-level VST, where we pass in the Index and		// Must be the module-level VST, where we pass in the Index and
// have a VSTOffsetPlaceholder. The function-level VST should not		// have a VSTOffsetPlaceholder. The function-level VST should not
// contain any Function symbols.		// contain any Function symbols.
assert(FunctionIndex);		assert(FunctionInfo);
assert(VSTOffsetPlaceholder > 0);		assert(VSTOffsetPlaceholder > 0);

// Save the word offset of the function (from the start of the		// Save the word offset of the function (from the start of the
// actual bitcode written to the stream).		// actual bitcode written to the stream).
assert(FunctionIndex->count(F) == 1);
mehdi_amini: Why is it no longer valid?
tejohnson (author): Good catch. I think this got removed when I was using a different data structure here for a while, and I missed it when restoring the old approach. Will add back.
uint64_t BitcodeIndex =		uint64_t BitcodeIndex =
(*FunctionIndex)[F]->bitcodeIndex() - BitcodeStartBit;		FunctionInfo->getBitcodeIndex(*F) - BitcodeStartBit;
assert((BitcodeIndex & 31) == 0 && "function block not 32-bit aligned");		assert((BitcodeIndex & 31) == 0 && "function block not 32-bit aligned");
NameVals.push_back(BitcodeIndex / 32);		NameVals.push_back(BitcodeIndex / 32);

Code = bitc::VST_CODE_FNENTRY;		Code = bitc::VST_CODE_FNENTRY;
AbbrevToUse = FnEntry8BitAbbrev;		AbbrevToUse = FnEntry8BitAbbrev;
if (Bits == SE_Char6)		if (Bits == SE_Char6)
AbbrevToUse = FnEntry6BitAbbrev;		AbbrevToUse = FnEntry6BitAbbrev;
else if (Bits == SE_Fixed7)		else if (Bits == SE_Fixed7)
[... 13 unchanged lines hidden ...]	for (const ValueName &Name : VST) {
Stream.EmitRecord(Code, NameVals, AbbrevToUse);		Stream.EmitRecord(Code, NameVals, AbbrevToUse);
NameVals.clear();		NameVals.clear();
}		}
Stream.ExitBlock();		Stream.ExitBlock();
}		}

/// Emit function names and summary offsets for the combined index		/// Emit function names and summary offsets for the combined index
/// used by ThinLTO.		/// used by ThinLTO.
static void WriteCombinedValueSymbolTable(const FunctionInfoIndex &Index,		static void
BitstreamWriter &Stream) {		WriteCombinedValueSymbolTable(const FunctionInfoIndex &Index,
		BitstreamWriter &Stream,
		std::map<uint64_t, unsigned> &GUIDToValueIdMap) {
Stream.EnterSubblock(bitc::VALUE_SYMTAB_BLOCK_ID, 4);		Stream.EnterSubblock(bitc::VALUE_SYMTAB_BLOCK_ID, 4);

BitCodeAbbrev *Abbv = new BitCodeAbbrev();		BitCodeAbbrev *Abbv = new BitCodeAbbrev();
Abbv->Add(BitCodeAbbrevOp(bitc::VST_CODE_COMBINED_FNENTRY));		Abbv->Add(BitCodeAbbrevOp(bitc::VST_CODE_COMBINED_FNENTRY));
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // valueid
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcsumoffset		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcsumoffset
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcguid		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcguid
unsigned FnEntryAbbrev = Stream.EmitAbbrev(Abbv);		unsigned FnEntryAbbrev = Stream.EmitAbbrev(Abbv);

SmallVector<uint64_t, 64> NameVals;		SmallVector<uint64_t, 64> NameVals;

for (const auto &FII : Index) {		for (const auto &FII : Index) {
for (const auto &FI : FII.second) {		for (const auto &FI : FII.second) {
NameVals.push_back(FI->bitcodeIndex());		// VST_CODE_COMBINED_FNENTRY: [valueid, funcsumoffset, funcguid]
		unsigned AbbrevToUse = FnEntryAbbrev;

uint64_t FuncGUID = FII.first;		uint64_t FuncGUID = FII.first;
		const auto &VMI = GUIDToValueIdMap.find(FuncGUID);
		assert(VMI != GUIDToValueIdMap.end());

// VST_CODE_COMBINED_FNENTRY: [funcsumoffset, funcguid]		NameVals.push_back(VMI->second);
unsigned AbbrevToUse = FnEntryAbbrev;		NameVals.push_back(FI->bitcodeIndex());

NameVals.push_back(FuncGUID);		NameVals.push_back(FuncGUID);

// Emit the finished record.		// Emit the finished record.
Stream.EmitRecord(bitc::VST_CODE_COMBINED_FNENTRY, NameVals, AbbrevToUse);		Stream.EmitRecord(bitc::VST_CODE_COMBINED_FNENTRY, NameVals, AbbrevToUse);
NameVals.clear();		NameVals.clear();
}		}
}		}
Stream.ExitBlock();		Stream.ExitBlock();
[... 27 unchanged lines hidden ...]	static void WriteUseListBlock(const Function *F, ValueEnumerator &VE,

Stream.EnterSubblock(bitc::USELIST_BLOCK_ID, 3);		Stream.EnterSubblock(bitc::USELIST_BLOCK_ID, 3);
while (hasMore()) {		while (hasMore()) {
WriteUseList(VE, std::move(VE.UseListOrders.back()), Stream);		WriteUseList(VE, std::move(VE.UseListOrders.back()), Stream);
VE.UseListOrders.pop_back();		VE.UseListOrders.pop_back();
}		}
Stream.ExitBlock();		Stream.ExitBlock();
}		}

/// \brief Save information for the given function into the function index.
///
/// At a minimum this saves the bitcode index of the function record that
/// was just written. However, if we are emitting function summary information,
/// for example for ThinLTO, then a \a FunctionSummary object is created
/// to hold the provided summary information.
static void SaveFunctionInfo(
const Function &F,
DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,
unsigned NumInsts, uint64_t BitcodeIndex, bool EmitFunctionSummary) {
std::unique_ptr<FunctionSummary> FuncSummary;
if (EmitFunctionSummary) {
FuncSummary = llvm::make_unique<FunctionSummary>(NumInsts);
FuncSummary->setFunctionLinkage(F.getLinkage());
}
FunctionIndex[&F] =
llvm::make_unique<FunctionInfo>(BitcodeIndex, std::move(FuncSummary));
}

/// Emit a function body to the module stream.		/// Emit a function body to the module stream.
		mehdi_amini: Mind adding a one line comment?
		tejohnson: Will do.
static void WriteFunction(		static void WriteFunction(const Function &F, const Module *M,
const Function &F, ValueEnumerator &VE, BitstreamWriter &Stream,		ValueEnumerator &VE, BitstreamWriter &Stream,
		davidxl: Should this be called 'getBlockProfileCount'? This utility method belongs in include/llvm/ProfileData/ProfileCommon.h.
		tejohnson: Good idea. Will rename and move.
DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,		FunctionWriteInfo &FunctionInfo,
bool EmitFunctionSummary) {		bool EmitFunctionSummary) {
// Save the bitcode index of the start of this function block for recording		// Save the bitcode index of the start of this function block for recording
// in the VST.		// in the VST.
uint64_t BitcodeIndex = Stream.GetCurrentBitNo();		FunctionInfo.recordFunction(F, Stream.GetCurrentBitNo());

		bool HasProfileData = F.getEntryCount().hasValue();
		std::unique_ptr<BlockFrequencyInfo> BFI;
		if (EmitFunctionSummary && HasProfileData) {
		Function &Func = const_cast<Function &>(F);
		mehdi_amini: Guarantee on the recursion depth?
		tejohnson: Beyond the Visited set check earlier? What would you like to see?
		mehdi_amini: I was not worried about non-termination or correctness, but about a depth that would explode the stack. It was late and I couldn't think clearly enough to find the answer myself. I just got a coffee so I should be able to answer myself now ;) -> For every instruction you will transitively pull in all the operands until you reach a global (or something already seen), i.e. this is a DFS on the SSA graph. If I'm correct, a worklist would probably be appropriate here.
		tejohnson: I wouldn't have thought that an instruction would be large enough to cause stack explosion. But I have no issue with changing this to a worklist iteration instead.
		mehdi_amini: I may misunderstand, but it seems to me that the depth is not the width of a single instruction, but potentially on the order of the number of instructions in the Function. If I'm wrong and you're bounded somehow by the number of operands, then recursion is fine with me.
		tejohnson: If you are right then this needs to change beyond moving to a worklist, as I only intended to capture the references within a single instruction or variable def. The two entry points to this routine are in WriteFunction, where we pass in an Instruction as the User, and in WriteModuleLevelReferences, where we pass in a GlobalVariable (and I had confirmed that the operands of a variable are its initializer). We recursively walk the given User's operands(). I wouldn't have thought that we could jump from a given Instruction to other instructions in the function this way?
		LoopInfo LI{DominatorTree(Func)};
		BranchProbabilityInfo BPI{Func, LI};
		BFI = llvm::make_unique<BlockFrequencyInfo>(Func, BPI, LI);
		}
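
The worklist alternative raised in the thread above can be sketched as follows. This is a minimal illustration, not the patch's code: `Node` is a hypothetical stand-in for an LLVM `User`/`Value`, and `findRefEdgesWorklist` is an invented name. It assumes the walk stops at global values and skips anything already visited, as described in the discussion.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for an LLVM User/Value: a node with operands,
// flagged when it represents a global value.
struct Node {
  bool IsGlobal = false;
  std::vector<const Node *> Operands;
};

// Collect every global reachable through the operands of Root using an
// explicit worklist rather than recursion, so the call stack does not grow
// with the depth of the operand graph.
std::set<const Node *> findRefEdgesWorklist(const Node *Root) {
  std::set<const Node *> Refs;
  std::set<const Node *> Visited;
  std::vector<const Node *> Worklist{Root};
  while (!Worklist.empty()) {
    const Node *Cur = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(Cur).second)
      continue; // Already processed this node.
    for (const Node *Op : Cur->Operands) {
      if (Op->IsGlobal)
        Refs.insert(Op); // Record the global and do not look through it.
      else
        Worklist.push_back(Op);
    }
  }
  return Refs;
}
```

The traversal order differs from the recursive version, but the resulting set is the same because membership in `Refs` does not depend on visiting order.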

Stream.EnterSubblock(bitc::FUNCTION_BLOCK_ID, 4);		Stream.EnterSubblock(bitc::FUNCTION_BLOCK_ID, 4);
VE.incorporateFunction(F);		VE.incorporateFunction(F);

SmallVector<unsigned, 64> Vals;		SmallVector<unsigned, 64> Vals;

// Emit the number of basic blocks, so the reader can create them ahead of		// Emit the number of basic blocks, so the reader can create them ahead of
// time.		// time.
[... 11 unchanged lines hidden ...]	static void WriteFunction(const Function &F, const Module *M,

// Keep a running idea of what the instruction ID is.		// Keep a running idea of what the instruction ID is.
unsigned InstID = CstEnd;		unsigned InstID = CstEnd;

bool NeedsMetadataAttachment = F.hasMetadata();		bool NeedsMetadataAttachment = F.hasMetadata();

DILocation *LastDL = nullptr;		DILocation *LastDL = nullptr;
unsigned NumInsts = 0;		unsigned NumInsts = 0;
		// Map from callee ValueId to profile count. Used to accumulate profile
		// counts for all static calls to a given callee.
		DenseMap<unsigned, uint64_t> CallGraphEdges;

// Finally, emit all the instructions, in order.		// Finally, emit all the instructions, in order.
for (Function::const_iterator BB = F.begin(), E = F.end(); BB != E; ++BB)		for (Function::const_iterator BB = F.begin(), E = F.end(); BB != E; ++BB)
for (BasicBlock::const_iterator I = BB->begin(), E = BB->end();		for (BasicBlock::const_iterator I = BB->begin(), E = BB->end();
I != E; ++I) {		I != E; ++I) {
WriteInstruction(*I, InstID, VE, Stream, Vals);		WriteInstruction(*I, InstID, VE, Stream, Vals);

if (!isa<DbgInfoIntrinsic>(I))		if (!isa<DbgInfoIntrinsic>(I))
++NumInsts;		++NumInsts;

if (!I->getType()->isVoidTy())		if (!I->getType()->isVoidTy())
++InstID;		++InstID;

		if (EmitFunctionSummary && isa<CallInst>(I)) {
		auto CalledFunction = cast<CallInst>(I)->getCalledFunction();
		if (CalledFunction && CalledFunction->hasName() &&
		!CalledFunction->isIntrinsic()) {
		uint64_t ScaledCount = 0;
		mehdi_amini: `if (CS)` could be hoisted out of the loop.
		tejohnson: Yes, missed this when I cleaned things up from when I was initially looking for non-call references inside this loop.
		mehdi_amini: Re-reading, I'm not even sure what this loop on the operand is doing at all! Wouldn't the code do exactly the same if you just removed the loop and turned the test into `if (CS)`?
		tejohnson: Right, this loop is bogus!
		if (HasProfileData)
		ScaledCount = getBlockProfileCount(
		BFI->getBlockFreq(&(*BB)).getFrequency(), BFI->getEntryFreq(),
		F.getEntryCount().getValue());
		unsigned CalleeId = VE.getValueID(
		M->getValueSymbolTable().lookup(CalledFunction->getName()));
		CallGraphEdges[CalleeId] += ScaledCount;
		}
		}
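
The per-callee accumulation in the block above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the patch's implementation: `scaleBlockCount` and `addCall` are hypothetical names, and the formula (block frequency times entry count, divided by entry frequency) is an assumption about what `getBlockProfileCount` computes. The sketch also relies on the GCC/Clang `unsigned __int128` extension to keep the intermediate product from overflowing.

```cpp
#include <cstdint>
#include <map>

// Hypothetical helper: scale a block frequency to a profile count, assuming
// count = BlockFreq * EntryCount / EntryFreq. Saturates at UINT64_MAX.
uint64_t scaleBlockCount(uint64_t BlockFreq, uint64_t EntryFreq,
                         uint64_t EntryCount) {
  if (EntryFreq == 0)
    return 0; // No meaningful scale without an entry frequency.
  unsigned __int128 Wide =
      static_cast<unsigned __int128>(BlockFreq) * EntryCount;
  unsigned __int128 Scaled = Wide / EntryFreq;
  if (Scaled > UINT64_MAX)
    return UINT64_MAX;
  return static_cast<uint64_t>(Scaled);
}

// Hypothetical helper: add one static call site to the edge map. Several
// call sites to the same callee share one entry whose count is the sum.
void addCall(std::map<unsigned, uint64_t> &Edges, unsigned CalleeId,
             uint64_t ScaledCount) {
  Edges[CalleeId] += ScaledCount;
}
```

Without profile data the scaled count stays 0, so the map still records one entry per distinct callee, which matches how the writer later distinguishes the calls-only record from the calls-with-profile record.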

// If the instruction has metadata, write a metadata attachment later.		// If the instruction has metadata, write a metadata attachment later.
NeedsMetadataAttachment |= I->hasMetadataOtherThanDebugLoc();		NeedsMetadataAttachment |= I->hasMetadataOtherThanDebugLoc();
		mehdi_amini: You depend on the order of the calls. If an instruction first adds a call to a function, but a later one references it, you won't remove it from RefEdges. This may be intended: you may want an accurate count of refs + calls, and here you're trying to filter calls out of RefEdges. However, a call instruction could legitimately refer to the function (a call passing a function pointer as an argument, for instance).
		tejohnson: I don't think the order of calls matters? The reference list is populated once outside of this loop. On your second point, it is true that this will cause a function both called and referenced in another way to only be recorded as called by this function. Is it important to list both types of references? I was thinking that it was important for importing needs to distinguish the functions being called from the other non-call references, but that essentially the combination of the two is the full reference set of the function. If I should put it in both places, I will need to figure out how to distinguish the two types of function references in findRefEdges.
		mehdi_amini: I pointed out what I saw as an inconsistency, but I may totally misunderstand what you're trying to do here. So I will assume in the following that RefEdges contains the global values referenced by the current function, and CallGraphEdges contains the list of Functions called by the current function. Do we agree that we should have either of these properties:
		1. A function that is part of CallGraphEdges should not be present in RefEdges even if it is referenced in another way than a call.
		2. A function that is part of CallGraphEdges should also be present in RefEdges if it is referenced in another way than a call.
		What I read from your code right now is:
		3. A function that is part of CallGraphEdges may or may not be present in RefEdges if it is referenced in another way than a call.
		tejohnson: Yep, I was distracted by the bogus loop above. We are collecting this per instruction, so there is an ordering issue, and we are getting #3, which is undesirable. I think the way to fix this is to check for the callsite first, then pass in some info from the callsite to findRefEdges so that the callsite reference is itself ignored. I.e. pass in either the callee GV and skip it to get #1, or pass in the Use to get #2. I saw David's follow-on that he thinks #2 is best. I can try that.
		tejohnson: Actually it turns out to be pretty simple: we only need to pass in the callee GV to findRefEdges to exclude it (because you can't have both a call and a non-call reference to a function in the same instruction, at least I couldn't find a way!). For xalancbmk this fix actually reduced the combined index and .o sizes a bit. It turns out that with the old flow we were inadvertently including intrinsics (which were ignored in the call graph edge list) in the reference list. In the new version of the patch I will upload shortly I've included a new test that checks for both issues (ensuring a function is in both the call list and ref list if it is accessed both ways by the function, and ignoring intrinsics).
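
The resolution reached in this thread (skip the callee when collecting references for a call instruction, so a function lands in both lists only when it is also referenced some other way) can be illustrated with a simplified model. This is a sketch under assumptions, not the patch's code: `Inst`, `Summary`, and `summarize` are hypothetical, values are modelled as plain strings, and intrinsic filtering is left out.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified instruction: the globals it uses as operands
// and, for a direct call, the callee's name (which is also an operand).
struct Inst {
  std::string Callee;                // Empty when not a direct call.
  std::vector<std::string> Operands; // Global operands, callee included.
};

struct Summary {
  std::map<std::string, unsigned> CallGraphEdges; // Callee -> static count.
  std::set<std::string> RefEdges;                 // Non-call references.
};

Summary summarize(const std::vector<Inst> &Insts) {
  Summary S;
  for (const Inst &I : Insts) {
    // Handle the call site first so the callee is known before the
    // operands are scanned.
    if (!I.Callee.empty())
      ++S.CallGraphEdges[I.Callee];
    for (const std::string &Op : I.Operands) {
      if (!I.Callee.empty() && Op == I.Callee)
        continue; // The call itself is not a reference edge.
      S.RefEdges.insert(Op);
    }
  }
  return S;
}
```

Because the exclusion is applied per instruction, the result no longer depends on the order in which calls and other uses of the same function appear.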

// If the instruction has a debug location, emit it.		// If the instruction has a debug location, emit it.
		mehdi_amini: ?
		tejohnson: More cruft left from when I was looking for non-call refs in this loop, will remove.
DILocation *DL = I->getDebugLoc();		DILocation *DL = I->getDebugLoc();
if (!DL)		if (!DL)
continue;		continue;

if (DL == LastDL) {		if (DL == LastDL) {
// Just repeat the same debug loc as last time.		// Just repeat the same debug loc as last time.
Stream.EmitRecord(bitc::FUNC_CODE_DEBUG_LOC_AGAIN, Vals);		Stream.EmitRecord(bitc::FUNC_CODE_DEBUG_LOC_AGAIN, Vals);
continue;		continue;
}		}

Vals.push_back(DL->getLine());		Vals.push_back(DL->getLine());
Vals.push_back(DL->getColumn());		Vals.push_back(DL->getColumn());
Vals.push_back(VE.getMetadataOrNullID(DL->getScope()));		Vals.push_back(VE.getMetadataOrNullID(DL->getScope()));
Vals.push_back(VE.getMetadataOrNullID(DL->getInlinedAt()));		Vals.push_back(VE.getMetadataOrNullID(DL->getInlinedAt()));
Stream.EmitRecord(bitc::FUNC_CODE_DEBUG_LOC, Vals);		Stream.EmitRecord(bitc::FUNC_CODE_DEBUG_LOC, Vals);
Vals.clear();		Vals.clear();

LastDL = DL;		LastDL = DL;
}		}

		if (EmitFunctionSummary) {
		mehdi_amini: ?
		tejohnson: Incomplete cleanup of debugging output, will remove.
		FunctionInfo.setInstCount(F, NumInsts);
		FunctionInfo.addCallGraphEdges(F, CallGraphEdges);
		}

// Emit names for all the instructions etc.		// Emit names for all the instructions etc.
WriteValueSymbolTable(F.getValueSymbolTable(), VE, Stream);		WriteValueSymbolTable(F.getValueSymbolTable(), VE, Stream);

if (NeedsMetadataAttachment)		if (NeedsMetadataAttachment)
WriteMetadataAttachment(F, VE, Stream);		WriteMetadataAttachment(F, VE, Stream);
if (VE.shouldPreserveUseListOrder())		if (VE.shouldPreserveUseListOrder())
WriteUseListBlock(&F, VE, Stream);		WriteUseListBlock(&F, VE, Stream);
VE.purgeFunction();		VE.purgeFunction();
Stream.ExitBlock();		Stream.ExitBlock();

SaveFunctionInfo(F, FunctionIndex, NumInsts, BitcodeIndex,
EmitFunctionSummary);
}		}

// Emit blockinfo, which defines the standard abbreviations etc.		// Emit blockinfo, which defines the standard abbreviations etc.
static void WriteBlockInfo(const ValueEnumerator &VE, BitstreamWriter &Stream) {		static void WriteBlockInfo(const ValueEnumerator &VE, BitstreamWriter &Stream) {
// We only want to emit block info records for blocks that have multiple		// We only want to emit block info records for blocks that have multiple
// instances: CONSTANTS_BLOCK, FUNCTION_BLOCK and VALUE_SYMTAB_BLOCK.		// instances: CONSTANTS_BLOCK, FUNCTION_BLOCK and VALUE_SYMTAB_BLOCK.
// Other blocks can define their abbrevs inline.		// Other blocks can define their abbrevs inline.
Stream.EnterBlockInfoBlock(2);		Stream.EnterBlockInfoBlock(2);
[... 217 unchanged lines hidden ...]	for (const StringMapEntry<uint64_t> &MPSE : I.modPathStringEntries()) {
Stream.EmitRecord(bitc::MST_CODE_ENTRY, NameVals, AbbrevToUse);		Stream.EmitRecord(bitc::MST_CODE_ENTRY, NameVals, AbbrevToUse);
NameVals.clear();		NameVals.clear();
}		}
Stream.ExitBlock();		Stream.ExitBlock();
}		}

// Helper to emit a single function summary record.		// Helper to emit a single function summary record.
static void WritePerModuleFunctionSummaryRecord(		static void WritePerModuleFunctionSummaryRecord(
SmallVector<unsigned, 64> &NameVals, FunctionSummary *FS, unsigned ValueID,		SmallVector<uint64_t, 64> &NameVals,
unsigned FSAbbrev, BitstreamWriter &Stream) {		FunctionSummaryIOHelper &BitcodeSummary, unsigned ValueID,
		unsigned FSNoCallsAbbrev, unsigned FSCallsAbbrev,
		unsigned FSCallsProfileAbbrev, BitstreamWriter &Stream, const Function &F) {
		FunctionSummary *FS = BitcodeSummary.functionSummary();
		davidxl: Should the type be SmallVector<uint64_t, 64>? GUID is a 64-bit type.
		tejohnson: Good point, it should be uint64_t, but for a different reason. We aren't writing the GUID, but rather a value id which is unsigned. But we are writing the profile count when we have profile data, which is uint64_t. Fixing it here and when writing the combined summary.
assert(FS);		assert(FS);
NameVals.push_back(ValueID);		NameVals.push_back(ValueID);
NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));		NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));
NameVals.push_back(FS->instCount());		NameVals.push_back(FS->instCount());

		bool HasProfileData = F.getEntryCount().hasValue();
		for (auto &ECI : BitcodeSummary.edges()) {
		NameVals.push_back(ECI.first);
		if (HasProfileData)
		NameVals.push_back(ECI.second);
		}

		unsigned FSAbbrev =
		BitcodeSummary.edges().empty()
		? FSNoCallsAbbrev
		: (HasProfileData ? FSCallsProfileAbbrev : FSCallsAbbrev);
		unsigned Code = BitcodeSummary.edges().empty()
		? bitc::FS_PERMODULE_NOCALLS
		: (HasProfileData ? bitc::FS_PERMODULE_CALLS_PROFILE
		: bitc::FS_PERMODULE_CALLS);

// Emit the finished record.		// Emit the finished record.
Stream.EmitRecord(bitc::FS_CODE_PERMODULE_ENTRY, NameVals, FSAbbrev);		Stream.EmitRecord(Code, NameVals, FSAbbrev);
NameVals.clear();		NameVals.clear();
}		}

/// Emit the per-module function summary section alongside the rest of		/// Emit the per-module function summary section alongside the rest of
/// the module's bitcode.		/// the module's bitcode.
static void WritePerModuleFunctionSummary(		static void WritePerModuleFunctionSummary(FunctionWriteInfo &FunctionInfo,
DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,		const Module *M,
const Module *M, const ValueEnumerator &VE, BitstreamWriter &Stream) {		const ValueEnumerator &VE,
		BitstreamWriter &Stream) {
Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);		Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);

// Abbrev for FS_CODE_PERMODULE_ENTRY.		// Abbrev for FS_PERMODULE_NOCALLS.
BitCodeAbbrev *Abbv = new BitCodeAbbrev();		BitCodeAbbrev *Abbv = new BitCodeAbbrev();
Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_PERMODULE_ENTRY));		Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE_NOCALLS));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // valueid		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // valueid
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);		unsigned FSNoCallsAbbrev = Stream.EmitAbbrev(Abbv);

SmallVector<unsigned, 64> NameVals;		// Abbrev for FS_PERMODULE_CALLS.
// Iterate over the list of functions instead of the FunctionIndex map to		Abbv = new BitCodeAbbrev();
		Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE_CALLS));
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // valueid
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array)); // valueids
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
		unsigned FSCallsAbbrev = Stream.EmitAbbrev(Abbv);

		// Abbrev for FS_PERMODULE_CALLS_PROFILE.
		Abbv = new BitCodeAbbrev();
		Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE_CALLS_PROFILE));
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // valueid
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array)); // valueid/count pairs
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
		unsigned FSCallsProfileAbbrev = Stream.EmitAbbrev(Abbv);

		SmallVector<uint64_t, 64> NameVals;
		// Iterate over the list of functions instead of the FunctionInfo map to
// ensure the ordering is stable.		// ensure the ordering is stable.
for (const Function &F : *M) {		for (const Function &F : *M) {
if (F.isDeclaration())		if (F.isDeclaration())
continue;		continue;
// Skip anonymous functions. We will emit a function summary for		// Skip anonymous functions. We will emit a function summary for
// any aliases below.		// any aliases below.
if (!F.hasName())		if (!F.hasName())
continue;		continue;

assert(FunctionIndex.count(&F) == 1);

WritePerModuleFunctionSummaryRecord(		WritePerModuleFunctionSummaryRecord(
NameVals, FunctionIndex[&F]->functionSummary(),		NameVals, FunctionInfo.getBitcodeSummary(F),
VE.getValueID(M->getValueSymbolTable().lookup(F.getName())), FSAbbrev,		VE.getValueID(M->getValueSymbolTable().lookup(F.getName())),
Stream);		FSNoCallsAbbrev, FSCallsAbbrev, FSCallsProfileAbbrev, Stream, F);
}		}

for (const GlobalAlias &A : M->aliases()) {		for (const GlobalAlias &A : M->aliases()) {
if (!A.getBaseObject())		if (!A.getBaseObject())
continue;		continue;
const Function *F = dyn_cast<Function>(A.getBaseObject());		const Function *F = dyn_cast<Function>(A.getBaseObject());
if (!F || F->isDeclaration())		if (!F || F->isDeclaration())
continue;		continue;

assert(FunctionIndex.count(F) == 1);
WritePerModuleFunctionSummaryRecord(		WritePerModuleFunctionSummaryRecord(
NameVals, FunctionIndex[F]->functionSummary(),		NameVals, FunctionInfo.getBitcodeSummary(*F),
VE.getValueID(M->getValueSymbolTable().lookup(A.getName())), FSAbbrev,		VE.getValueID(M->getValueSymbolTable().lookup(A.getName())),
		mehdi_amini: Isn't the code doing the opposite?
		tejohnson: Yes, good catch. I think what I initially wanted was to record them this way, but the issue is that the alias doesn't have a separate summary, which is why we are grabbing the aliasee summary here. Will update the comment.
Stream);		FSNoCallsAbbrev, FSCallsAbbrev, FSCallsProfileAbbrev, Stream, *F);
}		}

Stream.ExitBlock();		Stream.ExitBlock();
}		}

/// Emit the combined function summary section into the combined index		/// Emit the combined function summary section into the combined index
/// file.		/// file.
static void WriteCombinedFunctionSummary(const FunctionInfoIndex &I,		static void
BitstreamWriter &Stream) {		WriteCombinedFunctionSummary(const FunctionInfoIndex &I,
		BitstreamWriter &Stream,
		std::map<uint64_t, unsigned> &GUIDToValueIdMap) {
Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);		Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);

// Abbrev for FS_CODE_COMBINED_ENTRY.		// Abbrev for FS_COMBINED_NOCALLS.
BitCodeAbbrev *Abbv = new BitCodeAbbrev();		BitCodeAbbrev *Abbv = new BitCodeAbbrev();
Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_COMBINED_ENTRY));		Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED_NOCALLS));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);		unsigned FSNoCallsAbbrev = Stream.EmitAbbrev(Abbv);

SmallVector<unsigned, 64> NameVals;		// Abbrev for FS_COMBINED_CALLS.
		Abbv = new BitCodeAbbrev();
		Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED_CALLS));
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array)); // valueids
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
		unsigned FSCallsAbbrev = Stream.EmitAbbrev(Abbv);

		// Abbrev for FS_COMBINED_CALLS_PROFILE.
		Abbv = new BitCodeAbbrev();
		Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED_CALLS_PROFILE));
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array)); // valueid/count pairs
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
		unsigned FSCallsProfileAbbrev = Stream.EmitAbbrev(Abbv);

		SmallVector<uint64_t, 64> NameVals;
for (const auto &FII : I) {		for (const auto &FII : I) {
for (auto &FI : FII.second) {		for (auto &FI : FII.second) {
FunctionSummary *FS = FI->functionSummary();		FunctionSummary *FS = FI->functionSummary();
assert(FS);		assert(FS);

NameVals.push_back(I.getModuleId(FS->modulePath()));		NameVals.push_back(I.getModuleId(FS->modulePath()));
NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));		NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));
NameVals.push_back(FS->instCount());		NameVals.push_back(FS->instCount());

		bool HasProfileData = false;
		for (auto &EI : FS->edges()) {
		HasProfileData |= EI.second != 0;
		if (HasProfileData)
		break;
		}

		bool HasCalls = false;
		for (auto &EI : FS->edges()) {
		const auto &VMI = GUIDToValueIdMap.find(EI.first);
		// If this GUID doesn't have an entry, it doesn't have a function
		// summary and we don't need to record any calls to it.
		if (VMI == GUIDToValueIdMap.end())
		continue;
		HasCalls = true;
		NameVals.push_back(VMI->second);
		if (HasProfileData)
		NameVals.push_back(EI.second);
		}

// Record the starting offset of this summary entry for use		// Record the starting offset of this summary entry for use
// in the VST entry. Add the current code size since the		// in the VST entry. Add the current code size since the
// reader will invoke readRecord after the abbrev id read.		// reader will invoke readRecord after the abbrev id read.
FI->setBitcodeIndex(Stream.GetCurrentBitNo() + Stream.GetAbbrevIDWidth());		FI->setBitcodeIndex(Stream.GetCurrentBitNo() + Stream.GetAbbrevIDWidth());

		unsigned FSAbbrev =
		HasCalls ? (HasProfileData ? FSCallsProfileAbbrev : FSCallsAbbrev)
		: FSNoCallsAbbrev;
		unsigned Code = HasCalls
		? (HasProfileData ? bitc::FS_COMBINED_CALLS_PROFILE
		: bitc::FS_COMBINED_CALLS)
		: bitc::FS_COMBINED_NOCALLS;

// Emit the finished record.		// Emit the finished record.
Stream.EmitRecord(bitc::FS_CODE_COMBINED_ENTRY, NameVals, FSAbbrev);		Stream.EmitRecord(Code, NameVals, FSAbbrev);
NameVals.clear();		NameVals.clear();
}		}
}		}

Stream.ExitBlock();		Stream.ExitBlock();
}		}

// Create the "IDENTIFICATION_BLOCK_ID" containing a single string with the		// Create the "IDENTIFICATION_BLOCK_ID" containing a single string with the
[... 63 unchanged lines hidden ...]	static void WriteModule(const Module *M, BitstreamWriter &Stream,

// Emit module-level use-lists.		// Emit module-level use-lists.
if (VE.shouldPreserveUseListOrder())		if (VE.shouldPreserveUseListOrder())
WriteUseListBlock(nullptr, VE, Stream);		WriteUseListBlock(nullptr, VE, Stream);

WriteOperandBundleTags(M, Stream);		WriteOperandBundleTags(M, Stream);

// Emit function bodies.		// Emit function bodies.
DenseMap<const Function *, std::unique_ptr<FunctionInfo>> FunctionIndex;		FunctionWriteInfo FunctionInfo(EmitFunctionSummary);
for (Module::const_iterator F = M->begin(), E = M->end(); F != E; ++F)		for (Module::const_iterator F = M->begin(), E = M->end(); F != E; ++F)
if (!F->isDeclaration())		if (!F->isDeclaration())
WriteFunction(*F, VE, Stream, FunctionIndex, EmitFunctionSummary);		WriteFunction(*F, M, VE, Stream, FunctionInfo, EmitFunctionSummary);

// Need to write after the above call to WriteFunction which populates		// Need to write after the above call to WriteFunction which populates
// the summary information in the index.		// the summary information in the index.
if (EmitFunctionSummary)		if (EmitFunctionSummary)
WritePerModuleFunctionSummary(FunctionIndex, M, VE, Stream);		WritePerModuleFunctionSummary(FunctionInfo, M, VE, Stream);

WriteValueSymbolTable(M->getValueSymbolTable(), VE, Stream,		WriteValueSymbolTable(M->getValueSymbolTable(), VE, Stream,
VSTOffsetPlaceholder, BitcodeStartBit, &FunctionIndex);		VSTOffsetPlaceholder, BitcodeStartBit, &FunctionInfo);

Stream.ExitBlock();		Stream.ExitBlock();
}		}

/// EmitDarwinBCHeader - If generating a bc file on darwin, we have to emit a		/// EmitDarwinBCHeader - If generating a bc file on darwin, we have to emit a
/// header and trailer to make it compatible with the system archiver. To do		/// header and trailer to make it compatible with the system archiver. To do
/// this we emit the following header, and then emit a trailer that pads the		/// this we emit the following header, and then emit a trailer that pads the
/// file out to be a multiple of 16 bytes.		/// file out to be a multiple of 16 bytes.
[128 lines not shown]	void llvm::WriteFunctionSummaryToFile(const FunctionInfoIndex &Index,
SmallVector<unsigned, 1> Vals;		SmallVector<unsigned, 1> Vals;
unsigned CurVersion = 1;		unsigned CurVersion = 1;
Vals.push_back(CurVersion);		Vals.push_back(CurVersion);
Stream.EmitRecord(bitc::MODULE_CODE_VERSION, Vals);		Stream.EmitRecord(bitc::MODULE_CODE_VERSION, Vals);

// Write the module paths in the combined index.		// Write the module paths in the combined index.
WriteModStrings(Index, Stream);		WriteModStrings(Index, Stream);

		// Assign unique value ids to all functions in the index for use
		// in writing out the call graph edges. Save the mapping from GUID
		// to the new global value id to use when writing those edges, which
		// are currently saved in the index in terms of GUID.
		std::map<uint64_t, unsigned> GUIDToValueIdMap;
		unsigned GlobalValueId = 0;
		for (auto &II : Index)
		GUIDToValueIdMap[II.first] = ++GlobalValueId;

// Write the function summary combined index records.		// Write the function summary combined index records.
WriteCombinedFunctionSummary(Index, Stream);		WriteCombinedFunctionSummary(Index, Stream, GUIDToValueIdMap);

// Need a special VST writer for the combined index (we don't have a		// Need a special VST writer for the combined index (we don't have a
// real VST and real values when this is invoked).		// real VST and real values when this is invoked).
WriteCombinedValueSymbolTable(Index, Stream);		WriteCombinedValueSymbolTable(Index, Stream, GUIDToValueIdMap);

Stream.ExitBlock();		Stream.ExitBlock();

Out.write((char *)&Buffer.front(), Buffer.size());		Out.write((char *)&Buffer.front(), Buffer.size());
}		}
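
The GUID-to-value-id step added in `WriteFunctionSummaryToFile` can be tried out on its own. The following is a rough sketch under stated assumptions: `ToyIndex` is a hypothetical stand-in for the combined index (here simply an ordered map keyed by GUID), and `assignValueIds` and `translateEdges` are illustrative helpers that do not exist in the patch. Only the numbering loop itself is taken from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for the combined index: GUID -> unused payload.
using ToyIndex = std::map<uint64_t, int>;

// Number the index entries 1..N in iteration order, as the loop in the
// patch does with ++GlobalValueId.
std::map<uint64_t, unsigned> assignValueIds(const ToyIndex &Index) {
  std::map<uint64_t, unsigned> GUIDToValueIdMap;
  unsigned GlobalValueId = 0;
  for (const auto &Entry : Index)
    GUIDToValueIdMap[Entry.first] = ++GlobalValueId;
  return GUIDToValueIdMap;
}

// Rewrite call edges held as callee GUIDs into the freshly assigned ids.
std::vector<unsigned>
translateEdges(const std::vector<uint64_t> &CalleeGUIDs,
               const std::map<uint64_t, unsigned> &GUIDToValueIdMap) {
  std::vector<unsigned> ValueIds;
  for (uint64_t GUID : CalleeGUIDs) {
    auto It = GUIDToValueIdMap.find(GUID);
    assert(It != GUIDToValueIdMap.end() && "callee GUID missing from index");
    ValueIds.push_back(It->second);
  }
  return ValueIds;
}
```

Because the same map is handed to both the summary writer and the symbol-table writer, a value id in a call edge and the value id in the matching `COMBINED_FNENTRY` record refer to the same function. Ids start at 1 because the counter is pre-incremented.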

lib/Bitcode/Writer/LLVMBuild.txt

	[13 lines not shown]
	; http://llvm.org/docs/LLVMBuild.html			; http://llvm.org/docs/LLVMBuild.html
	;			;
	;===------------------------------------------------------------------------===;			;===------------------------------------------------------------------------===;

	[component_0]			[component_0]
	type = Library			type = Library
	name = BitWriter			name = BitWriter
	parent = Bitcode			parent = Bitcode
	required_libraries = Core Support			required_libraries = Analysis Core Support

test/Bitcode/Inputs/thinlto-function-summary-callgraph-pgo.ll

This file was added.

				; ModuleID = 'thinlto-function-summary-callgraph2.ll'
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: nounwind uwtable
				define void @func() #0 !prof !2 {
				entry:
				ret void
				}

				!2 = !{!"function_entry_count", i64 1}

test/Bitcode/Inputs/thinlto-function-summary-callgraph.ll

This file was added.

				; ModuleID = 'thinlto-function-summary-callgraph2.ll'
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: nounwind uwtable
				define void @func() #0 {
				entry:
				ret void
				}

test/Bitcode/thinlto-function-summary-callgraph-pgo.ll

This file was added.

				; RUN: llvm-as -function-summary %s -o %t.o
				; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s
				; RUN: llvm-as -function-summary %p/Inputs/thinlto-function-summary-callgraph.ll -o %t2.o
				; RUN: llvm-lto -thinlto -o %t3 %t.o %t2.o
				; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED

				; CHECK: <FUNCTION_SUMMARY_BLOCK
				; See if the call to func is registered, using the expected value id and
				; profile count
				; CHECK-NEXT: <PERMODULE_CALLS_PROFILE {{.*}} op3=1 op4=1/>
				; CHECK-NEXT: </FUNCTION_SUMMARY_BLOCK>
				; CHECK-NEXT: <VALUE_SYMTAB
				; CHECK-NEXT: <FNENTRY {{.*}} record string = 'main'
				; External function func should have entry with value id 1
				; CHECK-NEXT: <ENTRY {{.*}} op0=1 {{.*}} record string = 'func'
				; CHECK-NEXT: </VALUE_SYMTAB>

				; COMBINED: <FUNCTION_SUMMARY_BLOCK
				; COMBINED-NEXT: <COMBINED_NOCALLS
				; See if the call to func is registered, using the expected value id and
				; profile count
				; COMBINED-NEXT: <COMBINED_CALLS_PROFILE {{.*}} op3=1 op4=1/>
				; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK>
				; COMBINED-NEXT: <VALUE_SYMTAB
				; Entry for function func should have entry with value id 1
				; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} op0=1 {{.*}} op2=7289175272376759421/>
				; COMBINED-NEXT: <COMBINED_FNENTRY
				; COMBINED-NEXT: </VALUE_SYMTAB>

				; ModuleID = 'thinlto-function-summary-callgraph.ll'
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: nounwind uwtable
				define i32 @main() #0 !prof !2 {
				entry:
				call void (...) @func()
				ret i32 0
				}

				declare void @func(...) #1

				!2 = !{!"function_entry_count", i64 1}

test/Bitcode/thinlto-function-summary-callgraph.ll

This file was added.

				; RUN: llvm-as -function-summary %s -o %t.o
				; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s
				; RUN: llvm-as -function-summary %p/Inputs/thinlto-function-summary-callgraph.ll -o %t2.o
				; RUN: llvm-lto -thinlto -o %t3 %t.o %t2.o
				; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED

				; CHECK: <FUNCTION_SUMMARY_BLOCK
				; See if the call to func is registered, using the expected value id
				; CHECK-NEXT: <PERMODULE_CALLS {{.*}} op3=1/>
				; CHECK-NEXT: </FUNCTION_SUMMARY_BLOCK>
				; CHECK-NEXT: <VALUE_SYMTAB
				; CHECK-NEXT: <FNENTRY {{.*}} record string = 'main'
				; External function func should have entry with value id 1
				; CHECK-NEXT: <ENTRY {{.*}} op0=1 {{.*}} record string = 'func'
				; CHECK-NEXT: </VALUE_SYMTAB>

				; COMBINED: <FUNCTION_SUMMARY_BLOCK
				; COMBINED-NEXT: <COMBINED_NOCALLS
				; See if the call to func is registered, using the expected value id
				; COMBINED-NEXT: <COMBINED_CALLS {{.*}} op3=1/>
				; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK>
				; COMBINED-NEXT: <VALUE_SYMTAB
				; Entry for function func should have entry with value id 1
				; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} op0=1 {{.*}} op2=7289175272376759421/>
				; COMBINED-NEXT: <COMBINED_FNENTRY
				; COMBINED-NEXT: </VALUE_SYMTAB>

				; ModuleID = 'thinlto-function-summary-callgraph.ll'
				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; Function Attrs: nounwind uwtable
				define i32 @main() #0 {
				entry:
				call void (...) @func()
				ret i32 0
				}

				declare void @func(...) #1

test/Bitcode/thinlto-function-summary.ll

	; RUN: llvm-as -function-summary < %s | llvm-bcanalyzer -dump | FileCheck %s -check-prefix=BC			; RUN: llvm-as -function-summary < %s | llvm-bcanalyzer -dump | FileCheck %s -check-prefix=BC
	; Check for function summary block/records.			; Check for function summary block/records.

	; Check the value ids in the function summary entries against the			; Check the value ids in the function summary entries against the
	; same in the ValueSumbolTable, to ensure the ordering is stable.			; same in the ValueSumbolTable, to ensure the ordering is stable.
	; Also check the linkage field on the summary entries.			; Also check the linkage field on the summary entries.
	; BC: <FUNCTION_SUMMARY_BLOCK			; BC: <FUNCTION_SUMMARY_BLOCK
	; BC-NEXT: <PERMODULE_ENTRY {{.*}} op0=1 op1=0			; BC-NEXT: <PERMODULE_NOCALLS {{.*}} op0=1 op1=0
	; BC-NEXT: <PERMODULE_ENTRY {{.*}} op0=2 op1=0			; BC-NEXT: <PERMODULE_NOCALLS {{.*}} op0=2 op1=0
	; BC-NEXT: <PERMODULE_ENTRY {{.*}} op0=4 op1=3			; BC-NEXT: <PERMODULE_NOCALLS {{.*}} op0=4 op1=3
	; BC-NEXT: </FUNCTION_SUMMARY_BLOCK			; BC-NEXT: </FUNCTION_SUMMARY_BLOCK
	; BC-NEXT: <VALUE_SYMTAB			; BC-NEXT: <VALUE_SYMTAB
	; BC-NEXT: <FNENTRY {{.*}} op0=1 {{.*}}> record string = 'foo'			; BC-NEXT: <FNENTRY {{.*}} op0=1 {{.*}}> record string = 'foo'
	; BC-NEXT: <FNENTRY {{.*}} op0=2 {{.*}}> record string = 'bar'			; BC-NEXT: <FNENTRY {{.*}} op0=2 {{.*}}> record string = 'bar'
	; BC-NEXT: <FNENTRY {{.*}} op0=4 {{.*}}> record string = 'f'			; BC-NEXT: <FNENTRY {{.*}} op0=4 {{.*}}> record string = 'f'

	; RUN: llvm-as -function-summary < %s | llvm-dis | FileCheck %s			; RUN: llvm-as -function-summary < %s | llvm-dis | FileCheck %s
	; Check that this round-trips correctly.			; Check that this round-trips correctly.
	[34 lines not shown]

test/Bitcode/thinlto-summary-linkage-types.ll

	; Check the linkage types in both the per-module and combined summaries.			; Check the linkage types in both the per-module and combined summaries.
	; RUN: llvm-as -function-summary %s -o %t.o			; RUN: llvm-as -function-summary %s -o %t.o
	; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s			; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s
	; RUN: llvm-lto -thinlto -o %t2 %t.o			; RUN: llvm-lto -thinlto -o %t2 %t.o
	; RUN: llvm-bcanalyzer -dump %t2.thinlto.bc | FileCheck %s --check-prefix=COMBINED			; RUN: llvm-bcanalyzer -dump %t2.thinlto.bc | FileCheck %s --check-prefix=COMBINED

	define private void @private()			define private void @private()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=9			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=9
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=9			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=9
	{			{
	ret void			ret void
	}			}

	define internal void @internal()			define internal void @internal()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=3			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=3
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=3			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=3
	{			{
	ret void			ret void
	}			}

	define available_externally void @available_externally()			define available_externally void @available_externally()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=12			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=12
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=12			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=12
	{			{
	ret void			ret void
	}			}

	define linkonce void @linkonce()			define linkonce void @linkonce()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=18			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=18
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=18			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=18
	{			{
	ret void			ret void
	}			}

	define weak void @weak()			define weak void @weak()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=16			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=16
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=16			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=16
	{			{
	ret void			ret void
	}			}

	define linkonce_odr void @linkonce_odr()			define linkonce_odr void @linkonce_odr()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=19			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=19
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=19			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=19
	{			{
	ret void			ret void
	}			}

	define weak_odr void @weak_odr()			define weak_odr void @weak_odr()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=17			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=17
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=17			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=17
	{			{
	ret void			ret void
	}			}

	define external void @external()			define external void @external()
	; CHECK: <PERMODULE_ENTRY {{.*}} op1=0			; CHECK: <PERMODULE_NOCALLS {{.*}} op1=0
	; COMBINED-DAG: <COMBINED_ENTRY {{.*}} op1=0			; COMBINED-DAG: <COMBINED_NOCALLS {{.*}} op1=0
	{			{
	ret void			ret void
	}			}

test/tools/gold/X86/thinlto.ll

	[14 lines not shown]
	; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED			; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED
	; RUN: not test -e %t3			; RUN: not test -e %t3

	; COMBINED: <MODULE_STRTAB_BLOCK			; COMBINED: <MODULE_STRTAB_BLOCK
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: </MODULE_STRTAB_BLOCK			; COMBINED-NEXT: </MODULE_STRTAB_BLOCK
	; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_NOCALLS
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_NOCALLS
	; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <VALUE_SYMTAB			; COMBINED-NEXT: <VALUE_SYMTAB
	; Check that the format is: op0=offset, op1=funcguid, where funcguid is			; Check that the format is: op0=valueid, op1=offset, op2=funcguid,
	; the lower 64 bits of the function name MD5.			; where funcguid is the lower 64 bits of the function name MD5.
	; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}			; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{1|2}} op1={{[0-9]+}} op2={{-3706093650706652785|-5300342847281564238}}
	; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}			; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{1|2}} op1={{[0-9]+}} op2={{-3706093650706652785|-5300342847281564238}}
	; COMBINED-NEXT: </VALUE_SYMTAB			; COMBINED-NEXT: </VALUE_SYMTAB

	define void @f() {			define void @f() {
	entry:			entry:
	ret void			ret void
	}			}

test/tools/llvm-lto/thinlto.ll

	; Test combined function index generation for ThinLTO via llvm-lto.			; Test combined function index generation for ThinLTO via llvm-lto.
	; RUN: llvm-as -function-summary %s -o %t.o			; RUN: llvm-as -function-summary %s -o %t.o
	; RUN: llvm-as -function-summary %p/Inputs/thinlto.ll -o %t2.o			; RUN: llvm-as -function-summary %p/Inputs/thinlto.ll -o %t2.o
	; RUN: llvm-lto -thinlto -o %t3 %t.o %t2.o			; RUN: llvm-lto -thinlto -o %t3 %t.o %t2.o
	; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED			; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED
	; RUN: not test -e %t3			; RUN: not test -e %t3

	; COMBINED: <MODULE_STRTAB_BLOCK			; COMBINED: <MODULE_STRTAB_BLOCK
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: </MODULE_STRTAB_BLOCK			; COMBINED-NEXT: </MODULE_STRTAB_BLOCK
	; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_NOCALLS
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_NOCALLS
	; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <VALUE_SYMTAB			; COMBINED-NEXT: <VALUE_SYMTAB
	; Check that the format is: op0=offset, op1=funcguid, where funcguid is			; Check that the format is: op0=valueid, op1=offset, op2=funcguid,
	; the lower 64 bits of the function name MD5.			; where funcguid is the lower 64 bits of the function name MD5.
	; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}			; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{1|2}} op1={{[0-9]+}} op2={{-3706093650706652785|-5300342847281564238}}
	; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}			; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{1|2}} op1={{[0-9]+}} op2={{-3706093650706652785|-5300342847281564238}}
	; COMBINED-NEXT: </VALUE_SYMTAB			; COMBINED-NEXT: </VALUE_SYMTAB

	define void @f() {			define void @f() {
	entry:			entry:
	ret void			ret void
	}			}

tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp

[286 lines not shown]	case bitc::MODULE_STRTAB_BLOCK_ID:
default:		default:
return nullptr;		return nullptr;
STRINGIFY_CODE(MST_CODE, ENTRY)		STRINGIFY_CODE(MST_CODE, ENTRY)
}		}
case bitc::FUNCTION_SUMMARY_BLOCK_ID:		case bitc::FUNCTION_SUMMARY_BLOCK_ID:
switch (CodeID) {		switch (CodeID) {
default:		default:
return nullptr;		return nullptr;
STRINGIFY_CODE(FS_CODE, PERMODULE_ENTRY)		STRINGIFY_CODE(FS, PERMODULE_NOCALLS)
STRINGIFY_CODE(FS_CODE, COMBINED_ENTRY)		STRINGIFY_CODE(FS, PERMODULE_CALLS)
		STRINGIFY_CODE(FS, PERMODULE_CALLS_PROFILE)
		STRINGIFY_CODE(FS, COMBINED_NOCALLS)
		STRINGIFY_CODE(FS, COMBINED_CALLS)
		STRINGIFY_CODE(FS, COMBINED_CALLS_PROFILE)
}		}
case bitc::METADATA_ATTACHMENT_ID:		case bitc::METADATA_ATTACHMENT_ID:
switch(CodeID) {		switch(CodeID) {
default:return nullptr;		default:return nullptr;
STRINGIFY_CODE(METADATA, ATTACHMENT)		STRINGIFY_CODE(METADATA, ATTACHMENT)
}		}
case bitc::METADATA_BLOCK_ID:		case bitc::METADATA_BLOCK_ID:
switch(CodeID) {		switch(CodeID) {
[509 lines not shown]
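
The switch from `STRINGIFY_CODE(FS_CODE, PERMODULE_ENTRY)` to `STRINGIFY_CODE(FS, PERMODULE_NOCALLS)` is easier to follow with the token pasting spelled out. The sketch below is self-contained and only in the spirit of the llvm-bcanalyzer macro: the `toy` namespace, the enum values, and `getSummaryCodeName` are assumptions for illustration, while the six record names are the ones listed in the diff.

```cpp
#include <cassert>
#include <cstring>

namespace toy {
// Hypothetical codes: names follow the patch, numeric values are invented.
enum FunctionSummaryCodes {
  FS_PERMODULE_NOCALLS = 1,
  FS_PERMODULE_CALLS = 2,
  FS_PERMODULE_CALLS_PROFILE = 3,
  FS_COMBINED_NOCALLS = 4,
  FS_COMBINED_CALLS = 5,
  FS_COMBINED_CALLS_PROFILE = 6
};
} // namespace toy

// PREFIX and CODE are pasted into the enumerator name, while the printed
// name is CODE alone. That is why a dump shows <PERMODULE_NOCALLS ...>
// without the FS_ prefix.
#define STRINGIFY_CODE(PREFIX, CODE)                                         \
  case toy::PREFIX##_##CODE:                                                 \
    return #CODE;

const char *getSummaryCodeName(unsigned CodeID) {
  switch (CodeID) {
  default:
    return nullptr; // unknown codes have no name
    STRINGIFY_CODE(FS, PERMODULE_NOCALLS)
    STRINGIFY_CODE(FS, PERMODULE_CALLS)
    STRINGIFY_CODE(FS, PERMODULE_CALLS_PROFILE)
    STRINGIFY_CODE(FS, COMBINED_NOCALLS)
    STRINGIFY_CODE(FS, COMBINED_CALLS)
    STRINGIFY_CODE(FS, COMBINED_CALLS_PROFILE)
  }
}

// Small usage example: look up one known code.
void exampleLookup() {
  const char *Name = getSummaryCodeName(toy::FS_COMBINED_CALLS);
  assert(Name && std::strcmp(Name, "COMBINED_CALLS") == 0);
}
```

The test expectations elsewhere in this diff (`<PERMODULE_NOCALLS`, `<COMBINED_CALLS_PROFILE`, and so on) are the strings such a lookup returns.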

Diff 48980