This is an archive of the discontinued LLVM Phabricator instance.

Differential D17028

[ThinLTO] Use MD5 hash in function index.
Closed, Public

Authored by tejohnson on Feb 9 2016, 7:06 AM.

Details

Reviewers

davidxl
mehdi_amini

Commits

rG0919a84071e5: [ThinLTO] Use MD5 hash in function index.
rL260408: [ThinLTO] Use MD5 hash in function index.

Summary

This patch uses the lower 64-bits of the MD5 hash of a function name as
a GUID in the function index, instead of storing function names. Any
local functions are first given a global name by prepending the original
source file name. This is the same naming scheme and GUID used by PGO in
the indexed profile format.
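
As a rough illustration of the naming scheme described above, the sketch below builds a global identifier for a local function by prepending the source file name, then reduces it to a 64-bit value. It is standalone and does not use LLVM: the `file:name` separator is an assumption, `globalIdentifier` and `guidOf` are hypothetical names, and FNV-1a is only a stand-in for the MD5-based hash the patch actually uses (`MD5Hash`, via `Function::getGUID`), so the values it produces will not match real GUIDs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a function's linkage; only the local/external
// distinction matters for naming.
enum class Linkage { External, Internal };

// Build the name used as the hash input. Local (internal) functions are
// qualified with the source file name so that same-named locals from
// different files do not collide. The ':' separator is an assumption.
std::string globalIdentifier(const std::string &FuncName, Linkage L,
                             const std::string &SourceFileName) {
  if (L == Linkage::Internal) {
    assert(!SourceFileName.empty() && "locals need a source file name");
    return SourceFileName + ":" + FuncName;
  }
  return FuncName;
}

// Stand-in 64-bit hash (FNV-1a). The patch itself takes the lower 64 bits
// of the MD5 hash of the global identifier instead.
uint64_t guidOf(const std::string &GlobalName) {
  uint64_t Hash = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char C : GlobalName) {
    Hash ^= C;
    Hash *= 1099511628211ULL; // FNV prime
  }
  return Hash;
}
```

With this scheme, a `static` function `foo` defined in both `a.c` and `b.c` yields two distinct identifiers and therefore two distinct keys, while an external `foo` keeps its plain name.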

This change has a couple of benefits. The primary benefit is size
reduction in the combined index file, for example 483.xalancbmk's
combined index file was reduced by around 70%. It should also result in
memory savings for the index file in memory, as the in-memory map is
also indexed by the hash instead of the string.

Second, this enables integration with indirect call promotion, since the
indirect call profile targets are recorded using the same global naming
convention and hash. This will enable the function importer to easily
locate function summaries for indirect call profile targets to enable
their import and subsequent promotion.
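
To make the lookup path concrete, here is a minimal standalone sketch of an index keyed by 64-bit GUID rather than by name. `FunctionInfo`, `FunctionIndex`, and their members are simplified, hypothetical stand-ins for the real classes; the point is only that an indirect call target, which the profile already records as a hash, can be looked up without any string.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-in for a per-function summary entry.
struct FunctionInfo {
  uint64_t BitcodeOffset; // where the summary lives in the bitcode
};

// Several entries may share one GUID (e.g. COMDAT functions with the same
// name), so each key maps to a list.
using FunctionInfoList = std::vector<std::unique_ptr<FunctionInfo>>;

class FunctionIndex {
  std::map<uint64_t, FunctionInfoList> FunctionMap;

public:
  void addFunctionInfo(uint64_t GUID, std::unique_ptr<FunctionInfo> Info) {
    assert(Info && "expected a non-null entry");
    FunctionMap[GUID].push_back(std::move(Info));
  }

  // Returns nullptr when no summary is recorded for this GUID.
  const FunctionInfoList *find(uint64_t GUID) const {
    auto It = FunctionMap.find(GUID);
    return It == FunctionMap.end() ? nullptr : &It->second;
  }
};
```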

The original source file name is recorded in the bitcode in a new
module-level record for use in the ThinLTO backend pipeline.

Diff Detail

Repository: rL LLVM

Event Timeline

tejohnson updated this revision to Diff 47317. Feb 9 2016, 7:06 AM

tejohnson retitled this revision to [ThinLTO] Use MD5 hash in function index.

tejohnson updated this object.

tejohnson added reviewers: davidxl, mehdi_amini.

tejohnson added a subscriber: llvm-commits.

Herald added a subscriber: mehdi_amini. Feb 9 2016, 7:06 AM

davidxl added inline comments. Feb 9 2016, 9:18 AM

lib/Bitcode/Reader/BitcodeReader.cpp
5466 ↗	(On Diff #47317)	Is the assert an unrelated change?
lib/Bitcode/Writer/BitcodeWriter.cpp
823 ↗	(On Diff #47317)	This code looks like a common utility (string emission) -- is there any code reuse opportunity?

tejohnson added inline comments. Feb 9 2016, 9:30 AM

lib/Bitcode/Reader/BitcodeReader.cpp
5466 ↗	(On Diff #47317)	It is related, as we can't support lazy function summary reading for the per-module index since we need the function summary to get the global identifier below. Nor do we need to support lazy function summary reading here; that was meant for the combined index reading, and the support was probably inadvertently cloned here.
lib/Bitcode/Writer/BitcodeWriter.cpp
823 ↗	(On Diff #47317)	The other places where we set up string encoding abbrevs are structured a bit differently, since they emit multiple strings (e.g. the VST). They emit all 3 abbrevs unconditionally, then for each string to be emitted invoke getStringEncoding and use the appropriate abbrev.
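
For readers unfamiliar with the three abbreviations mentioned here, the sketch below shows the kind of classification that picks among them: a 6-bit alphabet when every character fits it, 7 bits when the string is plain ASCII, and 8 bits otherwise. This is a standalone approximation, not the writer's code; the enum and function names are hypothetical, and the char6 alphabet (`a-z`, `A-Z`, `0-9`, `.`, `_`) is stated as an assumption about the bitstream format.

```cpp
#include <cassert>
#include <string>

enum class StringEncoding { Char6, Fixed7, Fixed8 };

// True if C belongs to the 64-character alphabet assumed for char6.
static bool isChar6(unsigned char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') ||
         (C >= '0' && C <= '9') || C == '.' || C == '_';
}

// Pick the narrowest encoding that can represent every character.
StringEncoding classifyString(const std::string &Str) {
  bool AllChar6 = true;
  for (unsigned char C : Str) {
    if (C & 0x80)
      return StringEncoding::Fixed8; // non-ASCII byte: needs 8 bits
    if (AllChar6 && !isChar6(C))
      AllChar6 = false;
  }
  return AllChar6 ? StringEncoding::Char6 : StringEncoding::Fixed7;
}
```

Under these assumptions a source file name such as `dir/foo.c` contains `/`, so it would fall back to the 7-bit form, while `foo.c` alone would qualify for char6.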

Any test on the recorded source file name?

include/llvm/IR/FunctionInfo.h
151 ↗	(On Diff #47317)	I am a little concerned about using the DenseMap data structure here. Both DenseMap and StringMap are implemented as open-addressing hash tables with quadratic probing. The load factor of both is guaranteed to be < 0.75, which means there are lots of empty buckets in a large table. The main differences are: StringMap uses indirection, i.e. each bucket contains a pointer to the key-value pair object, so the bucket size is small. The copy is also cheaper when rehashing happens. In DenseMap, however, the key-value pair is embedded inside the bucket. That means that if the value type is large, it will waste a lot of memory. Copying a DenseMap can also cause large overhead. When resizing, DenseMap rounds the new size up to the next power of 2, which further increases the memory overhead. Another bad side effect is that if elements are inserted into the map one by one, it will incur lots of reallocations (just like a vector) unless the size of the map is known beforehand and the map is properly sized at the beginning. I suggest using std::map if the size of the map is not known a priori. If it is known, and if map lookups happen only after the map is created, it might be better to use a vector (push_back) followed by a sort after it is populated.
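
The last suggestion, populating a vector and sorting once before any lookups, can be sketched as follows. This is a standalone illustration with hypothetical names (`SortedIndex`, `Entry`), and it only applies when all insertions finish before the first lookup, which per the author's reply is not the case when building the combined index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One (GUID, payload) pair; the payload stands in for a function summary.
using Entry = std::pair<uint64_t, int>;

class SortedIndex {
  std::vector<Entry> Entries;
  bool Sealed = false;

public:
  // Phase 1: append in any order; no hashing or rebalancing per insert.
  void add(uint64_t GUID, int Payload) {
    assert(!Sealed && "cannot add after seal()");
    Entries.emplace_back(GUID, Payload);
  }

  // Phase 2: sort once so lookups can use binary search.
  void seal() {
    std::sort(Entries.begin(), Entries.end(),
              [](const Entry &A, const Entry &B) { return A.first < B.first; });
    Sealed = true;
  }

  // Returns the payload of the first matching entry in sorted order,
  // or nullptr if the GUID is absent.
  const int *find(uint64_t GUID) const {
    assert(Sealed && "call seal() before find()");
    auto It = std::lower_bound(
        Entries.begin(), Entries.end(), GUID,
        [](const Entry &E, uint64_t Key) { return E.first < Key; });
    if (It == Entries.end() || It->first != GUID)
      return nullptr;
    return &It->second;
  }
};
```

The trade-off is that lookups interleaved with insertions are not supported at all, which is why a node-based `std::map` is the simpler choice when the two phases cannot be separated.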
include/llvm/IR/Module.h
200 ↗	(On Diff #47317)	This comment is not quite clear in its meaning.
test/tools/gold/X86/thinlto.ll
27 ↗	(On Diff #47317)	test [offset, guid] pattern here?

mehdi_amini added inline comments. Feb 9 2016, 4:59 PM

lib/Bitcode/Reader/BitcodeReader.cpp
461 ↗	(On Diff #47317)	doxygen, at least pointing to the Module one.
5461 ↗	(On Diff #47317)	Typo in the comment: VST_FNENTRY instead of VST_CODE_FNENTRY
5488 ↗	(On Diff #47317)	Typo in the comment: VST_FNENTRY instead of VST_CODE_COMBINED_FNENTRY
lib/IR/Module.cpp
50 ↗	(On Diff #47317)	Can you initialize `SourceFileName` to empty and test in `getSourceFileName()`?
lib/Transforms/IPO/FunctionImport.cpp
206 ↗	(On Diff #47317)	I was not very happy with that originally, and it may be a good time to revisit it. When does it happen? Why don't we always rename `SrcModule` ?
tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp
176 ↗	(On Diff #47317)	Test?

tejohnson added inline comments. Feb 9 2016, 9:37 PM

include/llvm/IR/FunctionInfo.h
151 ↗	(On Diff #47317)	Good point, I hadn't compared the relative overheads of these data structures. Unfortunately, we don't know the size of this map ahead of time. E.g. when merging multiple indexes to create the combined index. And we do lookups while adding (e.g. when creating the combined index we may have multiple comdat with the same name/GUID). So it sounds like the best bet for now is to use std::map, and we can investigate alternatives that might involve restructuring the accesses if this turns out not to be efficient enough.
include/llvm/IR/Module.h
200 ↗	(On Diff #47317)	The point I'm trying to make is that if we are compiling this from bitcode, it comes from the new bitcode record; otherwise it is the name of the input file (which is the same as the ModuleID). E.g.:
clang -c foo.c -flto=thin
    ModuleID = SourceFileName = foo.c (foo.o is bitcode with new MODULE_SOURCE_FILENAME record = "foo.c")
clang foo.o -flto=thin
    ModuleID = foo.o, SourceFileName = foo.c
test/tools/gold/X86/thinlto.ll
27 ↗	(On Diff #47317)	I'm afraid testing the specific offsets will be brittle, unless I just check for any integer in that field. I could test for the GUIDs, but would have to handle either ordering as the order in the combined index is not currently guaranteed. Let me go ahead and check for the GUIDs as that maintains the same level of checking as we had with the function names.

tejohnson added inline comments. Feb 9 2016, 9:55 PM

lib/Bitcode/Reader/BitcodeReader.cpp
461 ↗	(On Diff #47317)	Will do.
5461 ↗	(On Diff #47317)	Will fix.
5488 ↗	(On Diff #47317)	Will fix.
lib/IR/Module.cpp
50 ↗	(On Diff #47317)	Actually, this is by design. When we are compiling from source (say clang -c foo.c -flto=thin), we want the SourceFileName to be foo.c. Only when we later compile the foo.o bitcode file do we then want this set to foo.c from the new bitcode record.
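
The intended behaviour can be summarised in a small standalone model. `ModuleModel` is a hypothetical stand-in for the real `Module` class: the constructor seeds the source file name from the module identifier, and a bitcode reader that encounters the new record overrides it.

```cpp
#include <cassert>
#include <string>
#include <utility>

class ModuleModel {
  std::string ModuleID;
  std::string SourceFileName;

public:
  // When compiling from source, the module ID is the input file name, so
  // it doubles as the initial source file name. ModuleID is declared
  // first, so it is already initialized when SourceFileName copies it.
  explicit ModuleModel(std::string ID)
      : ModuleID(std::move(ID)), SourceFileName(ModuleID) {}

  const std::string &getModuleIdentifier() const { return ModuleID; }
  const std::string &getSourceFileName() const { return SourceFileName; }

  // Called by a (hypothetical) reader when the bitcode carries a recorded
  // source file name.
  void setSourceFileName(const std::string &Name) {
    assert(!Name.empty() && "recorded name should not be empty");
    SourceFileName = Name;
  }
};
```

Constructing the model with `foo.c` leaves both names equal, whereas constructing it with `foo.o` and then applying a recorded name of `foo.c` gives the second case from the earlier example.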
lib/Transforms/IPO/FunctionImport.cpp
206 ↗	(On Diff #47317)	The old comment was stale. We aren't actually doing any renaming in SrcModule until the importing happens later on. At this point we haven't even materialized everything in SrcModule that we want to import, so I think it is premature to do any renaming. The CalledFunctionName however comes out of the Worklist which was populated by findExternalCalls. We had to get the global identifier in order to access the correct function summary out of the index here, and to disambiguate from same named locals we are going to import from other modules. (Looking at findExternalCalls, I just realized that we should still be checking if this is already in the DestModule using the suffix form, however, since that is the name the local will get when it is promoted. However, with the current bulk importing mechanism we would never already have a promoted local from another module imported into DestModule yet, but I will fix this in case that ever changes.)
tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp
176 ↗	(On Diff #47317)	Will add one.

Address David and Mehdi's review comments and rebase.

LGTM. You may wait a little bit to give David a chance to chime in.

lib/IR/Module.cpp
50 ↗	(On Diff #47454)	Not sure if I was clear, because I think what I suggested matches your behavior (unless I missed something). I'd have left the `SourceFileName` member empty and changed the getter to:
const std::string &getSourceFileName() const {
  if (SourceFileName.empty())
    return getModuleIdentifier();
  return SourceFileName;
}
But it is not important, so feel free to keep it as you did.
lib/Transforms/IPO/FunctionImport.cpp
221 ↗	(On Diff #47454)	We can't rename before materializing?
test/tools/gold/X86/thinlto.ll
27–28 ↗	(On Diff #47454)	What is the op1 here? Is it the low 64 bits of the MD5 as an int? Can we have a better pretty-print from llvm-bcanalyzer? (Don't hold this patch for this, though.)

This revision is now accepted and ready to land. Feb 10 2016, 9:11 AM

tejohnson added inline comments. Feb 10 2016, 9:38 AM

lib/IR/Module.cpp
50 ↗	(On Diff #47454)	Oh got it, I misunderstood. It will be more efficient to initialize SourceFileName once and not have to do the empty check every time we call this, so I think I will leave it as-is.
lib/Transforms/IPO/FunctionImport.cpp
221 ↗	(On Diff #47454)	I was worried about doing additional parsing after we have done some renaming. But thinking through this some more it should be ok as all GV references should be through the centralized ValueList which would contain value handles to the renamed GVs. In any case, there isn't a big need to do the renaming in the SrcModule before the actual importing.
test/tools/gold/X86/thinlto.ll
27–28 ↗	(On Diff #47454)	Yes, op0 is the bitcode offset of the summary, and op1 is the MD5 as an int. I will add a comment to these tests to enumerate the format before committing. Would you rather see the values printed in hex? The bcanalyzer just prints all record values as ints; it doesn't look at the record type to determine the best format. That would require more plumbing in the bcanalyzer than I think this is worth. Probably the easiest thing would be to add an option to llvm-bcanalyzer to dump all record values in hex.
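
Absent such an option, a value printed by llvm-bcanalyzer as a signed decimal can be converted by hand. The helper below is a standalone sketch (the name `guidToHex` is hypothetical): it reinterprets the signed 64-bit value as unsigned and formats it as 16 hex digits.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a record operand, as printed in signed decimal, as 64-bit hex.
std::string guidToHex(int64_t Printed) {
  // Conversion to unsigned is well defined (modulo 2^64), so negative
  // values map to the upper half of the unsigned range.
  uint64_t Bits = static_cast<uint64_t>(Printed);
  char Buf[19]; // "0x" + 16 hex digits + terminating NUL
  int Len = std::snprintf(Buf, sizeof(Buf), "0x%016" PRIx64, Bits);
  assert(Len == 18 && "unexpected formatted length");
  (void)Len;
  return std::string(Buf);
}
```

For instance, `guidToHex(-3706093650706652785LL)` would render one of the GUIDs checked in the tests as an unsigned hex value.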

Fix missed incorrect VST_CODE_COMBINED_FNENTRY comment, and add comments
to test describing expected format of this record type.

lgtm

test/tools/gold/X86/thinlto.ll
27–28 ↗	(On Diff #47454)	nit -- for better reading, probably just need to keep one line with actual MD5 value, the rest can be replaced with [0-9]+

tejohnson added inline comments. Feb 10 2016, 9:53 AM

test/tools/gold/X86/thinlto.ll
27–31 ↗	(On Diff #47474)	I think that's what I have here already? I could shorten this a bit by replacing the intermediate parts with {{.*}}. E.g.:
; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} {{-3706093650706652785|-5300342847281564238}}
Is that preferable?

I meant the op1 part can be simplified, but it doesn't necessarily need to be.

David

Closed by commit rL260408: [ThinLTO] Use MD5 hash in function index. (authored by tejohnson). Feb 10 2016, 11:02 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h	5 lines
llvm/trunk/include/llvm/IR/Function.h	7 lines
llvm/trunk/include/llvm/IR/FunctionInfo.h	19 lines
llvm/trunk/include/llvm/IR/Module.h	11 lines
llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp	54 lines
llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp	124 lines
llvm/trunk/lib/IR/FunctionInfo.cpp	17 lines
llvm/trunk/lib/IR/Module.cpp	2 lines
llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp	18 lines
llvm/trunk/test/Bitcode/Inputs/source-filename.bc
llvm/trunk/test/Bitcode/source-filename.test	2 lines
llvm/trunk/test/tools/gold/X86/thinlto.ll	6 lines
llvm/trunk/test/tools/llvm-lto/thinlto.ll	6 lines
llvm/trunk/tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp	1 line

Diff 47488

llvm/trunk/include/llvm/Bitcode/LLVMBitCodes.h

... (99 lines not shown) ... enum ModuleCodes {
   MODULE_CODE_VSTOFFSET = 13, // VSTOFFSET: [offset]

   // ALIAS: [alias value type, addrspace, aliasee val#, linkage, visibility]
   MODULE_CODE_ALIAS = 14,

   // METADATA_VALUES: [numvals]
   MODULE_CODE_METADATA_VALUES = 15,
+
+  // SOURCE_FILENAME: [namechar x N]
+  MODULE_CODE_SOURCE_FILENAME = 16,
 };

 /// PARAMATTR blocks have code for defining a parameter attribute set.
 enum AttributeCodes {
   // FIXME: Remove `PARAMATTR_CODE_ENTRY_OLD' in 4.0
   PARAMATTR_CODE_ENTRY_OLD = 1, // ENTRY: [paramidx0, attr0,
   // paramidx1, attr1...]
   PARAMATTR_CODE_ENTRY = 2, // ENTRY: [paramidx0, attrgrp0,
... (51 lines not shown) ... enum TypeSymtabCodes {
   TST_CODE_ENTRY = 1 // TST_ENTRY: [typeid, namechar x N]
 };

 // Value symbol table codes.
 enum ValueSymtabCodes {
   VST_CODE_ENTRY = 1, // VST_ENTRY: [valueid, namechar x N]
   VST_CODE_BBENTRY = 2, // VST_BBENTRY: [bbid, namechar x N]
   VST_CODE_FNENTRY = 3, // VST_FNENTRY: [valueid, offset, namechar x N]
-  // VST_COMBINED_FNENTRY: [offset, namechar x N]
+  // VST_COMBINED_FNENTRY: [funcsumoffset, funcguid]
   VST_CODE_COMBINED_FNENTRY = 4
 };

 // The module path symbol table only has one code (MST_CODE_ENTRY).
 enum ModulePathSymtabCodes {
   MST_CODE_ENTRY = 1, // MST_ENTRY: [modid, namechar x N]
 };

... (319 lines not shown) ...
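
The `[namechar x N]` notation means the record carries one operand per character of the name. The standalone sketch below mimics that layout with plain vectors; `encodeName` and `decodeName` are hypothetical helpers written for illustration, with `decodeName` following the convention visible in this diff's `convertToString` calls of returning `true` on error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Append one record operand per character of Name.
void encodeName(const std::string &Name, std::vector<uint64_t> &Record) {
  for (unsigned char C : Name)
    Record.push_back(C);
}

// Read characters from Record starting at operand Idx into Result.
// Returns true on error (Idx past the end), false on success.
bool decodeName(const std::vector<uint64_t> &Record, std::size_t Idx,
                std::string &Result) {
  if (Idx > Record.size())
    return true;
  for (std::size_t I = Idx, E = Record.size(); I != E; ++I)
    Result += static_cast<char>(Record[I]);
  return false;
}
```

A record with leading fixed operands, such as `[valueid, offset, namechar x N]`, is decoded the same way by starting at index 2, whereas the new `[funcsumoffset, funcguid]` layout needs no string decoding at all.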

llvm/trunk/include/llvm/IR/Function.h

... (21 lines not shown) ...
 #include "llvm/ADT/Optional.h"
 #include "llvm/IR/Argument.h"
 #include "llvm/IR/Attributes.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/CallingConv.h"
 #include "llvm/IR/GlobalObject.h"
 #include "llvm/IR/OperandTraits.h"
 #include "llvm/Support/Compiler.h"
+#include "llvm/Support/MD5.h"

 namespace llvm {

 class FunctionType;
 class LLVMContext;
 class DISubprogram;

 template <>
... (607 lines not shown) ... /// @}
   /// Return the modified name for a function suitable to be
   /// used as the key for a global lookup (e.g. profile or ThinLTO).
   /// The function's original name is \c FuncName and has linkage of type
   /// \c Linkage. The function is defined in module \c FileName.
   static std::string getGlobalIdentifier(StringRef FuncName,
                                          GlobalValue::LinkageTypes Linkage,
                                          StringRef FileName);

+  /// Return a 64-bit global unique ID constructed from global function name
+  /// (i.e. returned by getGlobalIdentifier).
+  static uint64_t getGUID(StringRef GlobalFuncName) {
+    return MD5Hash(GlobalFuncName);
+  }
+
 private:
   void allocHungoffUselist();
   template<int Idx> void setHungoffOperand(Constant *C);

   // Shadow Value::setValueSubclassData with a private forwarding method so that
   // subclasses cannot accidentally use it.
   void setValueSubclassData(unsigned short D) {
     Value::setValueSubclassData(D);
... (21 lines not shown) ...

llvm/trunk/include/llvm/IR/FunctionInfo.h

... (12 lines not shown) ...
 //
 //===----------------------------------------------------------------------===//

 #ifndef LLVM_IR_FUNCTIONINFO_H
 #define LLVM_IR_FUNCTIONINFO_H

 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringMap.h"
+#include "llvm/IR/Function.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/raw_ostream.h"

 namespace llvm {

 /// \brief Function summary information to aid decisions and implementation of
 /// importing.
... (112 lines not shown) ... public:
   void setBitcodeIndex(uint64_t FuncOffset) { BitcodeIndex = FuncOffset; }
 };

 /// List of function info structures for a particular function name held
 /// in the FunctionMap. Requires a vector in the case of multiple
 /// COMDAT functions of the same name.
 typedef std::vector<std::unique_ptr<FunctionInfo>> FunctionInfoList;

-/// Map from function name to corresponding function info structures.
-typedef StringMap<FunctionInfoList> FunctionInfoMapTy;
+/// Map from function GUID to corresponding function info structures.
+/// Use a std::map rather than a DenseMap since it will likely incur
+/// less overhead, as the value type is not very small and the size
+/// of the map is unknown, resulting in inefficiencies due to repeated
+/// insertions and resizing.
+typedef std::map<uint64_t, FunctionInfoList> FunctionInfoMapTy;

 /// Type used for iterating through the function info map.
 typedef FunctionInfoMapTy::const_iterator const_funcinfo_iterator;
 typedef FunctionInfoMapTy::iterator funcinfo_iterator;

 /// String table to hold/own module path strings, which additionally holds the
 /// module ID assigned to each module during the plugin step. The StringMap
 /// makes a copy of and owns inserted strings.
... (20 lines not shown) ... public:

   funcinfo_iterator begin() { return FunctionMap.begin(); }
   const_funcinfo_iterator begin() const { return FunctionMap.begin(); }
   funcinfo_iterator end() { return FunctionMap.end(); }
   const_funcinfo_iterator end() const { return FunctionMap.end(); }

   /// Get the list of function info objects for a given function.
   const FunctionInfoList &getFunctionInfoList(StringRef FuncName) {
-    return FunctionMap[FuncName];
+    return FunctionMap[Function::getGUID(FuncName)];
   }

   /// Get the list of function info objects for a given function.
   const const_funcinfo_iterator findFunctionInfoList(StringRef FuncName) const {
-    return FunctionMap.find(FuncName);
+    return FunctionMap.find(Function::getGUID(FuncName));
   }

   /// Add a function info for a function of the given name.
   void addFunctionInfo(StringRef FuncName, std::unique_ptr<FunctionInfo> Info) {
-    FunctionMap[FuncName].push_back(std::move(Info));
+    FunctionMap[Function::getGUID(FuncName)].push_back(std::move(Info));
+  }
+
+  void addFunctionInfo(uint64_t FuncGUID, std::unique_ptr<FunctionInfo> Info) {
+    FunctionMap[FuncGUID].push_back(std::move(Info));
   }

   /// Iterator to allow writer to walk through table during emission.
   iterator_range<StringMap<uint64_t>::const_iterator>
   modPathStringEntries() const {
     return llvm::make_range(ModulePathStringTable.begin(),
                             ModulePathStringTable.end());
   }
... (40 lines not shown) ...

llvm/trunk/include/llvm/IR/Module.h

... (164 lines not shown) ... private:
   AliasListType AliasList; ///< The Aliases in the module
   NamedMDListType NamedMDList; ///< The named metadata in the module
   std::string GlobalScopeAsm; ///< Inline Asm at global scope.
   ValueSymbolTable *ValSymTab; ///< Symbol table for values
   ComdatSymTabType ComdatSymTab; ///< Symbol table for COMDATs
   std::unique_ptr<GVMaterializer>
   Materializer; ///< Used to materialize GlobalValues
   std::string ModuleID; ///< Human readable identifier for the module
+  std::string SourceFileName; ///< Original source file name for module,
+  ///< recorded in bitcode.
   std::string TargetTriple; ///< Platform target triple Module compiled on
   ///< Format: (arch)(sub)-(vendor)-(sys0-(abi)
   void *NamedMDSymTab; ///< NamedMDNode names.
   DataLayout DL; ///< DataLayout associated with the module

   friend class Constant;

 /// @}
... (9 lines not shown) ...
 /// @}
 /// @name Module Level Accessors
 /// @{

   /// Get the module identifier which is, essentially, the name of the module.
   /// @returns the module identifier as a string
   const std::string &getModuleIdentifier() const { return ModuleID; }

+  /// Get the module's original source file name. When compiling from
+  /// bitcode, this is taken from a bitcode record where it was recorded.
+  /// For other compiles it is the same as the ModuleID, which would
+  /// contain the source file name.
+  const std::string &getSourceFileName() const { return SourceFileName; }
+
   /// \brief Get a short "name" for the module.
   ///
   /// This is useful for debugging or logging. It is essentially a convenience
   /// wrapper around getModuleIdentifier().
   StringRef getName() const { return ModuleID; }

   /// Get the data layout string for the module's target platform. This is
   /// equivalent to getDataLayout()->getStringRepresentation().
... (29 lines not shown) ...

 /// @}
 /// @name Module Level Mutators
 /// @{

   /// Set the module identifier.
   void setModuleIdentifier(StringRef ID) { ModuleID = ID; }

+  /// Set the module's original source file name.
+  void setSourceFileName(StringRef Name) { SourceFileName = Name; }
+
   /// Set the data layout
   void setDataLayout(StringRef Desc);
   void setDataLayout(const DataLayout &Other);

   /// Set the target triple.
   void setTargetTriple(StringRef T) { TargetTriple = T; }

   /// Set the module-scope inline assembly blocks.
... (410 lines not shown) ...

llvm/trunk/lib/Bitcode/Reader/BitcodeReader.cpp

... (452 lines not shown) ... class FunctionIndexBitcodeReader {
   DenseMap<uint64_t, std::unique_ptr<FunctionSummary>> SummaryMap;

   /// Map populated during module path string table parsing, from the
   /// module ID to a string reference owned by the index's module
   /// path string table, used to correlate with combined index function
   /// summary records.
   DenseMap<uint64_t, StringRef> ModuleIdMap;

+  /// Original source file name recorded in a bitcode record.
+  std::string SourceFileName;
+
 public:
   std::error_code error(BitcodeError E, const Twine &Message);
   std::error_code error(BitcodeError E);
   std::error_code error(const Twine &Message);

   FunctionIndexBitcodeReader(MemoryBuffer *Buffer,
                              DiagnosticHandlerFunction DiagnosticHandler,
                              bool IsLazy = false,
... (3,223 lines not shown) ... case bitc::MODULE_CODE_METADATA_VALUES:
       // record value, regardless of whether we are doing lazy metadata
       // loading, so that we have consistent handling and assertion
       // checking in parseMetadata for module-level metadata.
       NumModuleMDs = Record[0];
       SeenModuleValuesRecord = true;
       assert(MetadataList.size() == 0);
       MetadataList.resize(NumModuleMDs);
       break;
+    /// MODULE_CODE_SOURCE_FILENAME: [namechar x N]
+    case bitc::MODULE_CODE_SOURCE_FILENAME:
+      SmallString<128> ValueName;
+      if (convertToString(Record, 0, ValueName))
+        return error("Invalid record");
+      TheModule->setSourceFileName(ValueName);
+      break;
     }
     Record.clear();
   }
 }

 /// Helper to read the header common to all bitcode files.
 static bool hasValidBitcodeHeader(BitstreamCursor &Stream) {
   // Sniff for the signature.
... (1,741 lines not shown) ... while (1) {
     default: // Default behavior: ignore (e.g. VST_CODE_BBENTRY records).
       break;
     case bitc::VST_CODE_FNENTRY: {
       // VST_CODE_FNENTRY: [valueid, offset, namechar x N]
       if (convertToString(Record, 2, ValueName))
         return error("Invalid record");
       unsigned ValueID = Record[0];
       uint64_t FuncOffset = Record[1];
-      std::unique_ptr<FunctionInfo> FuncInfo =
-          llvm::make_unique<FunctionInfo>(FuncOffset);
-      if (foundFuncSummary() && !IsLazy) {
+      assert(!IsLazy && "Lazy summary read only supported for combined index");
+      // Gracefully handle bitcode without a function summary section,
+      // which will simply not populate the index.
+      if (foundFuncSummary()) {
         DenseMap<uint64_t, std::unique_ptr<FunctionSummary>>::iterator SMI =
             SummaryMap.find(ValueID);
         assert(SMI != SummaryMap.end() && "Summary info not found");
+        std::unique_ptr<FunctionInfo> FuncInfo =
+            llvm::make_unique<FunctionInfo>(FuncOffset);
         FuncInfo->setFunctionSummary(std::move(SMI->second));
+        assert(!SourceFileName.empty());
+        TheIndex->addFunctionInfo(
+            Function::getGlobalIdentifier(
+                ValueName, FuncInfo->functionSummary()->getFunctionLinkage(),
+                SourceFileName),
+            std::move(FuncInfo));
       }
-      TheIndex->addFunctionInfo(ValueName, std::move(FuncInfo));

       ValueName.clear();
       break;
     }
     case bitc::VST_CODE_COMBINED_FNENTRY: {
-      // VST_CODE_FNENTRY: [offset, namechar x N]
-      if (convertToString(Record, 1, ValueName))
-        return error("Invalid record");
+      // VST_CODE_COMBINED_FNENTRY: [offset, funcguid]
       uint64_t FuncSummaryOffset = Record[0];
+      uint64_t FuncGUID = Record[1];
       std::unique_ptr<FunctionInfo> FuncInfo =
           llvm::make_unique<FunctionInfo>(FuncSummaryOffset);
       if (foundFuncSummary() && !IsLazy) {
         DenseMap<uint64_t, std::unique_ptr<FunctionSummary>>::iterator SMI =
             SummaryMap.find(FuncSummaryOffset);
         assert(SMI != SummaryMap.end() && "Summary info not found");
         FuncInfo->setFunctionSummary(std::move(SMI->second));
       }
-      TheIndex->addFunctionInfo(ValueName, std::move(FuncInfo));
+      TheIndex->addFunctionInfo(FuncGUID, std::move(FuncInfo));

       ValueName.clear();
       break;
     }
     }
   }
 }

 // Parse just the blocks needed for function index building out of the module.
 // At the end of this routine the function Index is populated with a map
 // from function name to FunctionInfo. The function info contains
 // either the parsed function summary information (when parsing summaries
 // eagerly), or just to the function summary record's offset
 // if parsing lazily (IsLazy).
 std::error_code FunctionIndexBitcodeReader::parseModule() {
   if (Stream.EnterSubBlock(bitc::MODULE_BLOCK_ID))
     return error("Invalid record");

+  SmallVector<uint64_t, 64> Record;
+
   // Read the function index for this module.
   while (1) {
     BitstreamEntry Entry = Stream.advance();

switch (Entry.Kind) {		switch (Entry.Kind) {
case BitstreamEntry::Error:		case BitstreamEntry::Error:
return error("Malformed block");		return error("Malformed block");
case BitstreamEntry::EndBlock:		case BitstreamEntry::EndBlock:
Show All 36 Lines	case BitstreamEntry::SubBlock:
case bitc::MODULE_STRTAB_BLOCK_ID:		case bitc::MODULE_STRTAB_BLOCK_ID:
if (std::error_code EC = parseModuleStringTable())		if (std::error_code EC = parseModuleStringTable())
return EC;		return EC;
break;		break;
}		}
continue;		continue;

case BitstreamEntry::Record:		case BitstreamEntry::Record:
		// Once we find the single record of interest, skip the rest.
		if (!SourceFileName.empty())
Stream.skipRecord(Entry.ID);		Stream.skipRecord(Entry.ID);
		else {
		Record.clear();
		auto BitCode = Stream.readRecord(Entry.ID, Record);
		switch (BitCode) {
		default:
		break; // Default behavior, ignore unknown content.
		/// MODULE_CODE_SOURCE_FILENAME: [namechar x N]
		case bitc::MODULE_CODE_SOURCE_FILENAME:
		SmallString<128> ValueName;
		if (convertToString(Record, 0, ValueName))
		return error("Invalid record");
		SourceFileName = ValueName.c_str();
		break;
		}
		}
continue;		continue;
}		}
}		}
}		}

// Eagerly parse the entire function summary block (i.e. for all functions		// Eagerly parse the entire function summary block (i.e. for all functions
// in the index). This populates the FunctionSummary objects in		// in the index). This populates the FunctionSummary objects in
// the index.		// the index.
▲ Show 20 Lines • Show All 439 Lines • Show Last 20 Lines
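The per-module reader above keys the index by a global identifier built from the function name, its linkage, and the module's source file name. A minimal sketch of that naming step follows, assuming a `:` separator (inferred from the `split(":")` call in FunctionImport.cpp) and using a hypothetical `Linkage` enum and `globalIdentifier` helper rather than the real `Function::getGlobalIdentifier`:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GlobalValue linkage; only the local/non-local
// distinction matters for this sketch.
enum class Linkage { External, Internal };

// Builds the name that is later hashed into the index key. Local functions
// get the source file name prepended so that same-named statics in different
// files do not collide; all other functions keep their name unchanged.
std::string globalIdentifier(const std::string &Name, Linkage L,
                             const std::string &SourceFileName) {
  if (L != Linkage::Internal)
    return Name;
  assert(!SourceFileName.empty() && "local functions need a source file name");
  return SourceFileName + ":" + Name;
}
```

Under these assumptions, a static `f` in `a.c` and a static `f` in `b.c` get the distinct identifiers `a.c:f` and `b.c:f`, while an external `f` stays `f`.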

llvm/trunk/lib/Bitcode/Writer/BitcodeWriter.cpp

Show First 20 Lines • Show All 612 Lines • ▼ Show 20 Lines	static uint64_t WriteValueSymbolTableForwardDecl(const ValueSymbolTable &VST,
Stream.EmitRecordWithAbbrev(VSTOffsetAbbrev, Vals);		Stream.EmitRecordWithAbbrev(VSTOffsetAbbrev, Vals);

// Compute and return the bit offset to the placeholder, which will be		// Compute and return the bit offset to the placeholder, which will be
// patched when the real VST is written. We can simply subtract the 32-bit		// patched when the real VST is written. We can simply subtract the 32-bit
// fixed size from the current bit number to get the location to backpatch.		// fixed size from the current bit number to get the location to backpatch.
return Stream.GetCurrentBitNo() - 32;		return Stream.GetCurrentBitNo() - 32;
}		}

		enum StringEncoding { SE_Char6, SE_Fixed7, SE_Fixed8 };

		/// Determine the encoding to use for the given string name and length.
		static StringEncoding getStringEncoding(const char *Str, unsigned StrLen) {
		bool isChar6 = true;
		for (const char *C = Str, *E = C + StrLen; C != E; ++C) {
		if (isChar6)
		isChar6 = BitCodeAbbrevOp::isChar6(*C);
		if ((unsigned char)*C & 128)
		// don't bother scanning the rest.
		return SE_Fixed8;
		}
		if (isChar6)
		return SE_Char6;
		else
		return SE_Fixed7;
		}

/// Emit top-level description of module, including target triple, inline asm,		/// Emit top-level description of module, including target triple, inline asm,
/// descriptors for global variables, and function prototype info.		/// descriptors for global variables, and function prototype info.
/// Returns the bit offset to backpatch with the location of the real VST.		/// Returns the bit offset to backpatch with the location of the real VST.
static uint64_t WriteModuleInfo(const Module *M, const ValueEnumerator &VE,		static uint64_t WriteModuleInfo(const Module *M, const ValueEnumerator &VE,
BitstreamWriter &Stream) {		BitstreamWriter &Stream) {
// Emit various pieces of data attached to a module.		// Emit various pieces of data attached to a module.
if (!M->getTargetTriple().empty())		if (!M->getTargetTriple().empty())
WriteStringRecord(bitc::MODULE_CODE_TRIPLE, M->getTargetTriple(),		WriteStringRecord(bitc::MODULE_CODE_TRIPLE, M->getTargetTriple(),
▲ Show 20 Lines • Show All 157 Lines • ▼ Show 20 Lines	static uint64_t WriteModuleInfo(const Module *M, const ValueEnumerator &VE,

// Write a record indicating the number of module-level metadata IDs		// Write a record indicating the number of module-level metadata IDs
// This is needed because the ids of metadata are assigned implicitly		// This is needed because the ids of metadata are assigned implicitly
// based on their ordering in the bitcode, with the function-level		// based on their ordering in the bitcode, with the function-level
// metadata ids starting after the module-level metadata ids. For		// metadata ids starting after the module-level metadata ids. For
// function importing where we lazy load the metadata as a postpass,		// function importing where we lazy load the metadata as a postpass,
// we want to avoid parsing the module-level metadata before parsing		// we want to avoid parsing the module-level metadata before parsing
// the imported functions.		// the imported functions.
		{
BitCodeAbbrev *Abbv = new BitCodeAbbrev();		BitCodeAbbrev *Abbv = new BitCodeAbbrev();
Abbv->Add(BitCodeAbbrevOp(bitc::MODULE_CODE_METADATA_VALUES));		Abbv->Add(BitCodeAbbrevOp(bitc::MODULE_CODE_METADATA_VALUES));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
unsigned MDValsAbbrev = Stream.EmitAbbrev(Abbv);		unsigned MDValsAbbrev = Stream.EmitAbbrev(Abbv);
Vals.push_back(VE.numMDs());		Vals.push_back(VE.numMDs());
Stream.EmitRecord(bitc::MODULE_CODE_METADATA_VALUES, Vals, MDValsAbbrev);		Stream.EmitRecord(bitc::MODULE_CODE_METADATA_VALUES, Vals, MDValsAbbrev);
Vals.clear();		Vals.clear();
		}

		// Emit the module's source file name.
		{
		StringEncoding Bits =
		getStringEncoding(M->getName().data(), M->getName().size());
		BitCodeAbbrevOp AbbrevOpToUse = BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 8);
		if (Bits == SE_Char6)
		AbbrevOpToUse = BitCodeAbbrevOp(BitCodeAbbrevOp::Char6);
		else if (Bits == SE_Fixed7)
		AbbrevOpToUse = BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 7);

		// MODULE_CODE_SOURCE_FILENAME: [namechar x N]
		BitCodeAbbrev *Abbv = new BitCodeAbbrev();
		Abbv->Add(BitCodeAbbrevOp(bitc::MODULE_CODE_SOURCE_FILENAME));
		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
		Abbv->Add(AbbrevOpToUse);
		unsigned FilenameAbbrev = Stream.EmitAbbrev(Abbv);

		for (const auto P : M->getSourceFileName())
		Vals.push_back((unsigned char)P);

		// Emit the finished record.
		Stream.EmitRecord(bitc::MODULE_CODE_SOURCE_FILENAME, Vals, FilenameAbbrev);
		Vals.clear();
		}

uint64_t VSTOffsetPlaceholder =		uint64_t VSTOffsetPlaceholder =
WriteValueSymbolTableForwardDecl(M->getValueSymbolTable(), Stream);		WriteValueSymbolTableForwardDecl(M->getValueSymbolTable(), Stream);
return VSTOffsetPlaceholder;		return VSTOffsetPlaceholder;
}		}

static uint64_t GetOptimizationFlags(const Value *V) {		static uint64_t GetOptimizationFlags(const Value *V) {
uint64_t Flags = 0;		uint64_t Flags = 0;
▲ Show 20 Lines • Show All 1,381 Lines • ▼ Show 20 Lines	case Instruction::VAArg:
Vals.push_back(VE.getTypeID(I.getType())); // restype.		Vals.push_back(VE.getTypeID(I.getType())); // restype.
break;		break;
}		}

Stream.EmitRecord(Code, Vals, AbbrevToUse);		Stream.EmitRecord(Code, Vals, AbbrevToUse);
Vals.clear();		Vals.clear();
}		}

enum StringEncoding { SE_Char6, SE_Fixed7, SE_Fixed8 };

/// Determine the encoding to use for the given string name and length.
static StringEncoding getStringEncoding(const char *Str, unsigned StrLen) {
bool isChar6 = true;
for (const char *C = Str, *E = C + StrLen; C != E; ++C) {
if (isChar6)
isChar6 = BitCodeAbbrevOp::isChar6(*C);
if ((unsigned char)*C & 128)
// don't bother scanning the rest.
return SE_Fixed8;
}
if (isChar6)
return SE_Char6;
else
return SE_Fixed7;
}

/// Emit names for globals/functions etc. The VSTOffsetPlaceholder,		/// Emit names for globals/functions etc. The VSTOffsetPlaceholder,
/// BitcodeStartBit and FunctionIndex are only passed for the module-level		/// BitcodeStartBit and FunctionIndex are only passed for the module-level
/// VST, where we are including a function bitcode index and need to		/// VST, where we are including a function bitcode index and need to
/// backpatch the VST forward declaration record.		/// backpatch the VST forward declaration record.
static void WriteValueSymbolTable(		static void WriteValueSymbolTable(
const ValueSymbolTable &VST, const ValueEnumerator &VE,		const ValueSymbolTable &VST, const ValueEnumerator &VE,
BitstreamWriter &Stream, uint64_t VSTOffsetPlaceholder = 0,		BitstreamWriter &Stream, uint64_t VSTOffsetPlaceholder = 0,
uint64_t BitcodeStartBit = 0,		uint64_t BitcodeStartBit = 0,
▲ Show 20 Lines • Show All 123 Lines • ▼ Show 20 Lines
}		}

/// Emit function names and summary offsets for the combined index		/// Emit function names and summary offsets for the combined index
/// used by ThinLTO.		/// used by ThinLTO.
static void WriteCombinedValueSymbolTable(const FunctionInfoIndex &Index,		static void WriteCombinedValueSymbolTable(const FunctionInfoIndex &Index,
BitstreamWriter &Stream) {		BitstreamWriter &Stream) {
Stream.EnterSubblock(bitc::VALUE_SYMTAB_BLOCK_ID, 4);		Stream.EnterSubblock(bitc::VALUE_SYMTAB_BLOCK_ID, 4);

// 8-bit fixed-width VST_CODE_COMBINED_FNENTRY function strings.
BitCodeAbbrev *Abbv = new BitCodeAbbrev();		BitCodeAbbrev *Abbv = new BitCodeAbbrev();
Abbv->Add(BitCodeAbbrevOp(bitc::VST_CODE_COMBINED_FNENTRY));		Abbv->Add(BitCodeAbbrevOp(bitc::VST_CODE_COMBINED_FNENTRY));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcoffset		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcsumoffset
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcguid
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 8));		unsigned FnEntryAbbrev = Stream.EmitAbbrev(Abbv);
unsigned FnEntry8BitAbbrev = Stream.EmitAbbrev(Abbv);

// 7-bit fixed width VST_CODE_COMBINED_FNENTRY function strings.		SmallVector<uint64_t, 64> NameVals;
Abbv = new BitCodeAbbrev();
Abbv->Add(BitCodeAbbrevOp(bitc::VST_CODE_COMBINED_FNENTRY));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcoffset
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 7));
unsigned FnEntry7BitAbbrev = Stream.EmitAbbrev(Abbv);

// 6-bit char6 VST_CODE_COMBINED_FNENTRY function strings.
Abbv = new BitCodeAbbrev();
Abbv->Add(BitCodeAbbrevOp(bitc::VST_CODE_COMBINED_FNENTRY));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // funcoffset
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Char6));
unsigned FnEntry6BitAbbrev = Stream.EmitAbbrev(Abbv);

// FIXME: We know if the type names can use 7-bit ascii.
SmallVector<unsigned, 64> NameVals;

for (const auto &FII : Index) {		for (const auto &FII : Index) {
for (const auto &FI : FII.getValue()) {		for (const auto &FI : FII.second) {
NameVals.push_back(FI->bitcodeIndex());		NameVals.push_back(FI->bitcodeIndex());

StringRef FuncName = FII.first();		uint64_t FuncGUID = FII.first;

// Figure out the encoding to use for the name.		// VST_CODE_COMBINED_FNENTRY: [funcsumoffset, funcguid]
StringEncoding Bits = getStringEncoding(FuncName.data(), FuncName.size());		unsigned AbbrevToUse = FnEntryAbbrev;

// VST_CODE_COMBINED_FNENTRY: [funcsumoffset, namechar x N]		NameVals.push_back(FuncGUID);
unsigned AbbrevToUse = FnEntry8BitAbbrev;
if (Bits == SE_Char6)
AbbrevToUse = FnEntry6BitAbbrev;
else if (Bits == SE_Fixed7)
AbbrevToUse = FnEntry7BitAbbrev;

for (const auto P : FuncName)
NameVals.push_back((unsigned char)P);

// Emit the finished record.		// Emit the finished record.
Stream.EmitRecord(bitc::VST_CODE_COMBINED_FNENTRY, NameVals, AbbrevToUse);		Stream.EmitRecord(bitc::VST_CODE_COMBINED_FNENTRY, NameVals, AbbrevToUse);
NameVals.clear();		NameVals.clear();
}		}
}		}
Stream.ExitBlock();		Stream.ExitBlock();
}		}
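The new abbreviation stores the GUID as a single VBR-8 operand instead of an array of name characters. A rough sketch of the cost, assuming the usual bitstream VBR layout (each N-bit chunk carries N-1 payload bits plus a continuation flag); `vbrBits` is a hypothetical helper, not part of the writer:

```cpp
#include <cassert>
#include <cstdint>

// Number of bits a value occupies when written as VBR-N: each N-bit chunk
// carries N-1 payload bits, and its top bit flags that another chunk follows.
unsigned vbrBits(uint64_t Value, unsigned Width) {
  assert(Width >= 2 && Width <= 32 && "chunk needs at least one payload bit");
  const unsigned PayloadBits = Width - 1;
  unsigned Chunks = 1; // a value of zero still takes one chunk
  while ((Value >>= PayloadBits) != 0)
    ++Chunks;
  return Chunks * Width;
}
```

With this layout a full 64-bit GUID takes at most 10 chunks (80 bits) in VBR-8 regardless of how long the function name is, whereas the old records spent 6 to 8 bits per name character.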
▲ Show 20 Lines • Show All 442 Lines • ▼ Show 20 Lines	static void WriteCombinedFunctionSummary(const FunctionInfoIndex &I,
Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_COMBINED_ENTRY));		Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_COMBINED_ENTRY));
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount		Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount
unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);		unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);

SmallVector<unsigned, 64> NameVals;		SmallVector<unsigned, 64> NameVals;
for (const auto &FII : I) {		for (const auto &FII : I) {
for (auto &FI : FII.getValue()) {		for (auto &FI : FII.second) {
FunctionSummary *FS = FI->functionSummary();		FunctionSummary *FS = FI->functionSummary();
assert(FS);		assert(FS);

NameVals.push_back(I.getModuleId(FS->modulePath()));		NameVals.push_back(I.getModuleId(FS->modulePath()));
NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));		NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));
NameVals.push_back(FS->instCount());		NameVals.push_back(FS->instCount());

// Record the starting offset of this summary entry for use		// Record the starting offset of this summary entry for use
▲ Show 20 Lines • Show All 252 Lines • Show Last 20 Lines
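The source file name record picks the narrowest character encoding that can represent the string. The selection logic of `getStringEncoding` can be sketched in isolation as below; this is a standalone approximation that takes a `std::string` and uses a local `isChar6` (Char6 covers `[a-zA-Z0-9._]`) instead of `BitCodeAbbrevOp::isChar6`:

```cpp
#include <cassert>
#include <string>

enum class StringEncoding { Char6, Fixed7, Fixed8 };

// Char6 covers exactly [a-zA-Z0-9._], so each character fits in 6 bits.
bool isChar6(char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') ||
         (C >= '0' && C <= '9') || C == '.' || C == '_';
}

// Pick the narrowest encoding that can represent every character of Str.
StringEncoding getStringEncoding(const std::string &Str) {
  bool AllChar6 = true;
  for (char C : Str) {
    if (static_cast<unsigned char>(C) & 0x80)
      return StringEncoding::Fixed8; // high bit set: needs all 8 bits
    if (AllChar6)
      AllChar6 = isChar6(C);
  }
  return AllChar6 ? StringEncoding::Char6 : StringEncoding::Fixed7;
}
```

A bare file name such as `source_filename.c` would qualify for Char6, while a path containing `/` falls back to Fixed7. The sketch assumes the encoding is chosen from the same string whose characters are then emitted; choosing it from a different string could select too narrow a width.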

llvm/trunk/lib/IR/FunctionInfo.cpp

Show All 17 Lines

// Create the combined function index/summary from multiple		// Create the combined function index/summary from multiple
// per-module instances.		// per-module instances.
void FunctionInfoIndex::mergeFrom(std::unique_ptr<FunctionInfoIndex> Other,		void FunctionInfoIndex::mergeFrom(std::unique_ptr<FunctionInfoIndex> Other,
uint64_t NextModuleId) {		uint64_t NextModuleId) {

StringRef ModPath;		StringRef ModPath;
for (auto &OtherFuncInfoLists : *Other) {		for (auto &OtherFuncInfoLists : *Other) {
std::string FuncName = OtherFuncInfoLists.getKey();		uint64_t FuncGUID = OtherFuncInfoLists.first;
FunctionInfoList &List = OtherFuncInfoLists.second;		FunctionInfoList &List = OtherFuncInfoLists.second;

// Assert that the func info list only has one entry, since we shouldn't		// Assert that the func info list only has one entry, since we shouldn't
// have duplicate names within a single per-module index.		// have duplicate names within a single per-module index.
assert(List.size() == 1);		assert(List.size() == 1);
std::unique_ptr<FunctionInfo> Info = std::move(List.front());		std::unique_ptr<FunctionInfo> Info = std::move(List.front());

// Skip if there was no function summary section.		// Skip if there was no function summary section.
Show All 9 Lines	else
assert(ModPath == Info->functionSummary()->modulePath() &&		assert(ModPath == Info->functionSummary()->modulePath() &&
"Each module in the combined map should have a unique ID");		"Each module in the combined map should have a unique ID");

// Note the module path string ref was copied above and is still owned by		// Note the module path string ref was copied above and is still owned by
// the original per-module index. Reset it to the new module path		// the original per-module index. Reset it to the new module path
// string reference owned by the combined index.		// string reference owned by the combined index.
Info->functionSummary()->setModulePath(ModPath);		Info->functionSummary()->setModulePath(ModPath);

// If it is a local function, rename it.
if (GlobalValue::isLocalLinkage(
Info->functionSummary()->getFunctionLinkage())) {
// Any local functions are virtually renamed when being added to the
// combined index map, to disambiguate from other functions with
// the same name. The symbol table created for the combined index
// file should contain the renamed symbols.
FuncName =
FunctionInfoIndex::getGlobalNameForLocal(FuncName, NextModuleId);
}

// Add new function info to existing list. There may be duplicates when		// Add new function info to existing list. There may be duplicates when
// combining FunctionMap entries, due to COMDAT functions. Any local		// combining FunctionMap entries, due to COMDAT functions. Any local
// functions were virtually renamed above.		// functions were given unique global IDs.
addFunctionInfo(FuncName, std::move(Info));		addFunctionInfo(FuncGUID, std::move(Info));
}		}
}		}
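With GUID keys, merging no longer needs the rename step for local functions: the key is already unique across modules. A simplified model of the merge, using hypothetical `Info` and `Index` types and a `std::map` in place of the real `FunctionInfoIndex` container:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, pared-down stand-ins for FunctionInfo / FunctionInfoIndex.
struct Info {
  std::string ModulePath;
};
using InfoList = std::vector<std::unique_ptr<Info>>;

class Index {
  std::map<uint64_t, InfoList> Map; // keyed by GUID, not by name

public:
  void add(uint64_t GUID, std::unique_ptr<Info> I) {
    Map[GUID].push_back(std::move(I));
  }

  // Move every entry of a per-module index into this combined index.
  void mergeFrom(Index Other) {
    for (auto &Entry : Other.Map) {
      assert(Entry.second.size() == 1 && "per-module index has unique GUIDs");
      add(Entry.first, std::move(Entry.second.front()));
    }
  }

  // Number of summaries recorded under a GUID (more than 1 for COMDATs).
  std::size_t count(uint64_t GUID) const {
    auto It = Map.find(GUID);
    return It == Map.end() ? 0 : It->second.size();
  }
};
```

In this model, two modules that both define the same COMDAT function contribute two entries under one GUID, which matches the "duplicates" case the comment above describes.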

llvm/trunk/lib/IR/Module.cpp

	Show First 20 Lines • Show All 41 Lines • ▼ Show 20 Lines
	template class llvm::SymbolTableListTraits<GlobalVariable>;			template class llvm::SymbolTableListTraits<GlobalVariable>;
	template class llvm::SymbolTableListTraits<GlobalAlias>;			template class llvm::SymbolTableListTraits<GlobalAlias>;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Primitive Module methods.			// Primitive Module methods.
	//			//

	Module::Module(StringRef MID, LLVMContext &C)			Module::Module(StringRef MID, LLVMContext &C)
	: Context(C), Materializer(), ModuleID(MID), DL("") {			: Context(C), Materializer(), ModuleID(MID), SourceFileName(MID), DL("") {
	ValSymTab = new ValueSymbolTable();			ValSymTab = new ValueSymbolTable();
	NamedMDSymTab = new StringMap<NamedMDNode *>();			NamedMDSymTab = new StringMap<NamedMDNode *>();
	Context.addModule(this);			Context.addModule(this);
	}			}

	Module::~Module() {			Module::~Module() {
	Context.removeModule(this);			Context.removeModule(this);
	dropAllReferences();			dropAllReferences();
	▲ Show 20 Lines • Show All 429 Lines • Show Last 20 Lines

llvm/trunk/lib/Transforms/IPO/FunctionImport.cpp

Show First 20 Lines • Show All 120 Lines • ▼ Show 20 Lines	for (auto &I : BB) {
continue;		continue;
}		}
auto ImportedName = CalledFunction->getName();		auto ImportedName = CalledFunction->getName();
auto Renamed = (ImportedName + Suffix).str();		auto Renamed = (ImportedName + Suffix).str();
// Rename internal functions		// Rename internal functions
if (CalledFunction->hasInternalLinkage()) {		if (CalledFunction->hasInternalLinkage()) {
ImportedName = Renamed;		ImportedName = Renamed;
}		}
auto It = CalledFunctions.insert(ImportedName);		// Compute the global identifier used in the function index.
		auto CalledFunctionGlobalID = Function::getGlobalIdentifier(
		CalledFunction->getName(), CalledFunction->getLinkage(),
		CalledFunction->getParent()->getSourceFileName());
		auto It = CalledFunctions.insert(CalledFunctionGlobalID);
if (!It.second) {		if (!It.second) {
// This is a call to a function we already considered, skip.		// This is a call to a function we already considered, skip.
continue;		continue;
}		}
// Ignore functions already present in the destination module		// Ignore functions already present in the destination module
auto *SrcGV = DestModule.getNamedValue(ImportedName);		auto *SrcGV = DestModule.getNamedValue(ImportedName);
if (SrcGV) {		if (SrcGV) {
if (GlobalAlias *SGA = dyn_cast<GlobalAlias>(SrcGV))		if (GlobalAlias *SGA = dyn_cast<GlobalAlias>(SrcGV))
▲ Show 20 Lines • Show All 70 Lines • ▼ Show 20 Lines	DEBUG(dbgs() << DestModule.getModuleIdentifier() << ": Importing "
<< CalledFunctionName << " from " << ModuleIdentifier << "\n");		<< CalledFunctionName << " from " << ModuleIdentifier << "\n");

auto &SrcModule = ModuleLoaderCache(ModuleIdentifier);		auto &SrcModule = ModuleLoaderCache(ModuleIdentifier);

// The function that we will import!		// The function that we will import!
GlobalValue *SGV = SrcModule.getNamedValue(CalledFunctionName);		GlobalValue *SGV = SrcModule.getNamedValue(CalledFunctionName);

if (!SGV) {		if (!SGV) {
// The destination module is referencing function using their renamed name		// The function is referenced by a global identifier, which has the
// when importing a function that was originally local in the source		// source file name prepended for functions that were originally local
// module. The source module we have might not have been renamed so we try		// in the source module. Strip any prepended name to recover the original
// to remove the suffix added during the renaming to recover the original
// name in the source module.		// name in the source module.
std::pair<StringRef, StringRef> Split =		std::pair<StringRef, StringRef> Split = CalledFunctionName.split(":");
CalledFunctionName.split(".llvm.");		SGV = SrcModule.getNamedValue(Split.second);
SGV = SrcModule.getNamedValue(Split.first);
assert(SGV && "Can't find function to import in source module");		assert(SGV && "Can't find function to import in source module");
}		}
if (!SGV) {		if (!SGV) {
report_fatal_error(Twine("Can't load function '") + CalledFunctionName +		report_fatal_error(Twine("Can't load function '") + CalledFunctionName +
"' in Module '" + SrcModule.getModuleIdentifier() +		"' in Module '" + SrcModule.getModuleIdentifier() +
"', error in the summary?\n");		"', error in the summary?\n");
}		}

▲ Show 20 Lines • Show All 195 Lines • Show Last 20 Lines
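The fallback lookup above recovers the original name by splitting the global identifier at `:` and taking the second half. A small sketch of that split, written against `std::string` with a hypothetical `splitFirst` helper that mirrors `StringRef::split` semantics (cut at the first separator; return the whole string and an empty remainder if there is none):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Cut Str at the first occurrence of Sep. If Sep is absent, the result is
// (Str, ""), matching the behavior of StringRef::split.
std::pair<std::string, std::string> splitFirst(const std::string &Str,
                                               char Sep) {
  std::string::size_type Pos = Str.find(Sep);
  if (Pos == std::string::npos)
    return {Str, ""};
  return {Str.substr(0, Pos), Str.substr(Pos + 1)};
}
```

For `a.c:foo` this yields `foo` as the second element. Because the cut happens at the first `:`, a source file name that itself contains a colon would leave part of the path in the second element under this scheme.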

llvm/trunk/test/Bitcode/Inputs/source-filename.bc

This is a binary file.

llvm/trunk/test/Bitcode/source-filename.test

				; RUN: llvm-bcanalyzer -dump %p/Inputs/source-filename.bc | FileCheck %s
				; CHECK: <SOURCE_FILENAME {{.*}} record string = 'source-filename.c'

llvm/trunk/test/tools/gold/X86/thinlto.ll

	Show All 18 Lines
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}/test/tools/gold/X86/Output/thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: </MODULE_STRTAB_BLOCK			; COMBINED-NEXT: </MODULE_STRTAB_BLOCK
	; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_ENTRY
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_ENTRY
	; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <VALUE_SYMTAB			; COMBINED-NEXT: <VALUE_SYMTAB
	; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} record string = '{{f|g}}'			; Check that the format is: op0=offset, op1=funcguid, where funcguid is
	; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} record string = '{{f|g}}'			; the lower 64 bits of the function name MD5.
				; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}
				; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}
	; COMBINED-NEXT: </VALUE_SYMTAB			; COMBINED-NEXT: </VALUE_SYMTAB

	define void @f() {			define void @f() {
	entry:			entry:
	ret void			ret void
	}			}

llvm/trunk/test/tools/llvm-lto/thinlto.ll

	; Test combined function index generation for ThinLTO via llvm-lto.			; Test combined function index generation for ThinLTO via llvm-lto.
	; RUN: llvm-as -function-summary %s -o %t.o			; RUN: llvm-as -function-summary %s -o %t.o
	; RUN: llvm-as -function-summary %p/Inputs/thinlto.ll -o %t2.o			; RUN: llvm-as -function-summary %p/Inputs/thinlto.ll -o %t2.o
	; RUN: llvm-lto -thinlto -o %t3 %t.o %t2.o			; RUN: llvm-lto -thinlto -o %t3 %t.o %t2.o
	; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED			; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED
	; RUN: not test -e %t3			; RUN: not test -e %t3

	; COMBINED: <MODULE_STRTAB_BLOCK			; COMBINED: <MODULE_STRTAB_BLOCK
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'			; COMBINED-NEXT: <ENTRY {{.*}} record string = '{{.*}}thinlto.ll.tmp{{.*}}.o'
	; COMBINED-NEXT: </MODULE_STRTAB_BLOCK			; COMBINED-NEXT: </MODULE_STRTAB_BLOCK
	; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_ENTRY
	; COMBINED-NEXT: <COMBINED_ENTRY			; COMBINED-NEXT: <COMBINED_ENTRY
	; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK			; COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK
	; COMBINED-NEXT: <VALUE_SYMTAB			; COMBINED-NEXT: <VALUE_SYMTAB
	; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} record string = '{{f|g}}'			; Check that the format is: op0=offset, op1=funcguid, where funcguid is
	; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} record string = '{{f|g}}'			; the lower 64 bits of the function name MD5.
				; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}
				; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}
	; COMBINED-NEXT: </VALUE_SYMTAB			; COMBINED-NEXT: </VALUE_SYMTAB

	define void @f() {			define void @f() {
	entry:			entry:
	ret void			ret void
	}			}
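The `op1` values in the checks above are negative even though the GUID is a `uint64_t`. That is consistent with the dump printing each operand as a signed 64-bit integer, which is an assumption about the tool's output formatting rather than something this diff shows. A short sketch of the reinterpretation, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the 64 bits of a GUID as a signed value, which is how a dump
// that prints operands as int64_t would show it.
int64_t asSigned(uint64_t GUID) {
  int64_t S;
  std::memcpy(&S, &GUID, sizeof S); // bit-for-bit copy, no value conversion
  return S;
}

// Recover the unsigned GUID from the printed signed value. Conversion to an
// unsigned type is defined as reduction modulo 2^64.
uint64_t asUnsigned(int64_t Printed) { return static_cast<uint64_t>(Printed); }
```

Any GUID with its top bit set shows up as a negative number in such a dump; converting back with `asUnsigned` gives the key actually stored in the index.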

llvm/trunk/tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp

Show First 20 Lines • Show All 167 Lines • ▼ Show 20 Lines	default: return nullptr;
STRINGIFY_CODE(MODULE_CODE, DEPLIB) // FIXME: Remove in 4.0		STRINGIFY_CODE(MODULE_CODE, DEPLIB) // FIXME: Remove in 4.0
STRINGIFY_CODE(MODULE_CODE, GLOBALVAR)		STRINGIFY_CODE(MODULE_CODE, GLOBALVAR)
STRINGIFY_CODE(MODULE_CODE, FUNCTION)		STRINGIFY_CODE(MODULE_CODE, FUNCTION)
STRINGIFY_CODE(MODULE_CODE, ALIAS)		STRINGIFY_CODE(MODULE_CODE, ALIAS)
STRINGIFY_CODE(MODULE_CODE, PURGEVALS)		STRINGIFY_CODE(MODULE_CODE, PURGEVALS)
STRINGIFY_CODE(MODULE_CODE, GCNAME)		STRINGIFY_CODE(MODULE_CODE, GCNAME)
STRINGIFY_CODE(MODULE_CODE, VSTOFFSET)		STRINGIFY_CODE(MODULE_CODE, VSTOFFSET)
STRINGIFY_CODE(MODULE_CODE, METADATA_VALUES)		STRINGIFY_CODE(MODULE_CODE, METADATA_VALUES)
		STRINGIFY_CODE(MODULE_CODE, SOURCE_FILENAME)
}		}
case bitc::IDENTIFICATION_BLOCK_ID:		case bitc::IDENTIFICATION_BLOCK_ID:
switch (CodeID) {		switch (CodeID) {
default:		default:
return nullptr;		return nullptr;
STRINGIFY_CODE(IDENTIFICATION_CODE, STRING)		STRINGIFY_CODE(IDENTIFICATION_CODE, STRING)
STRINGIFY_CODE(IDENTIFICATION_CODE, EPOCH)		STRINGIFY_CODE(IDENTIFICATION_CODE, EPOCH)
}		}
▲ Show 20 Lines • Show All 629 Lines • Show Last 20 Lines

Revision Contents

Diff 47488
