This is an archive of the discontinued LLVM Phabricator instance.

Enable interprocedural optimization in libquantum - LLVM-part [WIP]
Needs Review · Public

Authored by grosser on Oct 5 2017, 7:44 AM.

Details

Reviewers

Summary

This is a work-in-progress patch to enable interprocedural optimization of
libquantum with Polly. It is not yet intended for submission, but illustrates
some of the pass-pipeline changes needed to get end-to-end interprocedural loop
fusion working. Some of the choices we make could potentially be improved using
interprocedural scop modeling, but the LLVM inliner seems to be pretty close
to getting things right.
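
To make the goal concrete, here is a minimal source-level sketch of why inlining has to happen before loop fusion. The function names (flipBit, setBit, applyUnfused, applyFused) are hypothetical stand-ins and not libquantum's real routines; the fused version is written by hand to show the shape a loop optimizer could aim for once both callees are visible in one function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for two small gate routines. Each one walks the
// whole state vector once. These are not libquantum's real functions.
void flipBit(std::vector<unsigned> &State, unsigned Mask) {
  for (std::size_t I = 0; I < State.size(); ++I)
    State[I] ^= Mask;
}

void setBit(std::vector<unsigned> &State, unsigned Mask) {
  for (std::size_t I = 0; I < State.size(); ++I)
    State[I] |= Mask;
}

// Before inlining: two calls, hence two separate traversals of State.
void applyUnfused(std::vector<unsigned> &State) {
  flipBit(State, 1u);
  setBit(State, 4u);
}

// After inlining both callees, the two loops sit next to each other in one
// function and can be fused into a single traversal (done by hand here).
void applyFused(std::vector<unsigned> &State) {
  for (std::size_t I = 0; I < State.size(); ++I) {
    State[I] ^= 1u;
    State[I] |= 4u;
  }
}
```

Both variants compute the same result; the fused one touches each element once instead of twice, which is the kind of saving the pipeline changes in this patch are trying to expose.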

The core transformation in Polly that enables this optimization is the sparse
representation of a scop model, which I collaborated on with Jan Sjoedermann
(a student of David Chisnall) in the context of array bounds checking and then
later with Johannes Doerfert in the context of LBM and libquantum.

Allow partial inlining of vararg functions
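
As a rough source-level illustration of what partially inlining a vararg function means, the sketch below does the transformation by hand. All names (Recording, Recorded, record, recordCold, callSite) are hypothetical and chosen only for this example; it is not the code the partial inliner generates.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical globals and names, used only for illustration.
bool Recording = false;
int Recorded = 0;

// Original shape: a cheap early exit followed by a body that needs va_start.
void record(int Count, ...) {
  if (!Recording)
    return;
  va_list Args;
  va_start(Args, Count);
  for (int I = 0; I < Count; ++I)
    Recorded += va_arg(Args, int);
  va_end(Args);
}

// Outlined cold part: it still calls va_start, so it has to stay variadic.
void recordCold(int Count, ...) {
  va_list Args;
  va_start(Args, Count);
  for (int I = 0; I < Count; ++I)
    Recorded += va_arg(Args, int);
  va_end(Args);
}

// Call site after partial inlining by hand: the cheap check is inlined and
// the variadic arguments are forwarded unchanged to the outlined part.
void callSite() {
  if (Recording)
    recordCold(2, 3, 4);
}
```

When Recording is false, callSite() pays only for the inlined check and never sets up a variadic call, which is the benefit partial inlining is after.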

Make gold-plugin compile for me

Add polly support to gold plugin

Disallow remainders in loop unroller

The loop unroller is run in the per-TU compilations (even in LTO mode) and
blows up code size without reason. In LTO mode, per-TU compilations should
canonicalize, not specialize.
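
For readers unfamiliar with the term, the following sketch shows by hand what a loop with a remainder looks like after unrolling by four. The function names are hypothetical, and the unroll factor is an assumption of this example rather than something the unroller is known to pick here.

```cpp
#include <cassert>
#include <cstddef>

// Straightforward loop: the canonical form that later passes analyze easily.
int sumPlain(const int *Data, std::size_t N) {
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += Data[I];
  return Sum;
}

// Hand-unrolled by four. Because N need not be a multiple of four, a second
// "remainder" loop handles the leftover iterations.
int sumUnrolledWithRemainder(const int *Data, std::size_t N) {
  int Sum = 0;
  std::size_t I = 0;
  for (; I + 4 <= N; I += 4)
    Sum += Data[I] + Data[I + 1] + Data[I + 2] + Data[I + 3];
  for (; I < N; ++I) // remainder loop
    Sum += Data[I];
  return Sum;
}
```

One source loop has become two, with a larger body. That extra code and the more complex loop structure are what the commit message objects to when they are introduced before the LTO phase.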

Adjust Pass manager for LTO+Polly+libquantum

Enable partial inlining

Add partial inlining to LTO pipeline

Add Polly to LTO pipeline

Disable Polly in per-TU pipeline

Inliner: disable single-callsite static bonus

In libquantum the quantum*_ft functions should not be inlined, as this inlining
prevents the inlining of the non-ft functions, which is needed to enable loop
fusion with Polly.

As the fault-tolerant versions of the libquantum functions are rather large, the
LLVM inliner would not inline them by default. However, in certain cases the
fact that the _ft functions are only called once causes the single-call-site
bonus to be applied, which allows LLVM to inline the _ft functions and as a
result prevents later inlining of the non-ft functions and consequently prevents
later loop fusion. Interestingly, LLVM already checks in shouldBeDeferred if
inlining of a leaf function prevents other possibly more beneficial inlining
opportunities such as inlining of the non-ft functions. While shouldBeDeferred
seems to work in general, the single-callsite bonus being applied drops the
overall CandidateCost below zero, such that shouldBeDeferred is effectively
useless.

It seems that we should evaluate "shouldBeDeferred" without the single
callsite bonus being applied and only add the single callsite bonus after
shouldBeDeferred has been evaluated. For now just disable the single callsite
bonus.
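
The ordering problem can be sketched with a toy model. This is a deliberately simplified stand-in, not LLVM's shouldBeDeferred: the helper names, the comparison, and all numbers (including the bonus value) are assumptions made for illustration only.

```cpp
#include <cassert>

// Illustrative constant; treat the value as an assumption of this sketch.
constexpr int LastCallBonus = 15000;

// Toy deferral test: inlining the candidate into its caller blocks an outer
// inline when the budget still left at the outer call site (OuterCostDelta)
// does not exceed the candidate's cost.
bool blocksOuterInline(int CandidateCost, int OuterCostDelta) {
  return OuterCostDelta <= CandidateCost;
}

// Order described as problematic above: the bonus is already folded into the
// cost that the deferral test sees, so that cost is far below zero.
bool deferWithBonusFirst(int RawCost, int OuterCostDelta) {
  return blocksOuterInline(RawCost - LastCallBonus, OuterCostDelta);
}

// Proposed order: run the deferral test on the raw cost; the bonus would
// only be applied afterwards.
bool deferWithBonusLast(int RawCost, int OuterCostDelta) {
  return blocksOuterInline(RawCost, OuterCostDelta);
}
```

With a hypothetical raw cost of 400 and an outer budget of 150, deferWithBonusFirst never defers because the compared cost is negative, while deferWithBonusLast does defer. That is the behaviour the summary argues for.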

Diff Detail

Build Status

Buildable 10877
Build 10877: arc lint + arc unit

Event Timeline

grosser created this revision. Oct 5 2017, 7:44 AM

Herald added a reviewer: bollu. Oct 5 2017, 7:44 AM

Herald added subscribers: eraman, mehdi_amini, mgorny.

fhahn added a subscriber: fhahn. Oct 5 2017, 9:40 AM

Florian Hahn, feel free to pick this one up.

fhahn mentioned this in D39607: [PartialInliner] Inline vararg functions that forward varargs. Nov 3 2017, 10:43 AM

grosser added inline comments. Nov 16 2017, 9:39 AM

lib/Analysis/InlineCost.cpp
844	Hi Florian, are you interested in upstreaming this hack as well? AFAIU we should subtract this LastCallBonus only after shouldBeDeferred has been called as otherwise shouldBeDeferred might not have any effect?

fhahn added inline comments. Nov 21 2017, 8:05 AM

lib/Analysis/InlineCost.cpp
844	Sure, I'll look into it

Hi Florian,

any update on the partial inliner changes?

Hi Tobias,

I've committed the vararg support a while ago. There is a patch under review to enable partial inlining by default D40477, but it needs more benchmarking I think. I'll look into the LastCallBonus stuff soon!

Cheers,
Florian

Revision Contents

Path

Size

include/

llvm/

Transforms/

Utils/

Cloning.h

8 lines

CodeExtractor.h

7 lines

lib/

Analysis/

InlineCost.cpp

4 lines

Transforms/

IPO/

PartialInlining.cpp

78 lines

PassManagerBuilder.cpp

21 lines

Scalar/

LoopUnrollPass.cpp

6 lines

Utils/

CloneFunction.cpp

14 lines

CodeExtractor.cpp

14 lines

test/

Transforms/

CodeExtractor/

vararg-multi-reference.ll

51 lines

vararg-outlining-aborted.ll

30 lines

vararg.ll

41 lines

tools/

gold/

CMakeLists.txt

4 lines

gold-plugin.cpp

20 lines

Diff 117819

include/llvm/Transforms/Utils/Cloning.h

	/// the resultant function. The VMap is updated to include mappings from all of			/// the resultant function. The VMap is updated to include mappings from all of
	/// the instructions and basicblocks in the function from their old to new			/// the instructions and basicblocks in the function from their old to new
	/// values. The final argument captures information about the cloned code if			/// values. The final argument captures information about the cloned code if
	/// non-null.			/// non-null.
	///			///
	/// VMap contains no non-identity GlobalValue mappings and debug info metadata			/// VMap contains no non-identity GlobalValue mappings and debug info metadata
	/// will not be cloned.			/// will not be cloned.
	///			///
				/// In case VarargTypes is set, the cloned function is not code generated as
				/// vararg function, but instead a fixed set of arguments is provided that
				/// replace the variable set of arguments normally expected. This temporary set
				/// of arguments helps the partial inliner to later inline the cloned function at
				/// a specific callsite.
	Function *CloneFunction(Function *F, ValueToValueMapTy &VMap,			Function *CloneFunction(Function *F, ValueToValueMapTy &VMap,
	ClonedCodeInfo *CodeInfo = nullptr);			ClonedCodeInfo *CodeInfo = nullptr,
				std::vector<Type*> *VarargTypes = nullptr);

	/// Clone OldFunc into NewFunc, transforming the old arguments into references			/// Clone OldFunc into NewFunc, transforming the old arguments into references
	/// to VMap values. Note that if NewFunc already has basic blocks, the ones			/// to VMap values. Note that if NewFunc already has basic blocks, the ones
	/// cloned into it will be added to the end of the function. This function			/// cloned into it will be added to the end of the function. This function
	/// fills in a list of return instructions, and can optionally remap types			/// fills in a list of return instructions, and can optionally remap types
	/// and/or append the specified suffix to all values cloned.			/// and/or append the specified suffix to all values cloned.
	///			///
	/// If ModuleLevelChanges is false, VMap contains no non-identity GlobalValue			/// If ModuleLevelChanges is false, VMap contains no non-identity GlobalValue

include/llvm/Transforms/Utils/CodeExtractor.h

Show First 20 Lines • Show All 80 Lines • ▼ Show 20 Lines	public:
/// Behaves just like the generic code sequence constructor, but uses the		/// Behaves just like the generic code sequence constructor, but uses the
/// block sequence of the loop.		/// block sequence of the loop.
CodeExtractor(DominatorTree &DT, Loop &L, bool AggregateArgs = false,		CodeExtractor(DominatorTree &DT, Loop &L, bool AggregateArgs = false,
BlockFrequencyInfo *BFI = nullptr,		BlockFrequencyInfo *BFI = nullptr,
BranchProbabilityInfo *BPI = nullptr);		BranchProbabilityInfo *BPI = nullptr);

/// \brief Perform the extraction, returning the new function.		/// \brief Perform the extraction, returning the new function.
///		///
		/// @param VarArg Extract the given function as vararg function.
		///
/// Returns zero when called on a CodeExtractor instance where isEligible		/// Returns zero when called on a CodeExtractor instance where isEligible
/// returns false.		/// returns false.
Function *extractCodeRegion();		Function *extractCodeRegion(bool VarArg = false);

/// \brief Test whether this code extractor is eligible.		/// \brief Test whether this code extractor is eligible.
///		///
/// Based on the blocks used when constructing the code extractor,		/// Based on the blocks used when constructing the code extractor,
/// determine whether it is eligible for extraction.		/// determine whether it is eligible for extraction.
bool isEligible() const { return !Blocks.empty(); }		bool isEligible() const { return !Blocks.empty(); }

/// \brief Compute the set of input values and output values for the code.		/// \brief Compute the set of input values and output values for the code.
Show All 36 Lines	template <typename T> class ArrayRef;
private:		private:
void severSplitPHINodes(BasicBlock *&Header);		void severSplitPHINodes(BasicBlock *&Header);
void splitReturnBlocks();		void splitReturnBlocks();

Function *constructFunction(const ValueSet &inputs,		Function *constructFunction(const ValueSet &inputs,
const ValueSet &outputs,		const ValueSet &outputs,
BasicBlock *header,		BasicBlock *header,
BasicBlock *newRootNode, BasicBlock *newHeader,		BasicBlock *newRootNode, BasicBlock *newHeader,
Function *oldFunction, Module *M);		Function *oldFunction, Module *M,
		bool VarArg);

void moveCodeToFunction(Function *newFunction);		void moveCodeToFunction(Function *newFunction);

void calculateNewCallTerminatorWeights(		void calculateNewCallTerminatorWeights(
BasicBlock *CodeReplacer,		BasicBlock *CodeReplacer,
DenseMap<BasicBlock *, BlockFrequency> &ExitWeights,		DenseMap<BasicBlock *, BlockFrequency> &ExitWeights,
BranchProbabilityInfo *BPI);		BranchProbabilityInfo *BPI);

void emitCallAndSwitchStatement(Function *newFunction,		void emitCallAndSwitchStatement(Function *newFunction,
BasicBlock *newHeader,		BasicBlock *newHeader,
ValueSet &inputs,		ValueSet &inputs,
ValueSet &outputs);		ValueSet &outputs);
};		};
}		}

#endif		#endif

lib/Analysis/InlineCost.cpp

Show First 20 Lines • Show All 834 Lines • ▼ Show 20 Lines	void CallAnalyzer::updateThreshold(CallSite CS, Function &Callee) {
SingleBBBonus = Threshold * SingleBBBonusPercent / 100;		SingleBBBonus = Threshold * SingleBBBonusPercent / 100;
VectorBonus = Threshold * VectorBonusPercent / 100;		VectorBonus = Threshold * VectorBonusPercent / 100;

bool OnlyOneCallAndLocalLinkage =		bool OnlyOneCallAndLocalLinkage =
F.hasLocalLinkage() && F.hasOneUse() && &F == CS.getCalledFunction();		F.hasLocalLinkage() && F.hasOneUse() && &F == CS.getCalledFunction();
// If there is only one call of the function, and it has internal linkage,		// If there is only one call of the function, and it has internal linkage,
// the cost of inlining it drops dramatically. It may seem odd to update		// the cost of inlining it drops dramatically. It may seem odd to update
// Cost in updateThreshold, but the bonus depends on the logic in this method.		// Cost in updateThreshold, but the bonus depends on the logic in this method.
if (OnlyOneCallAndLocalLinkage)		// if (OnlyOneCallAndLocalLinkage)
Cost -= LastCallToStaticBonus;		// Cost -= LastCallToStaticBonus;
		grosser: Hi Florian, are you interested in upstreaming this hack as well? AFAIU we should subtract this LastCallBonus only after shouldBeDeferred has been called as otherwise shouldBeDeferred might not have any effect?
		fhahn: Sure, I'll look into it
}		}

bool CallAnalyzer::visitCmpInst(CmpInst &I) {		bool CallAnalyzer::visitCmpInst(CmpInst &I) {
Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);		Value *LHS = I.getOperand(0), *RHS = I.getOperand(1);
// First try to handle simplified comparisons.		// First try to handle simplified comparisons.
if (simplifyInstruction(I, [&](SmallVectorImpl<Constant *> &COps) {		if (simplifyInstruction(I, [&](SmallVectorImpl<Constant *> &COps) {
return ConstantExpr::getCompare(I.getPredicate(), COps[0], COps[1]);		return ConstantExpr::getCompare(I.getPredicate(), COps[0], COps[1]);
}))		}))

lib/Transforms/IPO/PartialInlining.cpp

Show First 20 Lines • Show All 103 Lines • ▼ Show 20 Lines	struct PartialInlinerImpl {
bool run(Module &M);		bool run(Module &M);
Function *unswitchFunction(Function *F);		Function *unswitchFunction(Function *F);

// This class speculatively clones the the function to be partial inlined.		// This class speculatively clones the the function to be partial inlined.
// At the end of partial inlining, the remaining callsites to the cloned		// At the end of partial inlining, the remaining callsites to the cloned
// function that are not partially inlined will be fixed up to reference		// function that are not partially inlined will be fixed up to reference
// the original function, and the cloned function will be erased.		// the original function, and the cloned function will be erased.
struct FunctionCloner {		struct FunctionCloner {
FunctionCloner(Function *F, FunctionOutliningInfo *OI);		FunctionCloner(Function *F, FunctionOutliningInfo *OI,
		CallInst *CallInst = nullptr);
~FunctionCloner();		~FunctionCloner();

// Prepare for function outlining: making sure there is only		// Prepare for function outlining: making sure there is only
// one incoming edge from the extracted/outlined region to		// one incoming edge from the extracted/outlined region to
// the return block.		// the return block.
void NormalizeReturnBlock();		void NormalizeReturnBlock();

// Do function outlining:		// Do function outlining:
▲ Show 20 Lines • Show All 512 Lines • ▼ Show 20 Lines	for (User *User : Users) {
if (Count)		if (Count)
CallSiteToProfCountMap[User] = *Count;		CallSiteToProfCountMap[User] = *Count;
else		else
CallSiteToProfCountMap[User] = 0;		CallSiteToProfCountMap[User] = 0;
}		}
}		}

PartialInlinerImpl::FunctionCloner::FunctionCloner(Function *F,		PartialInlinerImpl::FunctionCloner::FunctionCloner(Function *F,
FunctionOutliningInfo *OI)		FunctionOutliningInfo *OI,
		CallInst *VarargCaller)
: OrigFunc(F) {		: OrigFunc(F) {
ClonedOI = llvm::make_unique<FunctionOutliningInfo>();		ClonedOI = llvm::make_unique<FunctionOutliningInfo>();

// Clone the function, so that we can hack away on it.		// Clone the function, so that we can hack away on it.
ValueToValueMapTy VMap;		ValueToValueMapTy VMap;

		if (VarargCaller) {
		llvm::ClonedCodeInfo CCI;
		std::vector<Type*> VarargTypes;
		int ArgumentsFunc = VarargCaller->getFunctionType()->getNumParams();
		int ArgumentsCaller = VarargCaller->getNumArgOperands();
		for (int i = ArgumentsFunc; i < ArgumentsCaller; i++)
		VarargTypes.push_back(VarargCaller->getArgOperand(i)->getType());
		ClonedFunc = CloneFunction(F, VMap, &CCI, &VarargTypes);
		} else {
ClonedFunc = CloneFunction(F, VMap);		ClonedFunc = CloneFunction(F, VMap);
		}

ClonedOI->ReturnBlock = cast<BasicBlock>(VMap[OI->ReturnBlock]);		ClonedOI->ReturnBlock = cast<BasicBlock>(VMap[OI->ReturnBlock]);
ClonedOI->NonReturnBlock = cast<BasicBlock>(VMap[OI->NonReturnBlock]);		ClonedOI->NonReturnBlock = cast<BasicBlock>(VMap[OI->NonReturnBlock]);
for (BasicBlock *BB : OI->Entries) {		for (BasicBlock *BB : OI->Entries) {
ClonedOI->Entries.push_back(cast<BasicBlock>(VMap[BB]));		ClonedOI->Entries.push_back(cast<BasicBlock>(VMap[BB]));
}		}
for (BasicBlock *E : OI->ReturnBlockPreds) {		for (BasicBlock *E : OI->ReturnBlockPreds) {
BasicBlock *NewE = cast<BasicBlock>(VMap[E]);		BasicBlock *NewE = cast<BasicBlock>(VMap[E]);
ClonedOI->ReturnBlockPreds.push_back(NewE);		ClonedOI->ReturnBlockPreds.push_back(NewE);
}		}
// Go ahead and update all uses to the duplicate, so that we can just		// Go ahead and update all uses to the duplicate, so that we can just
// use the inliner functionality when we're done hacking.		// use the inliner functionality when we're done hacking.
		if (VarargCaller)
		VarargCaller->setCalledFunction(ClonedFunc);
		else
F->replaceAllUsesWith(ClonedFunc);		F->replaceAllUsesWith(ClonedFunc);
}		}

void PartialInlinerImpl::FunctionCloner::NormalizeReturnBlock() {		void PartialInlinerImpl::FunctionCloner::NormalizeReturnBlock() {

auto getFirstPHI = [](BasicBlock *BB) {		auto getFirstPHI = [](BasicBlock *BB) {
BasicBlock::iterator I = BB->begin();		BasicBlock::iterator I = BB->begin();
PHINode *FirstPhi = nullptr;		PHINode *FirstPhi = nullptr;
while (I != BB->end()) {		while (I != BB->end()) {
▲ Show 20 Lines • Show All 98 Lines • ▼ Show 20 Lines	Function *PartialInlinerImpl::FunctionCloner::doFunctionOutlining() {
// Manually calculate a BlockFrequencyInfo and BranchProbabilityInfo.		// Manually calculate a BlockFrequencyInfo and BranchProbabilityInfo.
LoopInfo LI(DT);		LoopInfo LI(DT);
BranchProbabilityInfo BPI(*ClonedFunc, LI);		BranchProbabilityInfo BPI(*ClonedFunc, LI);
ClonedFuncBFI.reset(new BlockFrequencyInfo(*ClonedFunc, BPI, LI));		ClonedFuncBFI.reset(new BlockFrequencyInfo(*ClonedFunc, BPI, LI));

// Extract the body of the if.		// Extract the body of the if.
OutlinedFunc = CodeExtractor(ToExtract, &DT, /*AggregateArgs*/ false,		OutlinedFunc = CodeExtractor(ToExtract, &DT, /*AggregateArgs*/ false,
ClonedFuncBFI.get(), &BPI)		ClonedFuncBFI.get(), &BPI)
.extractCodeRegion();		.extractCodeRegion(OrigFunc->isVarArg());

if (OutlinedFunc) {		if (OutlinedFunc) {
OutliningCallBB = PartialInlinerImpl::getOneCallSiteTo(OutlinedFunc)		OutliningCallBB = PartialInlinerImpl::getOneCallSiteTo(OutlinedFunc)
.getInstruction()		.getInstruction()
->getParent();		->getParent();
assert(OutliningCallBB->getParent() == ClonedFunc);		assert(OutliningCallBB->getParent() == ClonedFunc);
}		}

return OutlinedFunc;		return OutlinedFunc;
}		}

PartialInlinerImpl::FunctionCloner::~FunctionCloner() {		PartialInlinerImpl::FunctionCloner::~FunctionCloner() {
// Ditch the duplicate, since we're done with it, and rewrite all remaining		// Ditch the duplicate, since we're done with it, and rewrite all remaining
// users (function pointers, etc.) back to the original function.		// users (function pointers, etc.) back to the original function.
ClonedFunc->replaceAllUsesWith(OrigFunc);		std::vector<CallInst*> Calls;
		for (User *U : ClonedFunc->users()) {
		CallInst *Call = dyn_cast<CallInst>(U);
		if (!Call)
		continue;
		Calls.push_back(Call);
		}
		for (CallInst *Call : Calls)
		Call->setCalledFunction(OrigFunc);
ClonedFunc->eraseFromParent();		ClonedFunc->eraseFromParent();
if (!IsFunctionInlined) {		if (!IsFunctionInlined) {
// Remove the function that is speculatively created if there is no		// Remove the function that is speculatively created if there is no
// reference.		// reference.
if (OutlinedFunc)		if (OutlinedFunc)
OutlinedFunc->eraseFromParent();		OutlinedFunc->eraseFromParent();
}		}
}		}
Show All 11 Lines	if (F->hasFnAttribute(Attribute::NoInline))
return nullptr;		return nullptr;

if (PSI->isFunctionEntryCold(F))		if (PSI->isFunctionEntryCold(F))
return nullptr;		return nullptr;

if (F->user_begin() == F->user_end())		if (F->user_begin() == F->user_end())
return nullptr;		return nullptr;

		if (F->isVarArg()) {
		std::vector<User *> Users(F->user_begin(), F->user_end());
		std::vector<Function *> OutlinedFunctions;

		for (llvm::User *User : Users) {
		CallInst *Caller = dyn_cast<CallInst>(User);

		if (!Caller)
		continue;

		std::unique_ptr<FunctionOutliningInfo> OI = computeOutliningInfo(F);

		if (!OI)
		return nullptr;

		FunctionCloner Cloner(F, OI.get(), Caller);
		Cloner.NormalizeReturnBlock();
		Function *OutlinedFunction = Cloner.doFunctionOutlining();

		if (OutlinedFunction)
		OutlinedFunctions.push_back(OutlinedFunction);

		bool AnyInline = tryPartialInline(Cloner);
		}

		if (OutlinedFunctions.size() > 0) {
		Function *CanonicalFunction = OutlinedFunctions.back();
		OutlinedFunctions.pop_back();

		for (Function *DuplicateFunction : OutlinedFunctions) {
		DuplicateFunction->replaceAllUsesWith(CanonicalFunction);
		DuplicateFunction->eraseFromParent();
		}
		return CanonicalFunction;
		}
		return nullptr;
		}

std::unique_ptr<FunctionOutliningInfo> OI = computeOutliningInfo(F);		std::unique_ptr<FunctionOutliningInfo> OI = computeOutliningInfo(F);

if (!OI)		if (!OI)
return nullptr;		return nullptr;

FunctionCloner Cloner(F, OI.get());		FunctionCloner Cloner(F, OI.get());
Cloner.NormalizeReturnBlock();		Cloner.NormalizeReturnBlock();
Function *OutlinedFunction = Cloner.doFunctionOutlining();		Function *OutlinedFunction = Cloner.doFunctionOutlining();
Show All 32 Lines	ORE.emit(OptimizationRemarkAnalysis(DEBUG_TYPE, "OutlineRegionTooSmall",
<< ore::NV("Function", Cloner.OrigFunc)		<< ore::NV("Function", Cloner.OrigFunc)
<< " not partially inlined into callers (Original Size = "		<< " not partially inlined into callers (Original Size = "
<< ore::NV("OutlinedRegionOriginalSize", Cloner.OutlinedRegionCost)		<< ore::NV("OutlinedRegionOriginalSize", Cloner.OutlinedRegionCost)
<< ", Size of call sequence to outlined function = "		<< ", Size of call sequence to outlined function = "
<< ore::NV("NewSize", SizeCost) << ")");		<< ore::NV("NewSize", SizeCost) << ")");
return false;		return false;
}		}

assert(Cloner.OrigFunc->user_begin() == Cloner.OrigFunc->user_end() &&		//assert(Cloner.OrigFunc->user_begin() == Cloner.OrigFunc->user_end() &&
"F's users should all be replaced!");		// "F's users should all be replaced!");

std::vector<User *> Users(Cloner.ClonedFunc->user_begin(),		std::vector<User *> Users(Cloner.ClonedFunc->user_begin(),
Cloner.ClonedFunc->user_end());		Cloner.ClonedFunc->user_end());

DenseMap<User *, uint64_t> CallSiteToProfCountMap;		DenseMap<User *, uint64_t> CallSiteToProfCountMap;
if (Cloner.OrigFunc->getEntryCount())		if (Cloner.OrigFunc->getEntryCount())
computeCallsiteToProfCountMap(Cloner.ClonedFunc, CallSiteToProfCountMap);		computeCallsiteToProfCountMap(Cloner.ClonedFunc, CallSiteToProfCountMap);


lib/Transforms/IPO/PassManagerBuilder.cpp

#include "llvm/Transforms/Instrumentation.h"		#include "llvm/Transforms/Instrumentation.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/GVN.h"		#include "llvm/Transforms/Scalar/GVN.h"
#include "llvm/Transforms/Scalar/SimpleLoopUnswitch.h"		#include "llvm/Transforms/Scalar/SimpleLoopUnswitch.h"
#include "llvm/Transforms/Vectorize.h"		#include "llvm/Transforms/Vectorize.h"

using namespace llvm;		using namespace llvm;

		// Enable partial inliner here. It is needed for libquantum and it is not clear
		// to me how to pass -mllvm flags to the gold plugin.
static cl::opt<bool>		static cl::opt<bool>
RunPartialInlining("enable-partial-inlining", cl::init(false), cl::Hidden,		RunPartialInlining("enable-partial-inlining", cl::init(true), cl::Hidden,
cl::ZeroOrMore, cl::desc("Run Partial inlinining pass"));		cl::ZeroOrMore, cl::desc("Run Partial inlinining pass"));

static cl::opt<bool>		static cl::opt<bool>
RunLoopVectorization("vectorize-loops", cl::Hidden,		RunLoopVectorization("vectorize-loops", cl::Hidden,
cl::desc("Run the Loop vectorization passes"));		cl::desc("Run the Loop vectorization passes"));

static cl::opt<bool>		static cl::opt<bool>
RunSLPVectorization("vectorize-slp", cl::Hidden,		RunSLPVectorization("vectorize-slp", cl::Hidden,
▲ Show 20 Lines • Show All 508 Lines • ▼ Show 20 Lines	void PassManagerBuilder::populateModulePassManager(
// Thus both Float2Int and LoopRotate have to preserve AliasAnalysis for		// Thus both Float2Int and LoopRotate have to preserve AliasAnalysis for
// this to work. Fortunately, it is trivial to preserve AliasAnalysis		// this to work. Fortunately, it is trivial to preserve AliasAnalysis
// (doing nothing preserves it as it is required to be conservatively		// (doing nothing preserves it as it is required to be conservatively
// correct in the face of IR changes).		// correct in the face of IR changes).
MPM.add(createGlobalsAAWrapperPass());		MPM.add(createGlobalsAAWrapperPass());

MPM.add(createFloat2IntPass());		MPM.add(createFloat2IntPass());

addExtensionsToPM(EP_VectorizerStart, MPM);		// Do not run Polly before LTO. The per-TU optimizations before LTO should
		// be canonicalizations, not specializations. Polly clearly specializes too
		// much to be run early in LTO mode.
		// addExtensionsToPM(EP_VectorizerStart, MPM);

// Re-rotate loops in all our loop nests. These may have fallout out of		// Re-rotate loops in all our loop nests. These may have fallout out of
// rotated form due to GVN or other transformations, and the vectorizer relies		// rotated form due to GVN or other transformations, and the vectorizer relies
// on the rotated form. Disable header duplication at -Oz.		// on the rotated form. Disable header duplication at -Oz.
MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));		MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));

// Distribute loops to allow partial vectorization. I.e. isolate dependences		// Distribute loops to allow partial vectorization. I.e. isolate dependences
// into separate loop that would otherwise inhibit vectorization. This is		// into separate loop that would otherwise inhibit vectorization. This is
▲ Show 20 Lines • Show All 158 Lines • ▼ Show 20 Lines	void PassManagerBuilder::addLTOOptimizationPasses(legacy::PassManagerBase &PM) {

// Inline small functions		// Inline small functions
bool RunInliner = Inliner;		bool RunInliner = Inliner;
if (RunInliner) {		if (RunInliner) {
PM.add(Inliner);		PM.add(Inliner);
Inliner = nullptr;		Inliner = nullptr;
}		}

		// Add partial inliner to LTO mode. Some partial inlining opportunities
		// in libquantum (quantum_objcode_put and quantum_decohere) only arise in the
		// context of LTO.
		PM.add(createBarrierNoopPass());
		if (RunPartialInlining)
		PM.add(createPartialInliningPass());
		PM.add(createBarrierNoopPass());

PM.add(createPruneEHPass()); // Remove dead EH info.		PM.add(createPruneEHPass()); // Remove dead EH info.

// Optimize globals again if we ran the inliner.		// Optimize globals again if we ran the inliner.
if (RunInliner)		if (RunInliner)
PM.add(createGlobalOptimizerPass());		PM.add(createGlobalOptimizerPass());
PM.add(createGlobalDCEPass()); // Remove dead functions.		PM.add(createGlobalDCEPass()); // Remove dead functions.

// If we didn't decide to inline a function, check to see if we can		// If we didn't decide to inline a function, check to see if we can
Show All 24 Lines	void PassManagerBuilder::addLTOOptimizationPasses(legacy::PassManagerBase &PM) {
// More loops are countable; try to optimize them.		// More loops are countable; try to optimize them.
PM.add(createIndVarSimplifyPass());		PM.add(createIndVarSimplifyPass());
PM.add(createLoopDeletionPass());		PM.add(createLoopDeletionPass());
if (EnableLoopInterchange)		if (EnableLoopInterchange)
PM.add(createLoopInterchangePass());		PM.add(createLoopInterchangePass());

if (!DisableUnrollLoops)		if (!DisableUnrollLoops)
PM.add(createSimpleLoopUnrollPass(OptLevel)); // Unroll small loops		PM.add(createSimpleLoopUnrollPass(OptLevel)); // Unroll small loops

		// Add Polly as LTO optimization.
		addExtensionsToPM(EP_VectorizerStart, PM);

PM.add(createLoopVectorizePass(true, LoopVectorize));		PM.add(createLoopVectorizePass(true, LoopVectorize));
// The vectorizer may have significantly shortened a loop body; unroll again.		// The vectorizer may have significantly shortened a loop body; unroll again.
if (!DisableUnrollLoops)		if (!DisableUnrollLoops)
PM.add(createLoopUnrollPass(OptLevel));		PM.add(createLoopUnrollPass(OptLevel));

// Now that we've optimized loops (in particular loop induction variables),		// Now that we've optimized loops (in particular loop induction variables),
// we may have exposed more scalar opportunities. Run parts of the scalar		// we may have exposed more scalar opportunities. Run parts of the scalar
// optimizer again at this point.		// optimizer again at this point.

lib/Transforms/Scalar/LoopUnrollPass.cpp

Show First 20 Lines • Show All 154 Lines • ▼ Show 20 Lines	static TargetTransformInfo::UnrollingPreferences gatherUnrollingPreferences(
UP.Count = 0;		UP.Count = 0;
UP.PeelCount = 0;		UP.PeelCount = 0;
UP.DefaultUnrollRuntimeCount = 8;		UP.DefaultUnrollRuntimeCount = 8;
UP.MaxCount = UINT_MAX;		UP.MaxCount = UINT_MAX;
UP.FullUnrollMaxCount = UINT_MAX;		UP.FullUnrollMaxCount = UINT_MAX;
UP.BEInsns = 2;		UP.BEInsns = 2;
UP.Partial = false;		UP.Partial = false;
UP.Runtime = false;		UP.Runtime = false;
UP.AllowRemainder = true;		// Do not expand loops before the LTO phase. When running LTO we want the
		// per-TU passes to canonicalize, but not yet specialize for a given
		// piece of hardware, as such specialization will hinder proper inlining
		// and also creates a loop structure that is harder to analyze for Polly.
		UP.AllowRemainder = false;
UP.UnrollRemainder = false;		UP.UnrollRemainder = false;
UP.AllowExpensiveTripCount = false;		UP.AllowExpensiveTripCount = false;
UP.Force = false;		UP.Force = false;
UP.UpperBound = false;		UP.UpperBound = false;
UP.AllowPeeling = true;		UP.AllowPeeling = true;

// Override with any target specific settings		// Override with any target specific settings
TTI.getUnrollingPreferences(L, SE, UP);		TTI.getUnrollingPreferences(L, SE, UP);

lib/Transforms/Utils/CloneFunction.cpp

	/// Return a copy of the specified function and add it to that function's			/// Return a copy of the specified function and add it to that function's
	/// module. Also, any references specified in the VMap are changed to refer to			/// module. Also, any references specified in the VMap are changed to refer to
	/// their mapped value instead of the original one. If any of the arguments to			/// their mapped value instead of the original one. If any of the arguments to
	/// the function are in the VMap, the arguments are deleted from the resultant			/// the function are in the VMap, the arguments are deleted from the resultant
	/// function. The VMap is updated to include mappings from all of the			/// function. The VMap is updated to include mappings from all of the
	/// instructions and basicblocks in the function from their old to new values.			/// instructions and basicblocks in the function from their old to new values.
	///			///
				/// In case VarargTypes is set, the function is not cloned as vararg function,
				/// but instead a fixed set of arguments replaces the vararg part of the
				/// function declaration.
				///
	Function *llvm::CloneFunction(Function *F, ValueToValueMapTy &VMap,			Function *llvm::CloneFunction(Function *F, ValueToValueMapTy &VMap,
	ClonedCodeInfo *CodeInfo) {			ClonedCodeInfo *CodeInfo,
				std::vector<Type*> *VarargTypes) {
	std::vector<Type*> ArgTypes;			std::vector<Type*> ArgTypes;

	// The user might be deleting arguments to the function by specifying them in			// The user might be deleting arguments to the function by specifying them in
	// the VMap. If so, we need to not add the arguments to the arg ty vector			// the VMap. If so, we need to not add the arguments to the arg ty vector
	//			//
	for (const Argument &I : F->args())			for (const Argument &I : F->args())
	if (VMap.count(&I) == 0) // Haven't mapped the argument to anything yet?			if (VMap.count(&I) == 0) // Haven't mapped the argument to anything yet?
	ArgTypes.push_back(I.getType());			ArgTypes.push_back(I.getType());

				if (VarargTypes)
				for (Type *T : *VarargTypes)
				ArgTypes.push_back(T);

	// Create a new function type...			// Create a new function type...
				bool Vararg = F->getFunctionType()->isVarArg() && !VarargTypes;
	FunctionType *FTy = FunctionType::get(F->getFunctionType()->getReturnType(),			FunctionType *FTy = FunctionType::get(F->getFunctionType()->getReturnType(),
	ArgTypes, F->getFunctionType()->isVarArg());			ArgTypes, Vararg);

	// Create the new function...			// Create the new function...
	Function *NewF =			Function *NewF =
	Function::Create(FTy, F->getLinkage(), F->getName(), F->getParent());			Function::Create(FTy, F->getLinkage(), F->getName(), F->getParent());

	// Loop over the arguments, copying the names of the mapped arguments over...			// Loop over the arguments, copying the names of the mapped arguments over...
	Function::arg_iterator DestI = NewF->arg_begin();			Function::arg_iterator DestI = NewF->arg_begin();
	for (const Argument & I : F->args())			for (const Argument & I : F->args())
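The effect of the new VarargTypes parameter on the cloned signature can be modeled without LLVM. The following is a minimal sketch, not the patch's code: `Signature` and `buildClonedSignature` are hypothetical names, type names are plain strings, and `Mapped[i]` stands in for "argument i already has an entry in the VMap".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::FunctionType: parameter type names plus
// a vararg flag.
struct Signature {
  std::vector<std::string> Params;
  bool IsVarArg;
};

// Models the argument-type logic of the patched CloneFunction.
// VarargTypes == nullptr means "keep the clone variadic"; otherwise the
// listed types replace the "..." part of the declaration.
Signature buildClonedSignature(const std::vector<std::string> &Params,
                               const std::vector<bool> &Mapped,
                               bool OrigIsVarArg,
                               const std::vector<std::string> *VarargTypes) {
  Signature Result{{}, false};
  // Arguments already mapped in the VMap are dropped from the clone.
  for (std::size_t I = 0; I < Params.size(); ++I)
    if (!Mapped[I])
      Result.Params.push_back(Params[I]);
  // A fixed set of types is appended in place of the variadic tail.
  if (VarargTypes)
    for (const std::string &T : *VarargTypes)
      Result.Params.push_back(T);
  // The clone stays variadic only if no replacement types were given.
  Result.IsVarArg = OrigIsVarArg && !VarargTypes;
  return Result;
}
```

With a null VarargTypes the clone of `(i32, ...)` keeps its variadic signature; with `{i32, double}` it becomes the fixed signature `(i32, i32, double)`.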

lib/Transforms/Utils/CodeExtractor.cpp

for (auto const &U : Curr->operands()) {
ToVisit.push_back(UU);		ToVisit.push_back(UU);
}		}
}		}

// Don't hoist code containing allocas, invokes, or vastarts.		// Don't hoist code containing allocas, invokes, or vastarts.
for (BasicBlock::const_iterator I = BB.begin(), E = BB.end(); I != E; ++I) {		for (BasicBlock::const_iterator I = BB.begin(), E = BB.end(); I != E; ++I) {
if (isa<AllocaInst>(I) || isa<InvokeInst>(I))		if (isa<AllocaInst>(I) || isa<InvokeInst>(I))
return false;		return false;
		// Allow the extraction of vastart. This is needed to partially inline
		// vararg calls. (It is also safe to extract vastart calls, as long
		// as we can ensure that the final signature after partial inlining is
		// again a vararg call called with the same number of arguments.)
		continue;
if (const CallInst *CI = dyn_cast<CallInst>(I))		if (const CallInst *CI = dyn_cast<CallInst>(I))
if (const Function *F = CI->getCalledFunction())		if (const Function *F = CI->getCalledFunction())
if (F->getIntrinsicID() == Intrinsic::vastart)		if (F->getIntrinsicID() == Intrinsic::vastart)
return false;		return false;
}		}

return true;		return true;
}		}
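For context, the kind of source function this change targets looks roughly like the following. This is an illustrative sketch only; `sumIfEnabled` is a made-up function, not taken from libquantum. The early return is cheap enough to inline into callers, while the remaining body contains the va_start call that previously made the block ineligible for extraction.

```cpp
#include <cstdarg>

// Hypothetical variadic function with a cheap early exit followed by a
// larger body that reads the variadic arguments.
int sumIfEnabled(bool Enabled, int Count, ...) {
  // Early-exit path: a natural candidate for inlining into the caller.
  if (!Enabled)
    return 0;

  // Remaining path: uses va_start, so outlining it requires the extracted
  // function to be variadic as well.
  va_list Args;
  va_start(Args, Count);
  int Sum = 0;
  for (int I = 0; I < Count; ++I)
    Sum += va_arg(Args, int);
  va_end(Args);
  return Sum;
}
```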
/// f(in0, ..., inN, out0, ..., outN)		/// f(in0, ..., inN, out0, ..., outN)
///		///
Function *CodeExtractor::constructFunction(const ValueSet &inputs,		Function *CodeExtractor::constructFunction(const ValueSet &inputs,
const ValueSet &outputs,		const ValueSet &outputs,
BasicBlock *header,		BasicBlock *header,
BasicBlock *newRootNode,		BasicBlock *newRootNode,
BasicBlock *newHeader,		BasicBlock *newHeader,
Function *oldFunction,		Function *oldFunction,
Module *M) {		Module *M, bool VarArg) {
DEBUG(dbgs() << "inputs: " << inputs.size() << "\n");		DEBUG(dbgs() << "inputs: " << inputs.size() << "\n");
DEBUG(dbgs() << "outputs: " << outputs.size() << "\n");		DEBUG(dbgs() << "outputs: " << outputs.size() << "\n");

// This function returns unsigned, outputs will go back by reference.		// This function returns unsigned, outputs will go back by reference.
switch (NumExitBlocks) {		switch (NumExitBlocks) {
case 0:		case 0:
case 1: RetTy = Type::getVoidTy(header->getContext()); break;		case 1: RetTy = Type::getVoidTy(header->getContext()); break;
case 2: RetTy = Type::getInt1Ty(header->getContext()); break;		case 2: RetTy = Type::getInt1Ty(header->getContext()); break;
});		});

StructType *StructTy;		StructType *StructTy;
if (AggregateArgs && (inputs.size() + outputs.size() > 0)) {		if (AggregateArgs && (inputs.size() + outputs.size() > 0)) {
StructTy = StructType::get(M->getContext(), paramTy);		StructTy = StructType::get(M->getContext(), paramTy);
paramTy.clear();		paramTy.clear();
paramTy.push_back(PointerType::getUnqual(StructTy));		paramTy.push_back(PointerType::getUnqual(StructTy));
}		}
FunctionType *funcType =		FunctionType *funcType = FunctionType::get(RetTy, paramTy, VarArg);
FunctionType::get(RetTy, paramTy, false);

// Create the new function		// Create the new function
Function *newFunction = Function::Create(funcType,		Function *newFunction = Function::Create(funcType,
GlobalValue::InternalLinkage,		GlobalValue::InternalLinkage,
oldFunction->getName() + "_" +		oldFunction->getName() + "_" +
header->getName(), M);		header->getName(), M);
// If the old function is no-throw, so is the new one.		// If the old function is no-throw, so is the new one.
if (oldFunction->doesNotThrow())		if (oldFunction->doesNotThrow())
for (unsigned I = 0, E = BranchDist.Weights.size(); I < E; ++I) {
BranchProbability BP(Weight.Amount, BranchDist.Total);		BranchProbability BP(Weight.Amount, BranchDist.Total);
BPI->setEdgeProbability(CodeReplacer, Weight.TargetNode.Index, BP);		BPI->setEdgeProbability(CodeReplacer, Weight.TargetNode.Index, BP);
}		}
TI->setMetadata(		TI->setMetadata(
LLVMContext::MD_prof,		LLVMContext::MD_prof,
MDBuilder(TI->getContext()).createBranchWeights(BranchWeights));		MDBuilder(TI->getContext()).createBranchWeights(BranchWeights));
}		}

Function *CodeExtractor::extractCodeRegion() {		Function *CodeExtractor::extractCodeRegion(bool IsVarArg) {
if (!isEligible())		if (!isEligible())
return nullptr;		return nullptr;

ValueSet inputs, outputs, SinkingCands, HoistingCands;		ValueSet inputs, outputs, SinkingCands, HoistingCands;
BasicBlock *CommonExit = nullptr;		BasicBlock *CommonExit = nullptr;

// Assumption: this is a single-entry code region, and the header is the first		// Assumption: this is a single-entry code region, and the header is the first
// block in the region.		// block in the region.
for (BasicBlock *Block : Blocks) {
}		}
}		}
NumExitBlocks = ExitBlocks.size();		NumExitBlocks = ExitBlocks.size();

// Construct new function based on inputs/outputs & add allocas for all defs.		// Construct new function based on inputs/outputs & add allocas for all defs.
Function *newFunction = constructFunction(inputs, outputs, header,		Function *newFunction = constructFunction(inputs, outputs, header,
newFuncRoot,		newFuncRoot,
codeReplacer, oldFunction,		codeReplacer, oldFunction,
oldFunction->getParent());		oldFunction->getParent(), IsVarArg);

// Update the entry count of the function.		// Update the entry count of the function.
if (BFI) {		if (BFI) {
Optional<uint64_t> EntryCount =		Optional<uint64_t> EntryCount =
BFI->getProfileCountFromFreq(EntryFreq.getFrequency());		BFI->getProfileCountFromFreq(EntryFreq.getFrequency());
if (EntryCount.hasValue())		if (EntryCount.hasValue())
newFunction->setEntryCount(EntryCount.getValue());		newFunction->setEntryCount(EntryCount.getValue());
BFI->setBlockFreq(codeReplacer, EntryFreq.getFrequency());		BFI->setBlockFreq(codeReplacer, EntryFreq.getFrequency());

test/Transforms/CodeExtractor/vararg-multi-reference.ll

This file was added.

				; RUN: opt < %s -partial-inliner -S | FileCheck %s
				; RUN: opt < %s -passes=partial-inliner -S | FileCheck %s

				%"class.base" = type { %"struct.base"* }
				%"struct.base" = type opaque

				@g = external local_unnamed_addr global i32, align 4
				@status = external local_unnamed_addr global i32, align 4

				define i32 @vararg(i32 %count, ...) {
				bb:
				%tmp = alloca %"class.base", align 4
				%status_loaded = load i32, i32* @status, align 4
				%tmp4 = icmp slt i32 %status_loaded, 0
				br i1 %tmp4, label %bb6, label %bb5

				bb5: ; preds = %bb
				%tmp11 = bitcast %"class.base"* %tmp to i32*
				%tmp2 = load i32, i32* @g, align 4
				%tmp3 = add nsw i32 %tmp2, 1
				store i32 %tmp3, i32* %tmp11, align 4
				store i32 %tmp3, i32* @g, align 4
				call void @bar(i32* nonnull %tmp11) #2
				br label %bb6

				bb6: ; preds = %bb5, %bb
				%tmp7 = phi i32 [ 1, %bb5 ], [ 0, %bb ]
				ret i32 %tmp7
				}

				declare void @bar(i32*)

				define i32 @caller(i32 %arg) local_unnamed_addr {
				bb:
				%tmp = tail call i32 (i32, ...) @vararg(i32 %arg)
				ret i32 %tmp
				}

				; CHECK-LABEL: @caller
				; CHECK: codeRepl.i:
				; CHECK-NEXT: call void (%class.base*, ...) @vararg.2_bb5(%class.base* %tmp.i)

				define i32 @caller2(i32 %arg) local_unnamed_addr {
				bb:
				%tmp = tail call i32 (i32, ...) @vararg(i32 %arg)
				ret i32 %tmp
				}

				; CHECK-LABEL: @caller2
				; CHECK: codeRepl.i:
				; CHECK-NEXT: call void (%class.base*, ...) @vararg.2_bb5(%class.base* %tmp.i)

test/Transforms/CodeExtractor/vararg-outlining-aborted.ll

This file was added.

				; RUN: opt < %s -partial-inliner -S | FileCheck %s
				; RUN: opt < %s -passes=partial-inliner -S | FileCheck %s

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				; For this test case the partial inliner tries to inline the call, but then
				; aborts. This test case verifies that the abort succeeds without crash.

				define void @quantum_bmeasure(i32 %pos) {
				entry:
				; CHECK: tail call void (i8, ...) @quantum_objcode_put(i8 zeroext undef, i32 %pos)
				tail call void (i8, ...) @quantum_objcode_put(i8 zeroext undef, i32 %pos)
				unreachable
				}

				define void @quantum_objcode_put(i8 zeroext %operation, ...) local_unnamed_addr {
				entry:
				br i1 undef, label %cleanup, label %if.end

				if.end: ; preds = %entry
				unreachable

				cleanup: ; preds = %entry
				ret void
				}

				!llvm.module.flags = !{!0}

				!0 = !{i32 1, !"wchar_size", i32 4}

test/Transforms/CodeExtractor/vararg.ll

This file was added.

				; RUN: opt < %s -partial-inliner -S | FileCheck %s
				; RUN: opt < %s -passes=partial-inliner -S | FileCheck %s

				%"class.base" = type { %"struct.base"* }
				%"struct.base" = type opaque

				@g = external local_unnamed_addr global i32, align 4
				@status = external local_unnamed_addr global i32, align 4

				define i32 @vararg(i32 %count, ...) {
				bb:
				%tmp = alloca %"class.base", align 4
				%status_loaded = load i32, i32* @status, align 4
				%tmp4 = icmp slt i32 %status_loaded, 0
				br i1 %tmp4, label %bb6, label %bb5

				bb5: ; preds = %bb
				%tmp11 = bitcast %"class.base"* %tmp to i32*
				%tmp2 = load i32, i32* @g, align 4
				%tmp3 = add nsw i32 %tmp2, 1
				store i32 %tmp3, i32* %tmp11, align 4
				store i32 %tmp3, i32* @g, align 4
				call void @bar(i32 %count, i32* nonnull %tmp11) #2
				br label %bb6

				bb6: ; preds = %bb5, %bb
				%tmp7 = phi i32 [ 1, %bb5 ], [ 0, %bb ]
				ret i32 %tmp7
				}

				declare void @bar(i32, i32*)

				define i32 @caller(i32 %arg) local_unnamed_addr {
				bb:
				%tmp = tail call i32 (i32, ...) @vararg(i32 %arg)
				ret i32 %tmp
				}

				; CHECK-LABEL: @caller
				; CHECK: codeRepl.i:
				; CHECK-NEXT: call void (%class.base*, i32, ...) @vararg.1_bb5(%class.base* %tmp.i, i32 %arg)

tools/gold/CMakeLists.txt

set(LLVM_LINK_COMPONENTS
BitWriter		BitWriter
IPO		IPO
)		)

add_llvm_loadable_module(LLVMgold		add_llvm_loadable_module(LLVMgold
gold-plugin.cpp		gold-plugin.cpp
)		)

		if(WITH_POLLY AND LINK_POLLY_INTO_TOOLS)
		target_link_libraries(LLVMgold PUBLIC Polly)
		endif(WITH_POLLY AND LINK_POLLY_INTO_TOOLS)

endif()		endif()

tools/gold/gold-plugin.cpp

// Precise and Debian Wheezy (binutils 2.23 is required)		// Precise and Debian Wheezy (binutils 2.23 is required)
#define LDPO_PIE 3		#define LDPO_PIE 3

#define LDPT_GET_SYMBOLS_V3 28		#define LDPT_GET_SYMBOLS_V3 28

using namespace llvm;		using namespace llvm;
using namespace lto;		using namespace lto;

		#ifdef LINK_POLLY_INTO_TOOLS
		namespace polly {
		void initializePollyPasses(llvm::PassRegistry &Registry);
		}
		#endif

static ld_plugin_status discard_message(int level, const char *format, ...) {		static ld_plugin_status discard_message(int level, const char *format, ...) {
// Die loudly. Recent versions of Gold pass ld_plugin_message as the first		// Die loudly. Recent versions of Gold pass ld_plugin_message as the first
// callback in the transfer vector. This should never be called.		// callback in the transfer vector. This should never be called.
abort();		abort();
}		}

static ld_plugin_release_input_file release_input_file = nullptr;		static ld_plugin_release_input_file release_input_file = nullptr;
static ld_plugin_get_input_file get_input_file = nullptr;		static ld_plugin_get_input_file get_input_file = nullptr;

extern "C" ld_plugin_status onload(ld_plugin_tv *tv);		extern "C" ld_plugin_status onload(ld_plugin_tv *tv);
ld_plugin_status onload(ld_plugin_tv *tv) {		ld_plugin_status onload(ld_plugin_tv *tv) {
InitializeAllTargetInfos();		InitializeAllTargetInfos();
InitializeAllTargets();		InitializeAllTargets();
InitializeAllTargetMCs();		InitializeAllTargetMCs();
InitializeAllAsmParsers();		InitializeAllAsmParsers();
InitializeAllAsmPrinters();		InitializeAllAsmPrinters();
		PassRegistry &Registry = *PassRegistry::getPassRegistry();
		polly::initializePollyPasses(Registry);

// We're given a pointer to the first transfer vector. We read through them		// We're given a pointer to the first transfer vector. We read through them
// until we find one where tv_tag == LDPT_NULL. The REGISTER_* tagged values		// until we find one where tv_tag == LDPT_NULL. The REGISTER_* tagged values
// contain pointers to functions that we need to call to register our own		// contain pointers to functions that we need to call to register our own
// hooks. The others are addresses of functions we can use to call into gold		// hooks. The others are addresses of functions we can use to call into gold
// for services.		// for services.

bool registeredClaimFile = false;		bool registeredClaimFile = false;
if (OldSuffix.empty() && NewSuffix.empty())
return Path;		return Path;
StringRef NewPath = Path;		StringRef NewPath = Path;
NewPath.consume_back(OldSuffix);		NewPath.consume_back(OldSuffix);
std::string NewNewPath = NewPath;		std::string NewNewPath = NewPath;
NewNewPath += NewSuffix;		NewNewPath += NewSuffix;
return NewNewPath;		return NewNewPath;
}		}

static bool isAlpha(char C) {		// These lines prevent compilation.
return ('a' <= C && C <= 'z') || ('A' <= C && C <= 'Z') || C == '_';		//
}		// TODO: Need to investigate. Maybe a version mismatch in some of my checkouts?
		// Disable this for now.
static bool isAlnum(char C) { return isAlpha(C) || ('0' <= C && C <= '9'); }

// Returns true if S is valid as a C language identifier.		// Returns true if S is valid as a C language identifier.
static bool isValidCIdentifier(StringRef S) {		static bool isValidCIdentifier(StringRef S) {
return !S.empty() && isAlpha(S[0]) &&		return true;
std::all_of(S.begin() + 1, S.end(), isAlnum);
}		}
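For reference, the identifier check that the right-hand side stubs out with `return true` can be reproduced outside of the LLVM tree. The following is a minimal sketch that assumes `std::string_view` in place of `llvm::StringRef`; the helper bodies follow the left-hand side of the diff.

```cpp
#include <algorithm>
#include <string_view>

// Letters and underscore, as in the removed isAlpha helper.
static bool isAlpha(char C) {
  return ('a' <= C && C <= 'z') || ('A' <= C && C <= 'Z') || C == '_';
}

// Letters, underscore, and digits, as in the removed isAlnum helper.
static bool isAlnum(char C) { return isAlpha(C) || ('0' <= C && C <= '9'); }

// Returns true if S is valid as a C language identifier.
// std::string_view replaces llvm::StringRef (an assumption of this sketch).
static bool isValidCIdentifier(std::string_view S) {
  return !S.empty() && isAlpha(S[0]) &&
         std::all_of(S.begin() + 1, S.end(), isAlnum);
}
```

With the stubbed version every string, including the empty string and names such as `1foo`, is accepted, so the workaround is only suitable for a local WIP build.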

static void addModule(LTO &Lto, claimed_file &F, const void *View,		static void addModule(LTO &Lto, claimed_file &F, const void *View,
StringRef Filename) {		StringRef Filename) {
MemoryBufferRef BufferRef(StringRef((const char *)View, F.filesize),		MemoryBufferRef BufferRef(StringRef((const char *)View, F.filesize),
Filename);		Filename);
Expected<std::unique_ptr<InputFile>> ObjOrErr = InputFile::create(BufferRef);		Expected<std::unique_ptr<InputFile>> ObjOrErr = InputFile::create(BufferRef);


Revision Contents

Diff 117819

include/llvm/Transforms/Utils/Cloning.h

include/llvm/Transforms/Utils/CodeExtractor.h

lib/Analysis/InlineCost.cpp

lib/Transforms/IPO/PartialInlining.cpp

lib/Transforms/IPO/PassManagerBuilder.cpp

lib/Transforms/Scalar/LoopUnrollPass.cpp

lib/Transforms/Utils/CloneFunction.cpp

lib/Transforms/Utils/CodeExtractor.cpp

test/Transforms/CodeExtractor/vararg-multi-reference.ll

test/Transforms/CodeExtractor/vararg-outlining-aborted.ll

test/Transforms/CodeExtractor/vararg.ll

tools/gold/CMakeLists.txt

tools/gold/gold-plugin.cpp
