This is an archive of the discontinued LLVM Phabricator instance.

Differential D30738

Don't internalize llvm GV's with InternalizeLinkedSymbols
ClosedPublic

Authored by JDevlieghere on Mar 8 2017, 7:45 AM.

Details

Reviewers

tejohnson
mehdi_amini

Commits

rG5eb9c81d8280: [Linker] Provide callback for internalization
rC297649: [Linker] Provide callback for internalization
rL297649: [Linker] Provide callback for internalization

Summary

Passing llvm::Linker::Flags::InternalizeLinkedSymbols to the IR linker causes the linked symbols to be internalized, including the global variables llvm.global_ctors, llvm.global_dtors, llvm.used and llvm.compiler.used. This results in the module not being valid anymore. In particular it triggers the assertion "invalid linkage for intrinsic global variable" which checks that these GVs either don't have an initializer or have appending linkage.
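To make the failure mode concrete, here is a minimal, self-contained C++ sketch. It does not use the LLVM API: `Linkage`, `ToyGlobal`, `isValidIntrinsicGlobal` and `internalizeAll` are hypothetical stand-ins that only model the rule quoted above (an intrinsic global must either have no initializer or have appending linkage).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class Linkage { External, Internal, Appending };

struct ToyGlobal {
  std::string Name;
  Linkage L;
  bool HasInitializer;
};

// The four intrinsic globals named in the summary.
inline bool isIntrinsicGlobalName(const std::string &Name) {
  return Name == "llvm.global_ctors" || Name == "llvm.global_dtors" ||
         Name == "llvm.used" || Name == "llvm.compiler.used";
}

// Models the checked rule: no initializer, or appending linkage.
inline bool isValidIntrinsicGlobal(const ToyGlobal &G) {
  if (!isIntrinsicGlobalName(G.Name))
    return true; // the rule only constrains intrinsic globals
  return !G.HasInitializer || G.L == Linkage::Appending;
}

// Models the behavior described in the summary: every linked symbol is
// forced to internal linkage, with no exception for intrinsic globals.
inline void internalizeAll(std::vector<ToyGlobal> &Globals) {
  for (ToyGlobal &G : Globals)
    G.L = Linkage::Internal;
}
```

In this model, an `llvm.global_ctors` entry with an initializer passes the check while it has appending linkage and fails it after `internalizeAll` runs, which mirrors the assertion described in the summary.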

Diff Detail

Repository: rL LLVM

Event Timeline

JDevlieghere created this revision. Mar 8 2017, 7:45 AM

JDevlieghere added a subscriber: llvm-commits.

Test case?

I'm not familiar with when the InternalizeLinkedSymbols flag is passed to the module linker. A quick search shows that it is typically passed for GPU code. What is the use case?

Can the users of this instead run llvm::internalizeModule on the linked module? Note that this will invoke InternalizePass::internalizeModule, which in turn already has handling for these special values (see the AlwaysPreserved inserts). This routine also adds things in the llvm.used set to the AlwaysPreserved set, which is the correct thing to do, and which is missing from the code in the ModuleLinker that currently force-internalizes everything.

In D30738#695504, @tejohnson wrote:

Can the users of this instead run llvm::internalizeModule on the linked module? Note that this will invoke InternalizePass::internalizeModule, which in turn already has handling for these special values (see the AlwaysPreserved inserts). This routine also adds things in the llvm.used set to the AlwaysPreserved set, which is the correct thing to do, and which is missing from the code in the ModuleLinker that currently force-internalizes everything.

Won't the pass internalize the whole module rather than only the linked values? Would it make sense to extend the InternalizePass::internalizeModule API to take a list of used global variables and call it from the module linker to take care of the internalization?

I'll add a test if we move forward with the change. My use case is not GPU related, but rather something LTO-like, where we have bitcode modules with clashing function/symbol names but different implementations. The problem is that we don't know which those will be before linking, so we want to internalize them.

In D30738#695541, @JDevlieghere wrote:

In D30738#695504, @tejohnson wrote:

Can the users of this instead run llvm::internalizeModule on the linked module? Note that this will invoke InternalizePass::internalizeModule, which in turn already has handling for these special values (see the AlwaysPreserved inserts). This routine also adds things in the llvm.used set to the AlwaysPreserved set, which is the correct thing to do, and which is missing from the code in the ModuleLinker that currently force-internalizes everything.

Won't the pass internalize the whole module rather than only the linked values? Would it make sense to extend the InternalizePass::internalizeModule API to take a list of used global variables and call it from the module linker to take care of the internalization?

llvm::internalizeModule takes a callback that can be used to indicate that a GV should not be internalized (MustPreserveGV), so presumably you could use that. I think the right way to do this is from the ModuleLinker clients, not from within the ModuleLinker itself (i.e. remove the InternalizeLinkedSymbols flag and related handling). In fact, you would end up with a circular dependence calling this from the ModuleLinker, since internalizeModule is in Transforms/IPO, which has a dependence on Linker.

I'll add a test if we move forward with the change. My use case is not GPU related, but rather something LTO-like, where we have bitcode modules with clashing function/symbol names but different implementations. The problem is that we don't know which those will be before linking, so we want to internalize them.

So once they are internalized, the clashing symbols are renamed automatically as they are linked in, I assume? If they are external to start with, how would the linking work in non-LTO mode?

I can see how this feature can be useful, but as Teresa mentioned it seems easier from a client point of view to:

1) Collect the list of symbols in the current module in a Set
2) Call the IRLinker
3) Invoke internalizeModule using the Set from 1) as "preservedGlobals".
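The three client-side steps above can be sketched as follows. This is a toy model under stated assumptions, not the LLVM API: a module is represented as a map from symbol name to linkage, and `toyLink`, `toyInternalize` and `linkAndInternalize` are hypothetical names standing in for the IRLinker and internalizeModule.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class Linkage { External, Internal };
using ToyModule = std::map<std::string, Linkage>; // symbol name -> linkage

// Step 2 stand-in: copy the symbols of Src into Dest.
inline void toyLink(ToyModule &Dest, const ToyModule &Src) {
  for (const auto &Entry : Src)
    Dest.insert(Entry); // on a name clash the existing Dest entry is kept
}

// Step 3 stand-in: internalize every symbol that is not in Preserved.
inline void toyInternalize(ToyModule &M,
                           const std::set<std::string> &Preserved) {
  for (auto &Entry : M)
    if (!Preserved.count(Entry.first))
      Entry.second = Linkage::Internal;
}

inline void linkAndInternalize(ToyModule &Dest, const ToyModule &Src) {
  // Step 1: remember which symbols were already in the destination.
  std::set<std::string> Preserved;
  for (const auto &Entry : Dest)
    Preserved.insert(Entry.first);
  toyLink(Dest, Src);              // Step 2
  toyInternalize(Dest, Preserved); // Step 3
}
```

The set is collected before linking so that only symbols that arrive from the source module become candidates for internalization.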

That said, the IRLinker could do this itself, so that llvm::Linker::Flags::InternalizeLinkedSymbols would still be exposed, if it weren't for the circular dependency mentioned above. We could solve the dependency "easily" by injection: changing the IRLinker to take a callback that performs the internalization for step 3. That would allow the client to pass in a pointer to internalizeModule that the IRLinker could use during 3) above.
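As a rough illustration of the injection idea, the sketch below separates a "linker layer" that knows which symbols it linked from a "transforms layer" that knows how to internalize. All names (`ToyModule`, `InternalizeFn`, `toyLinkInModule`, `toyInternalizeLinked`) are hypothetical; the real signatures are the ones shown in the diff for this revision.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class Linkage { External, Internal };
using ToyModule = std::map<std::string, Linkage>;
// Callback type: receives the merged module and the names just linked in.
using InternalizeFn =
    std::function<void(ToyModule &, const std::set<std::string> &)>;

// "Linker layer": knows which symbols it linked, not how to internalize.
inline void toyLinkInModule(ToyModule &Dest, const ToyModule &Src,
                            const InternalizeFn &InternalizeCallback = {}) {
  std::set<std::string> Linked;
  for (const auto &Entry : Src)
    if (Dest.insert(Entry).second) // record only symbols actually added
      Linked.insert(Entry.first);
  if (InternalizeCallback) // no callback means no internalization
    InternalizeCallback(Dest, Linked);
}

// "Transforms layer": knows how to internalize; supplied by the client.
inline void toyInternalizeLinked(ToyModule &M,
                                 const std::set<std::string> &Linked) {
  for (const std::string &Name : Linked) {
    auto It = M.find(Name);
    assert(It != M.end() && "linked symbol must exist in the merged module");
    It->second = Linkage::Internal;
  }
}
```

Because the linker layer only sees a `std::function`, it needs no compile-time dependency on whichever component implements the internalization, which is the point of the injection.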

@tejohnson The symbols are local, so it doesn't cause an issue with regular linking. It only happens when we throw all the modules together.

@mehdi_amini: Is this what you had in mind?

I've run the tests and this implementation doesn't break any existing tests regarding linking with the internalize flag.

In D30738#696733, @JDevlieghere wrote:

@tejohnson The symbols are local, so it doesn't cause an issue with regular linking. It only happens when we throw all the modules together.

I'm missing something - if the symbols are already local, why is internalization needed? A test case would help.

But this approach looks much better, thanks!

@mehdi_amini: Is this what you had in mind?

I've run the tests and this implementation doesn't break any existing tests regarding linking with the internalize flag.

There's a user in clang (under OPT_mlink_cuda_bitcode), I would think it would lose the internalization without providing a callback. Maybe there is no test for this case?

lib/Linker/LinkModules.cpp
110	Think the default param should be "{}" for consistency with linkInModules and with check below for a non-null InternalizeCallback.
554	Presumably the flag can go away, it is subsumed by the presence of the callback. There are a couple of uses in clang itself that will need to change to pass in a callback.
tools/llvm-link/llvm-link.cpp
44	Don't clang-format the whole file with your patch. You can commit a separate patch with just the clang-formatting, or just clang-format your changes.

Thanks for the feedback!

In D30738#696758, @tejohnson wrote:

I'm missing something - if the symbols are already local, why is internalization needed? A test case would help.

I have a test case on my work computer, I'll add it tomorrow.

But this approach looks much better, thanks!

@mehdi_amini: Is this what you had in mind?

I've run the tests and this implementation doesn't break any existing tests regarding linking with the internalize flag.

There's a user in clang (under OPT_mlink_cuda_bitcode), I would think it would lose the internalization without providing a callback. Maybe there is no test for this case?

That's my bad, I only ran the LLVM tests. I've removed the flag and updated the code in clang (D30792), which is now also passing all tests. I made sure there was an actual test for this by having the callback always return false, which caused some to fail.

JDevlieghere marked 3 inline comments as done. Mar 9 2017, 2:17 PM

JDevlieghere mentioned this in D30792: Use callback for internalizing linked symbols.

Thanks, a couple more questions.

tools/llvm-link/llvm-link.cpp
323	Why is this needed since linkInModule has a default for the InternalizeCallback parameter?
326	Previously the internalize flag was set before invoking this function from main(). With this being set after the first linkInModule it seems like a behavior change, or am I missing something?

JDevlieghere marked an inline comment as done. Mar 9 2017, 2:33 PM

JDevlieghere added inline comments.

tools/llvm-link/llvm-link.cpp
323	I'm sorry, I don't understand the question. The function is called with only two parameters, so the default argument is used for the callback. No callback means no internalization, so that's what we want in the else branch? Am I overlooking something?
326	Indeed, however on line 275 it is cleared on the first iteration by AND'ing it with the OverrideFromSrc flag.

mehdi_amini added inline comments. Mar 9 2017, 2:45 PM

tools/llvm-link/llvm-link.cpp
99	Please don't mix in these formatting changes. Did you run clang-format on the full file to get these? Usually `git clang-format` will format only the part of the code you change.
319	Usually we avoid using a Pass outside of the PassManager, can you just call `internalizeModule` with your callback? Also any reason you're taking the StringSet by value instead of const ref?

tejohnson added inline comments. Mar 9 2017, 3:02 PM

tools/llvm-link/llvm-link.cpp
323	You are right, never mind this comment! I looked at it too fast.

tejohnson added inline comments. Mar 9 2017, 3:24 PM

tools/llvm-link/llvm-link.cpp
326	Ah, that's what I was missing. Please add a comment about why this isn't being applied to the first iteration.
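The first-iteration behavior discussed in these inline comments can be sketched as a small loop. This is an assumption-laden model rather than the llvm-link source: `LinkStep` and `planLinks` are hypothetical names, and the sketch only records whether internalization would be requested for each input file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record of how each input file would be linked.
struct LinkStep {
  std::string File;
  bool Internalize;
};

// Returns, for each input, whether internalization would be requested.
inline std::vector<LinkStep> planLinks(const std::vector<std::string> &Files,
                                       bool InternalizeOption) {
  std::vector<LinkStep> Plan;
  // Mirrors the behavior discussed above: internalization is not applied
  // on the first iteration of the loop.
  bool InternalizeLinkedSymbols = false;
  for (const std::string &File : Files) {
    Plan.push_back({File, InternalizeLinkedSymbols});
    // The option takes effect for every subsequent file.
    InternalizeLinkedSymbols = InternalizeOption;
  }
  return Plan;
}
```

Setting the local flag at the end of the loop body, rather than before the loop, is what keeps the first input exempt.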

Is the callback really necessary? I don't see anything fundamentally wrong with the first version of the patch.

In D30738#697024, @pcc wrote:

Is the callback really necessary? I don't see anything fundamentally wrong with the first version of the patch.

I don't think we should be duplicating logic that is already available elsewhere.

The fix can be simplified to

if (P.first().startswith("llvm."))
  continue;

That is two lines of code. Even if we simplified the internalize pass in the same way, we should not add this much complexity just to avoid duplicating two lines.

In D30738#697051, @pcc wrote:
The fix can be simplified to
if (P.first().startswith("llvm."))
  continue;

I wouldn't want to see this, this is still duplicating logic. I'd be OK with a helper function `legalToInternalize(GlobalValue &)` that would factor out the logic to decide this.
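A helper along these lines might look like the sketch below. It is only an illustration of factoring the two-line check into one place: `legalToInternalizeName` is a hypothetical function that takes a plain name instead of a `GlobalValue &`, and it encodes just the `llvm.` prefix test from the snippet quoted above, not the full logic of the internalize pass.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration; not part of LLVM.
// Returns false for names in the "llvm." namespace, true otherwise.
inline bool legalToInternalizeName(const std::string &Name) {
  // compare(0, 5, ...) looks at the first five characters (or fewer when
  // the name is shorter), so short names never match the prefix.
  return Name.compare(0, 5, "llvm.") != 0;
}
```

Both the linker and the internalizer could then call the same predicate instead of each repeating the prefix check.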

In D30738#697051, @pcc wrote:
The fix can be simplified to
if (P.first().startswith("llvm."))
  continue;
That is two lines of code. Even if we simplified the internalize pass in the same way, we should not add this much complexity just to avoid duplicating two lines.

Sure that is simple from a # lines standpoint, but the internalizer already had all the right logic, so we wouldn't have had this issue if we used internalizeModule to start with. My original suggestion was to move it out of here completely and have the clients invoke internalizeModule directly if they want to internalize, since I'm not sure the module linker should be in the business of doing internalization.

In D30738#697081, @mehdi_amini wrote:
In D30738#697051, @pcc wrote:
The fix can be simplified to
if (P.first().startswith("llvm."))
  continue;
I wouldn't want to see this, this is still duplicating logic. I'd be OK with a helper function `legalToInternalize(GlobalValue &)` that would factor out the logic to decide this.

Works for me, I guess.

In D30738#697082, @tejohnson wrote:
In D30738#697051, @pcc wrote:
The fix can be simplified to
if (P.first().startswith("llvm."))
  continue;
That is two lines of code. Even if we simplified the internalize pass in the same way, we should not add this much complexity just to avoid duplicating two lines.
Sure that is simple from a # lines standpoint, but the internalizer already had all the right logic, so we wouldn't have had this issue if we used internalizeModule to start with. My original suggestion was to move it out of here completely and have the clients invoke internalizeModule directly if they want to internalize, since I'm not sure the module linker should be in the business of doing internalization.

I'm not sure about that either. We may want to do something like what you propose, but it seems orthogonal to fixing the bug.

Feedback from Teresa and Mehdi

Regarding Peter's comment: While I like the idea of the legalToInternalize, I'm not convinced that it's really better than the callback. Most of the code in InternalizePass is concerned with deciding which GVs are legal to be internalized. Performing the actual internalization is only a small part, so why not keep a unified interface like it is today? Either way you'll need info from the linker about what symbols to internalize, and the callback nicely combines (1) indicating whether it is necessary and (2) performing it.

PS: I'm working on a test case.

In D30738#697456, @JDevlieghere wrote:

Feedback from Teresa and Mehdi

Regarding Peter's comment: While I like the idea of the legalToInternalize, I'm not convinced that it's really better than the callback. Most of the code in InternalizePass is concerned with deciding which GVs are legal to be internalized. Performing the actual internalization is only a small part, so why not keep a unified interface like it is today? Either way you'll need info from the linker about what symbols to internalize, and the callback nicely combines (1) indicating whether it is necessary and (2) performing it.

I remembered the other reason I wanted the internalizer to do this - if we have a llvm.used, then likely there is a GV in its aggregate that shouldn't be internalized. The internalizer will handle this appropriately. It also can't be (efficiently) encapsulated in a per-GV "legalToInternalize" helper.
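The point about llvm.used can be illustrated with a toy model. Nothing here is LLVM API: `ToyModule`, its `Used` list (standing in for the llvm.used aggregate), `collectAlwaysPreserved` and `toyInternalize` are hypothetical. The sketch shows why the preserved set is naturally computed once per module rather than inside a per-symbol predicate.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class Linkage { External, Internal };

struct ToyModule {
  std::map<std::string, Linkage> Symbols;
  std::vector<std::string> Used; // stand-in for the llvm.used aggregate
};

// Build the preserved set once per module. A per-symbol helper would have
// to rescan the used list for every symbol it is asked about.
inline std::set<std::string> collectAlwaysPreserved(const ToyModule &M) {
  return std::set<std::string>(M.Used.begin(), M.Used.end());
}

inline void toyInternalize(ToyModule &M,
                           const std::set<std::string> &Candidates) {
  const std::set<std::string> Preserved = collectAlwaysPreserved(M);
  for (const std::string &Name : Candidates) {
    auto It = M.Symbols.find(Name);
    if (It == M.Symbols.end() || Preserved.count(Name))
      continue; // unknown, or referenced from the used list: leave alone
    It->second = Linkage::Internal;
  }
}
```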

I was thinking more last night about just having the clients call internalizeModule() after module linking (directly, not from a callback) , but the issue becomes that the symbols that were previously in the combined module shouldn't be internalized. So you either need a mechanism for extracting the final value of ValuesToLink from the module linker, and only internalize those, or you need to build a set of defined values in the combined module before calling linkInModule, and use it to disallow internalization of those symbols. I think the callback approach is better, since the module linker has the best knowledge of *which* GVs should be candidates for internalization (aside from the legality of internalization, which is what is being fixed by using internalizeModule).

PS: I'm working on a test case.

Please include one that has a GV listed in a llvm.used or llvm.compiler.used, and make sure it isn't internalized.

Will take a look at the new version of the patch in a little bit. Thanks!

Code looks good, just a few nits about comments left.

include/llvm/Linker/Linker.h
45	Document new parameter
lib/Linker/LinkModules.cpp
40	Document these members (doxygen-style "///")
tools/llvm-link/llvm-link.cpp
274	s/falgs/flags/

Added test
Feedback from Teresa
Unified diff as suggested by Mehdi in D30792

tejohnson added inline comments. Mar 13 2017, 7:09 AM

clang/include/clang/CodeGen/CodeGenAction.h
40 ↗	(On Diff #91525)	Something I completely missed while reviewing D30792: I don't see how this is ever getting set. It looks like this line would need to be modified (from http://llvm-cs.pcc.me.uk/tools/clang/lib/CodeGen/CodeGenAction.cpp#819):
LinkModules.push_back( {std::move(ModuleOrErr.get()), F.PropagateAttrs, F.LinkFlags});
That's where the values from BitcodeFileToLink, which has the new Internalize flag set in CompilerInvocation.cpp, should be used to initialize the new LinkModule object. It looks like there are tests in tools/clang/CodeGenCUDA/ that check whether internalization happened properly, and they should fail. Ah, because of the field ordering, I think the LinkFlags value is essentially being applied to the Internalize flag, so it may be getting lucky on that front. But then I would think the LinkFlags field would be uninitialized, and the desired flag llvm::Linker::Flags::LinkOnlyNeeded would not be set in the CUDA case. Can you look at why test/CodeGenCUDA/link-device-bitcode.cu, which appears to test for both internalization and only-needed linking, isn't failing? If the test is not sufficient, please augment it so that it fails due to this problem.
llvm/test/Linker/link-flags.ll
25 ↗	(On Diff #91525)	Suggest making the type of the new @foo different, so it is clear which one is internalized vs not.
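The field-ordering concern raised above for CodeGenAction.h is a general hazard of positional aggregate initialization. The sketch below uses a hypothetical `ToyLinkModule` with plain `unsigned` fields (the real clang struct differs) to show how an initializer written for an older, shorter layout silently shifts values once a field is inserted in the middle. Note that in standard C++ an omitted trailing member of an aggregate is value-initialized (zero for `unsigned`) rather than left indeterminate.

```cpp
#include <cassert>

// Hypothetical layout for illustration; not the actual clang struct.
// All fields are unsigned here to keep the example free of conversions.
struct ToyLinkModule {
  unsigned PropagateAttrs;
  unsigned Internalize; // newly inserted between the two older fields
  unsigned LinkFlags;
};

// An initializer still written for the old two-field layout
// {PropagateAttrs, LinkFlags}.
inline ToyLinkModule makeWithStaleInitializer(unsigned PropagateAttrs,
                                              unsigned LinkFlags) {
  // The second value now lands in Internalize, and LinkFlags, which has no
  // initializer, is value-initialized to 0.
  return {PropagateAttrs, LinkFlags};
}
```

In this model a nonzero flags value ends up in `Internalize` while `LinkFlags` reads as 0, which matches the reviewer's suspicion that internalization appeared to work by luck while the link flags were lost.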

Addressed code review comments from @tejohnson. Thanks again for reviewing!

clang/include/clang/CodeGen/CodeGenAction.h
40 ↗	(On Diff #91525)	Thanks! I remember going over this line and telling myself "don't forget to add the new field here", but apparently I still forgot. The test wasn't triggered because I didn't have the NVPTX target enabled. With all targets enabled it failed, as expected.

LGTM. Thanks!

This revision is now accepted and ready to land. Mar 13 2017, 11:02 AM

Closed by commit rL297649: [Linker] Provide callback for internalization (authored by JDevlieghere). Mar 13 2017, 11:20 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
include/llvm/Linker/Linker.h	14 lines
lib/Linker/LinkModules.cpp	32 lines
tools/llvm-link/llvm-link.cpp	65 lines

Diff 91192

include/llvm/Linker/Linker.h

//===- Linker.h - Module Linker Interface ------------------------ C++ --===//		//===- Linker.h - Module Linker Interface ------------------------ C++ --===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_LINKER_LINKER_H		#ifndef LLVM_LINKER_LINKER_H
#define LLVM_LINKER_LINKER_H		#define LLVM_LINKER_LINKER_H

#include "llvm/Linker/IRMover.h"		#include "llvm/Linker/IRMover.h"
		#include "llvm/ADT/StringSet.h"

namespace llvm {		namespace llvm {
class Module;		class Module;
class StructType;		class StructType;
class Type;		class Type;

/// This class provides the core functionality of linking in LLVM. It keeps a		/// This class provides the core functionality of linking in LLVM. It keeps a
/// pointer to the merged module so far. It doesn't take ownership of the		/// pointer to the merged module so far. It doesn't take ownership of the
[13 lines not shown]	public:
Linker(Module &M);		Linker(Module &M);

/// \brief Link \p Src into the composite.		/// \brief Link \p Src into the composite.
///		///
/// Passing OverrideSymbols as true will have symbols from Src		/// Passing OverrideSymbols as true will have symbols from Src
/// shadow those in the Dest.		/// shadow those in the Dest.
///		///
/// Returns true on error.		/// Returns true on error.
bool linkInModule(std::unique_ptr<Module> Src, unsigned Flags = Flags::None);		bool linkInModule(std::unique_ptr<Module> Src, unsigned Flags = Flags::None,
		std::function<void(Module &, StringSet<>)>
static bool linkModules(Module &Dest, std::unique_ptr<Module> Src,		InternalizeCallback = {});
unsigned Flags = Flags::None);
		static bool
		linkModules(Module &Dest, std::unique_ptr<Module> Src,
		unsigned Flags = Flags::None,
		std::function<void(Module &, StringSet<>)>
		InternalizeCallback = {});
};		};

} // End llvm namespace		} // End llvm namespace

#endif		#endif

lib/Linker/LinkModules.cpp

//===- lib/Linker/LinkModules.cpp - Module Linker Implementation ----------===//		//===- lib/Linker/LinkModules.cpp - Module Linker Implementation ----------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements the LLVM module linker.		// This file implements the LLVM module linker.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "LinkDiagnosticInfo.h"		#include "LinkDiagnosticInfo.h"
#include "llvm-c/Linker.h"		#include "llvm-c/Linker.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/StringSet.h"
#include "llvm/IR/Comdat.h"		#include "llvm/IR/Comdat.h"
#include "llvm/IR/DiagnosticPrinter.h"		#include "llvm/IR/DiagnosticPrinter.h"
#include "llvm/IR/GlobalValue.h"		#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/Linker/Linker.h"		#include "llvm/Linker/Linker.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
using namespace llvm;		using namespace llvm;

namespace {		namespace {

/// This is an implementation class for the LinkModules function, which is the		/// This is an implementation class for the LinkModules function, which is the
/// entrypoint for this file.		/// entrypoint for this file.
class ModuleLinker {		class ModuleLinker {
IRMover &Mover;		IRMover &Mover;
std::unique_ptr<Module> SrcM;		std::unique_ptr<Module> SrcM;

SetVector<GlobalValue *> ValuesToLink;		SetVector<GlobalValue *> ValuesToLink;
StringSet<> Internalize;

/// For symbol clashes, prefer those from Src.		/// For symbol clashes, prefer those from Src.
unsigned Flags;		unsigned Flags;

		StringSet<> Internalize;
		std::function<void(Module &, StringSet<>)> InternalizeCallback;

/// Used as the callback for lazy linking.		/// Used as the callback for lazy linking.
/// The mover has just hit GV and we have to decide if it, and other members		/// The mover has just hit GV and we have to decide if it, and other members
/// of the same comdat, should be linked. Every member to be linked is passed		/// of the same comdat, should be linked. Every member to be linked is passed
/// to Add.		/// to Add.
void addLazyFor(GlobalValue &GV, const IRMover::ValueAdder &Add);		void addLazyFor(GlobalValue &GV, const IRMover::ValueAdder &Add);

bool shouldOverrideFromSrc() { return Flags & Linker::OverrideFromSrc; }		bool shouldOverrideFromSrc() { return Flags & Linker::OverrideFromSrc; }
bool shouldLinkOnlyNeeded() { return Flags & Linker::LinkOnlyNeeded; }		bool shouldLinkOnlyNeeded() { return Flags & Linker::LinkOnlyNeeded; }
[50 lines not shown]	class ModuleLinker {
/// Drop GV if it is a member of a comdat that we are dropping.		/// Drop GV if it is a member of a comdat that we are dropping.
/// This can happen with COFF's largest selection kind.		/// This can happen with COFF's largest selection kind.
void dropReplacedComdat(GlobalValue &GV,		void dropReplacedComdat(GlobalValue &GV,
const DenseSet<const Comdat *> &ReplacedDstComdats);		const DenseSet<const Comdat *> &ReplacedDstComdats);

bool linkIfNeeded(GlobalValue &GV);		bool linkIfNeeded(GlobalValue &GV);

public:		public:
ModuleLinker(IRMover &Mover, std::unique_ptr<Module> SrcM, unsigned Flags)		ModuleLinker(IRMover &Mover, std::unique_ptr<Module> SrcM, unsigned Flags,
: Mover(Mover), SrcM(std::move(SrcM)), Flags(Flags) {}		std::function<void(Module &, StringSet<>)> InternalizeCallback =
		std::function<void(Module &, StringSet<>)>())
		: Mover(Mover), SrcM(std::move(SrcM)), Flags(Flags),
		InternalizeCallback(std::move(InternalizeCallback)) {}

bool run();		bool run();
};		};
}		}

static GlobalValue::VisibilityTypes		static GlobalValue::VisibilityTypes
getMinVisibility(GlobalValue::VisibilityTypes A,		getMinVisibility(GlobalValue::VisibilityTypes A,
GlobalValue::VisibilityTypes B) {		GlobalValue::VisibilityTypes B) {
[425 lines not shown]	if (Error E = Mover.move(std::move(SrcM), ValuesToLink.getArrayRef(),
handleAllErrors(std::move(E), [&](ErrorInfoBase &EIB) {		handleAllErrors(std::move(E), [&](ErrorInfoBase &EIB) {
DstM.getContext().diagnose(LinkDiagnosticInfo(DS_Error, EIB.message()));		DstM.getContext().diagnose(LinkDiagnosticInfo(DS_Error, EIB.message()));
HasErrors = true;		HasErrors = true;
});		});
}		}
if (HasErrors)		if (HasErrors)
return true;		return true;

for (auto &P : Internalize) {		if (shouldInternalizeLinkedSymbols() && InternalizeCallback)
GlobalValue *GV = DstM.getNamedValue(P.first());		InternalizeCallback(DstM, Internalize);
GV->setLinkage(GlobalValue::InternalLinkage);
}

return false;		return false;
}		}

Linker::Linker(Module &M) : Mover(M) {}		Linker::Linker(Module &M) : Mover(M) {}

bool Linker::linkInModule(std::unique_ptr<Module> Src, unsigned Flags) {		bool Linker::linkInModule(
ModuleLinker ModLinker(Mover, std::move(Src), Flags);		std::unique_ptr<Module> Src, unsigned Flags,
		std::function<void(Module &, StringSet<>)> InternalizeCallback) {
		ModuleLinker ModLinker(Mover, std::move(Src), Flags,
		std::move(InternalizeCallback));
return ModLinker.run();		return ModLinker.run();
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// LinkModules entrypoint.		// LinkModules entrypoint.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// This function links two modules together, with the resulting Dest module		/// This function links two modules together, with the resulting Dest module
/// modified to be the composite of the two input modules. If an error occurs,		/// modified to be the composite of the two input modules. If an error occurs,
/// true is returned and ErrorMsg (if not null) is set to indicate the problem.		/// true is returned and ErrorMsg (if not null) is set to indicate the problem.
/// Upon failure, the Dest module could be in a modified state, and shouldn't be		/// Upon failure, the Dest module could be in a modified state, and shouldn't be
/// relied on to be consistent.		/// relied on to be consistent.
bool Linker::linkModules(Module &Dest, std::unique_ptr<Module> Src,		bool Linker::linkModules(
unsigned Flags) {		Module &Dest, std::unique_ptr<Module> Src, unsigned Flags,
		std::function<void(Module &, StringSet<>)> InternalizeCallback) {
Linker L(Dest);		Linker L(Dest);
return L.linkInModule(std::move(Src), Flags);		return L.linkInModule(std::move(Src), Flags, std::move(InternalizeCallback));
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// C API.		// C API.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

LLVMBool LLVMLinkModules2(LLVMModuleRef Dest, LLVMModuleRef Src) {		LLVMBool LLVMLinkModules2(LLVMModuleRef Dest, LLVMModuleRef Src) {
Module *D = unwrap(Dest);		Module *D = unwrap(Dest);
std::unique_ptr<Module> M(unwrap(Src));		std::unique_ptr<Module> M(unwrap(Src));
return Linker::linkModules(*D, std::move(M));		return Linker::linkModules(*D, std::move(M));
}		}

tools/llvm-link/llvm-link.cpp

[28 lines not shown]
#include "llvm/Support/ManagedStatic.h"		#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"
#include "llvm/Support/PrettyStackTrace.h"		#include "llvm/Support/PrettyStackTrace.h"
#include "llvm/Support/Signals.h"		#include "llvm/Support/Signals.h"
#include "llvm/Support/SourceMgr.h"		#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/SystemUtils.h"		#include "llvm/Support/SystemUtils.h"
#include "llvm/Support/ToolOutputFile.h"		#include "llvm/Support/ToolOutputFile.h"
#include "llvm/Transforms/IPO/FunctionImport.h"		#include "llvm/Transforms/IPO/FunctionImport.h"
		#include "llvm/Transforms/IPO/Internalize.h"
#include "llvm/Transforms/Utils/FunctionImportUtils.h"		#include "llvm/Transforms/Utils/FunctionImportUtils.h"

#include <memory>		#include <memory>
#include <utility>		#include <utility>
using namespace llvm;		using namespace llvm;

static cl::list<std::string>		static cl::list<std::string> InputFilenames(cl::Positional, cl::OneOrMore,
InputFilenames(cl::Positional, cl::OneOrMore,
cl::desc("<input bitcode files>"));		cl::desc("<input bitcode files>"));

static cl::list<std::string> OverridingInputs(		static cl::list<std::string> OverridingInputs(
"override", cl::ZeroOrMore, cl::value_desc("filename"),		"override", cl::ZeroOrMore, cl::value_desc("filename"),
cl::desc(		cl::desc(
"input bitcode file which can override previously defined symbol(s)"));		"input bitcode file which can override previously defined symbol(s)"));

// Option to simulate function importing for testing. This enables using		// Option to simulate function importing for testing. This enables using
// llvm-link to simulate ThinLTO backend processes.		// llvm-link to simulate ThinLTO backend processes.
static cl::list<std::string> Imports(		static cl::list<std::string> Imports(
"import", cl::ZeroOrMore, cl::value_desc("function:filename"),		"import", cl::ZeroOrMore, cl::value_desc("function:filename"),
cl::desc("Pair of function name and filename, where function should be "		cl::desc("Pair of function name and filename, where function should be "
"imported from bitcode in filename"));		"imported from bitcode in filename"));

// Option to support testing of function importing. The module summary		// Option to support testing of function importing. The module summary
// must be specified in the case were we request imports via the -import		// must be specified in the case were we request imports via the -import
// option, as well as when compiling any module with functions that may be		// option, as well as when compiling any module with functions that may be
// exported (imported by a different llvm-link -import invocation), to ensure		// exported (imported by a different llvm-link -import invocation), to ensure
// consistent promotion and renaming of locals.		// consistent promotion and renaming of locals.
static cl::opt<std::string>		static cl::opt<std::string>
SummaryIndex("summary-index", cl::desc("Module summary index filename"),		SummaryIndex("summary-index", cl::desc("Module summary index filename"),
cl::init(""), cl::value_desc("filename"));		cl::init(""), cl::value_desc("filename"));

static cl::opt<std::string>		static cl::opt<std::string> OutputFilename("o",
OutputFilename("o", cl::desc("Override output filename"), cl::init("-"),		cl::desc("Override output filename"),
		cl::init("-"),
cl::value_desc("filename"));		cl::value_desc("filename"));

static cl::opt<bool>		static cl::opt<bool> Internalize("internalize",
Internalize("internalize", cl::desc("Internalize linked symbols"));		cl::desc("Internalize linked symbols"));

static cl::opt<bool>		static cl::opt<bool>
DisableDITypeMap("disable-debug-info-type-map",		DisableDITypeMap("disable-debug-info-type-map",
cl::desc("Don't use a uniquing type map for debug info"));		cl::desc("Don't use a uniquing type map for debug info"));

static cl::opt<bool>		static cl::opt<bool> OnlyNeeded("only-needed",
OnlyNeeded("only-needed", cl::desc("Link only needed symbols"));		cl::desc("Link only needed symbols"));

static cl::opt<bool>		static cl::opt<bool> Force("f", cl::desc("Enable binary output on terminals"));
Force("f", cl::desc("Enable binary output on terminals"));

static cl::opt<bool>		static cl::opt<bool> DisableLazyLoad("disable-lazy-loading",
DisableLazyLoad("disable-lazy-loading",
cl::desc("Disable lazy module loading"));		cl::desc("Disable lazy module loading"));

static cl::opt<bool>		static cl::opt<bool>
OutputAssembly("S", cl::desc("Write output as LLVM assembly"), cl::Hidden);		OutputAssembly("S", cl::desc("Write output as LLVM assembly"), cl::Hidden);

static cl::opt<bool>		static cl::opt<bool> Verbose("v",
Verbose("v", cl::desc("Print information about actions taken"));		cl::desc("Print information about actions taken"));

static cl::opt<bool>		static cl::opt<bool> DumpAsm("d", cl::desc("Print assembly as linked"),
DumpAsm("d", cl::desc("Print assembly as linked"), cl::Hidden);		cl::Hidden);

static cl::opt<bool>		static cl::opt<bool> SuppressWarnings("suppress-warnings",
SuppressWarnings("suppress-warnings", cl::desc("Suppress all linking warnings"),		cl::desc("Suppress all linking warnings"),
cl::init(false));		cl::init(false));
		mehdi_amini: Please don't mix in these formatting changes. Did you run clang-format on the full file to get these? Usually `git clang-format` will format only the part of the code you change.

static cl::opt<bool> PreserveBitcodeUseListOrder(		static cl::opt<bool> PreserveBitcodeUseListOrder(
"preserve-bc-uselistorder",		"preserve-bc-uselistorder",
cl::desc("Preserve use-list order when writing LLVM bitcode."),		cl::desc("Preserve use-list order when writing LLVM bitcode."),
cl::init(true), cl::Hidden);		cl::init(true), cl::Hidden);

static cl::opt<bool> PreserveAssemblyUseListOrder(		static cl::opt<bool> PreserveAssemblyUseListOrder(
"preserve-ll-uselistorder",		"preserve-ll-uselistorder",
cl::desc("Preserve use-list order when writing LLVM assembly."),		cl::desc("Preserve use-list order when writing LLVM assembly."),
cl::init(false), cl::Hidden);		cl::init(false), cl::Hidden);

static ExitOnError ExitOnErr;		static ExitOnError ExitOnErr;

// Read the specified bitcode file in and return it. This routine searches the		// Read the specified bitcode file in and return it. This routine searches the
// link path for the specified file to try to find it...		// link path for the specified file to try to find it...
//		//
static std::unique_ptr<Module> loadFile(const char *argv0,		static std::unique_ptr<Module> loadFile(const char *argv0,
const std::string &FN,		const std::string &FN,
LLVMContext &Context,		LLVMContext &Context,
bool MaterializeMetadata = true) {		bool MaterializeMetadata = true) {
SMDiagnostic Err;		SMDiagnostic Err;
if (Verbose) errs() << "Loading '" << FN << "'\n";		if (Verbose)
		errs() << "Loading '" << FN << "'\n";
std::unique_ptr<Module> Result;		std::unique_ptr<Module> Result;
if (DisableLazyLoad)		if (DisableLazyLoad)
Result = parseIRFile(FN, Err, Context);		Result = parseIRFile(FN, Err, Context);
else		else
Result = getLazyIRFileModule(FN, Err, Context, !MaterializeMetadata);		Result = getLazyIRFileModule(FN, Err, Context, !MaterializeMetadata);

if (!Result) {		if (!Result) {
Err.print(argv0, errs());		Err.print(argv0, errs());
(132 lines not shown)	static bool importFunctions(const char *argv0, Module &DestModule) {
};		};
FunctionImporter Importer(*Index, CachedModuleLoader);		FunctionImporter Importer(*Index, CachedModuleLoader);
ExitOnErr(Importer.importFunctions(DestModule, ImportList));		ExitOnErr(Importer.importFunctions(DestModule, ImportList));

return true;		return true;
}		}

static bool linkFiles(const char *argv0, LLVMContext &Context, Linker &L,		static bool linkFiles(const char *argv0, LLVMContext &Context, Linker &L,
const cl::list<std::string> &Files,		const cl::list<std::string> &Files, unsigned Flags) {
unsigned Flags) {
// Filter out flags that don't apply to the first file we load.		// Filter out flags that don't apply to the first file we load.
unsigned ApplicableFlags = Flags & Linker::Flags::OverrideFromSrc;		unsigned ApplicableFlags = Flags & Linker::Flags::OverrideFromSrc;
for (const auto &File : Files) {		for (const auto &File : Files) {
		tejohnson: s/falgs/flags/
std::unique_ptr<Module> M = loadFile(argv0, File, Context);		std::unique_ptr<Module> M = loadFile(argv0, File, Context);
if (!M.get()) {		if (!M.get()) {
errs() << argv0 << ": error loading file '" << File << "'\n";		errs() << argv0 << ": error loading file '" << File << "'\n";
return false;		return false;
}		}

// Note that when ODR merging types cannot verify input files in here When		// Note that when ODR merging types cannot verify input files in here When
// doing that debug metadata in the src module might already be pointing to		// doing that debug metadata in the src module might already be pointing to
(22 lines not shown)	if (!SummaryIndex.empty()) {
// Promotion		// Promotion
if (renameModuleForThinLTO(M, Index))		if (renameModuleForThinLTO(M, Index))
return true;		return true;
}		}

if (Verbose)		if (Verbose)
errs() << "Linking in '" << File << "'\n";		errs() << "Linking in '" << File << "'\n";

if (L.linkInModule(std::move(M), ApplicableFlags))		if (L.linkInModule(
		std::move(M), ApplicableFlags, [](Module &M, StringSet<> GVS) {
		InternalizePass IP([&M, &GVS](const GlobalValue &GV) {
		return !GV.hasName() \|\| (GVS.count(GV.getName()) == 0);
		});
		IP.internalizeModule(M);
		}))
		mehdi_amini: Usually we avoid using a Pass outside of the PassManager; can you just call `internalizeModule` with your callback? Also, any reason you're taking the StringSet by value instead of by const reference?
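To illustrate the two points in the comment above, here is a minimal standalone sketch that does not use LLVM. `Symbol`, `internalizeSymbols` and `internalizeLinked` are hypothetical stand-ins (for `GlobalValue`, a free `internalizeModule`-style function, and the callback passed to `linkInModule`); only the predicate logic is taken from the diff. The set of linked names is taken by const reference, so nothing is copied per call.

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a global value: a name plus a linkage bit.
struct Symbol {
  std::string Name;
  bool Internal = false;
};

// Hypothetical stand-in for a free internalization function: every symbol
// that the MustPreserve predicate rejects is marked internal.
void internalizeSymbols(
    std::vector<Symbol> &Syms,
    const std::function<bool(const Symbol &)> &MustPreserve) {
  for (Symbol &S : Syms)
    if (!MustPreserve(S))
      S.Internal = true;
}

// Mirrors the predicate in the diff: preserve unnamed symbols and symbols
// that were not just linked in. Linked is taken by const reference.
void internalizeLinked(std::vector<Symbol> &Syms,
                       const std::set<std::string> &Linked) {
  internalizeSymbols(Syms, [&Linked](const Symbol &S) {
    return S.Name.empty() || Linked.count(S.Name) == 0;
  });
}
```

With symbols `main`, `helper` and an unnamed one, and `helper` as the only linked name, only `helper` ends up internal.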
return false;		return false;
// All linker flags apply to linking of subsequent files.		// All linker flags apply to linking of subsequent files.
ApplicableFlags = Flags;		ApplicableFlags = Flags;
}		}
		tejohnson: Why is this needed since linkInModule has a default for the InternalizeCallback parameter?
		JDevlieghere (Author): I'm sorry, I don't understand the question. The function is called with only two parameters, so the default argument is used for the callback. No callback means no internalization, so that's what we want in the else branch? Am I overlooking something?
		tejohnson: You are right, never mind this comment! Looked at it too fast.

return true;		return true;
}		}
		tejohnson: Previously the internalize flag was set before invoking this function from main(). With this being set after the first linkInModule it seems like a behavior change, or am I missing something?
		JDevlieghere (Author): Indeed; however, on line 275 it is cleared on the first iteration by AND'ing it with the OverrideFromSrc flag.
		tejohnson: Ah, that's what I was missing. Please add a comment about why this isn't being applied to the first iteration.
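The masking discussed in this thread can be sketched in isolation. The enum below is a hypothetical stand-in for `Linker::Flags` with illustrative values, and `flagsPerFile` is an invented helper that only records which flags each file would be linked with; the loop shape follows `linkFiles` above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Linker::Flags; the values are illustrative only.
enum LinkFlags : unsigned {
  None = 0,
  OverrideFromSrc = 1u << 0,
  LinkOnlyNeeded = 1u << 1,
  InternalizeLinkedSymbols = 1u << 2,
};

// Returns the flags that would be applied to each of NumFiles inputs.
// The first file keeps only OverrideFromSrc (if it was requested), so
// internalization is never applied to it; later files get all flags.
std::vector<unsigned> flagsPerFile(unsigned Flags, std::size_t NumFiles) {
  std::vector<unsigned> Applied;
  unsigned ApplicableFlags = Flags & OverrideFromSrc;
  for (std::size_t I = 0; I < NumFiles; ++I) {
    Applied.push_back(ApplicableFlags);
    // All flags apply to the linking of subsequent files.
    ApplicableFlags = Flags;
  }
  return Applied;
}
```

Under these assumptions, requesting internalization has no effect on the first file, whether or not `OverrideFromSrc` is also set.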

int main(int argc, char **argv) {		int main(int argc, char **argv) {
// Print a stack trace if we signal out.		// Print a stack trace if we signal out.
sys::PrintStackTraceOnErrorSignal(argv[0]);		sys::PrintStackTraceOnErrorSignal(argv[0]);
PrettyStackTraceProgram X(argc, argv);		PrettyStackTraceProgram X(argc, argv);

ExitOnErr.setBanner(std::string(argv[0]) + ": ");		ExitOnErr.setBanner(std::string(argv[0]) + ": ");

LLVMContext Context;		LLVMContext Context;
Context.setDiagnosticHandler(diagnosticHandler, nullptr, true);		Context.setDiagnosticHandler(diagnosticHandler, nullptr, true);

llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.		llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.
cl::ParseCommandLineOptions(argc, argv, "llvm linker\n");		cl::ParseCommandLineOptions(argc, argv, "llvm linker\n");

if (!DisableDITypeMap)		if (!DisableDITypeMap)
Context.enableDebugTypeODRUniquing();		Context.enableDebugTypeODRUniquing();

auto Composite = make_unique<Module>("llvm-link", Context);		auto Composite = make_unique<Module>("llvm-link", Context);
Linker L(*Composite);		Linker L(*Composite);

(11 lines not shown)	int main(int argc, char **argv) {
if (!linkFiles(argv[0], Context, L, OverridingInputs,		if (!linkFiles(argv[0], Context, L, OverridingInputs,
Flags \| Linker::Flags::OverrideFromSrc))		Flags \| Linker::Flags::OverrideFromSrc))
return 1;		return 1;

// Import any functions requested via -import		// Import any functions requested via -import
if (!importFunctions(argv[0], *Composite))		if (!importFunctions(argv[0], *Composite))
return 1;		return 1;

if (DumpAsm) errs() << "Here's the assembly:\n" << *Composite;		if (DumpAsm)
		errs() << "Here's the assembly:\n" << *Composite;

std::error_code EC;		std::error_code EC;
tool_output_file Out(OutputFilename, EC, sys::fs::F_None);		tool_output_file Out(OutputFilename, EC, sys::fs::F_None);
if (EC) {		if (EC) {
errs() << EC.message() << '\n';		errs() << EC.message() << '\n';
return 1;		return 1;
}		}

if (verifyModule(*Composite, &errs())) {		if (verifyModule(*Composite, &errs())) {
errs() << argv[0] << ": error: linked module is broken!\n";		errs() << argv[0] << ": error: linked module is broken!\n";
return 1;		return 1;
}		}

if (Verbose) errs() << "Writing bitcode...\n";		if (Verbose)
		errs() << "Writing bitcode...\n";
if (OutputAssembly) {		if (OutputAssembly) {
Composite->print(Out.os(), nullptr, PreserveAssemblyUseListOrder);		Composite->print(Out.os(), nullptr, PreserveAssemblyUseListOrder);
} else if (Force \|\| !CheckBitcodeOutputToConsole(Out.os(), true))		} else if (Force \|\| !CheckBitcodeOutputToConsole(Out.os(), true))
WriteBitcodeToFile(Composite.get(), Out.os(), PreserveBitcodeUseListOrder);		WriteBitcodeToFile(Composite.get(), Out.os(), PreserveBitcodeUseListOrder);

// Declare success.		// Declare success.
Out.keep();		Out.keep();

return 0;		return 0;
}		}