This introduces a resolution-based LTO API. The main advantage of this API over
existing APIs is that it allows the linker to supply a resolution for each
symbol in each object, rather than the combined object as a whole. This will
become increasingly important for use cases such as ThinLTO which require us
to process symbol resolutions in a more complicated way than just adjusting
linkage.
Details
Diff Detail
Event Timeline
lib/CodeGen/LTOBackend.cpp | ||
---|---|---|
142–149 ↗ | (On Diff #60111) | Apologies if this is a naive question - I'm still wrapping my head around this patch, but could opt/codegen/splitCodeGen be std::functions on Config rather than being plain functions that introspect the Config object to build the pass pipelines? This might allow you to decouple some of the pipeline setup logic from the LTO interface. It may also allow you to remove some of the hooks: E.g. If the user is supplying 'codegen', they can just add any module inspection code to their custom implementation, rather than having to set up 'PreCodeGenModuleHook'. |
I like Lang approach to decouple a bit more the configuration.
include/llvm/CodeGen/LTOBackend.h | ||
---|---|---|
59 ↗ | (On Diff #60111) | "to add its own resolved symbols": so the linker will modify the module? (I notice the Module is not const in the hook prototype) |
62 ↗ | (On Diff #60111) | What level of "thread-safety" are we talking about? Is it just about the linker internal structure? Can the linker safely assume that only one thread will call one of these callbacks for a given Module? For a given LLVMContext? |
67 ↗ | (On Diff #60111) | What is the Task "ID" at this point? This would be a task that will not codegen in the case of a parallel backend? |
94 ↗ | (On Diff #60111) | If the intent is just -save-temps, I'd rather have the Index passed as a const ref. |
107 ↗ | (On Diff #60111) | Why the Triple string and Target here? |
120 ↗ | (On Diff #60111) | This is a strange API at this level. |
136 ↗ | (On Diff #60111) | Why this LTOLLVMContext instead of a factory method in the Config class? |
include/llvm/LTO/LTO.h | ||
88 | I think some doc may be nice for the enum values (I have no idea what is comdat_alias right now). | |
91 | doc (public API in public header) | |
290 | Not clear why ParallelCodeGenParallelismLevel is not part of Config. | |
321 | There is too much state here. | |
383 | Why do we need this here? |
include/llvm/CodeGen/LTOBackend.h | ||
---|---|---|
59 ↗ | (On Diff #60111) | Yes, it would modify the module. The use case is to work around API deficiencies; the specific issue I had in mind was the special handling for common symbols required in the gold plugin (search for addCommons). I wouldn't be surprised if we had to implement a similar workaround in libLTO as well, but I would not expect lld for example to need to modify the module, as we control both sides of the interface. That said, the gold plugin only requires a mutable hook at one point (which happens to be post-internalize at the moment, but I think also pre-opt would work); maybe we should only allow a mutable hook at one of those points. |
62 ↗ | (On Diff #60111) | Yes, this is with regard to the linker's internal structure. On the LLVM side I think the right guarantee is a per-module guarantee (so the linker would be able to freely inspect and modify the provided module without having to worry about thread safety). Although of course at the moment this would be a per-context guarantee unless/until we made LLVMContext thread-safe. I will change this comment to mention this. |
67 ↗ | (On Diff #60111) |
I'm not sure what you are asking here. The task ID generally stays consistent across the whole pipeline.
Do you mean a distributed backend? In that case, none of the hooks will be called for ThinLTO tasks other than the combined module hook. I will document that. |
94 ↗ | (On Diff #60111) | Makes sense, will do. |
107 ↗ | (On Diff #60111) | You're right, we don't need these. I'll change the implementation to use Module::getTargetTriple(). |
120 ↗ | (On Diff #60111) | Yes, agree. After I finish the reorganization I mentioned earlier, I want to move this API to a library-internal header. |
136 ↗ | (On Diff #60111) | Because the LTOLLVMContext needs to own the DiagHandler. |
include/llvm/LTO/LTO.h | ||
88 | Will do. | |
91 | Will do | |
290 | The idea was that Config controls everything except how code generation is "orchestrated" (that's why I'm passing in a ThinBackend here for example).
I will document that more clearly. | |
321 | I think the code is made more straightforward and easier to understand by incrementally building the combined module (for regular LTO) or the combined index (for ThinLTO) from add(), as you can see from the code exactly what happens for each module. That requires us to keep this state here. Anyway, this state is an implementation detail, so it is not as important as the public API. | |
383 | Okay, since it looks like we're only using memory buffers here, we can probably just have an array of MemoryBuffers here (or maintain ModuleMap as a MapVector, or something). | |
lib/CodeGen/LTOBackend.cpp | ||
142–149 ↗ | (On Diff #60111) | That's an interesting suggestion. My initial concern with that was that it bakes the required state for each pipeline phase into the API. For example, codegen requires a Triple and a Target, and many of the ThinLTO phases would require a combined module index, and those would need to be passed into the pipeline function via the "hook" in Config. If we ever change the required state, that would mean a lot of churn for each user of the API. Although it occurred to me that we could avoid baking in the state like this: // In LTOBackend.cpp struct PromoteState { ModuleSummaryIndex &CombinedIndex; }; struct CodeGenState { Config &C; StringRef TheTriple; const Target *TheTarget; AddStreamFn AddStream; }; // In LTOBackend.h struct PromoteState; bool promote(size_t Task, Module &M, PromoteState &State); struct CodeGenState; bool codegen(size_t Task, Module &M, CodeGenState &State); struct Config { // ... std::function<bool(size_t Task, Module &M, PromoteState &State)> Promote; std::function<bool(size_t Task, Module &M, CodeGenState &State)> CodeGen; // ... }; that would still make the signatures of the Config "hooks" non-uniform, which would also make it more complicated to implement save-temps, which is about 90% of the point of these hooks. (Suppose I want to add save-temps to each pipeline phase, including "pre-optimization". I would not only need to write an individual wrapper function for each phase, but I would also need to know which phase comes "first" in order to run my pre-opt save-temps processing before it.) I would prefer to keep the hooks uniform unless there is a good reason not to do so. |
- Address review comments
- Remove TheTarget and TheTriple arguments and unneeded error_category (now that we have StringError)
- Pass index as const ref to CombinedIndexHook
- Remove ThinObjs, and change ModuleMap's type to MapVector
- Add Config::addSaveTemps; add initial llvm-lto2 test
- Wire the gold plugin up to addSaveTemps, and fix several tests that were failing before (didn't notice due to stale outputs in the test directory)
- Add a .resolution.txt to save-temps, and port comdat.ll
- Port emit-llvm.ll test to resolution.txt
- Remove api file feature
- Move LTO.cpp to Resolution subdirectory
- Move LTOBackend to LTO/Resolution; inline upgradeLinkage
- Remove unnecessary #includes from gold-plugin.cpp
- Add comments
- Remove unnecessary dep
In this version of the patch I started exercising the test harness so you can see what that looks like. As previously mentioned I also moved the code to a separate library (lib/LTO/Resolution) so we can depend on it from clang.
I started working on the clang side backend, but there's a hairy output stream yak to shave there, so it'll take a little more time.
include/llvm/CodeGen/LTOBackend.h | ||
---|---|---|
120 ↗ | (On Diff #60111) | I ended up folding this into the caller on the regular LTO side, as ThinLTO linkage upgrades now go via the module summary index. |
include/llvm/LTO/LTO.h | ||
88 | I ended up removing this error category and just using StringError for errors produced by this module. | |
91 | See above |
include/llvm/CodeGen/LTOBackend.h | ||
---|---|---|
67 ↗ | (On Diff #60111) | I meant the LTO parallel codegen: after merging the IR you will call this hook with a tuple <task-ID, module> that will never go through the codegen. |
include/llvm/LTO/Resolution/LTOBackend.h | ||
71 ↗ | (On Diff #61155) | The thread safety in LLVM is at the LLVMContext level, not the module level. |
include/llvm/CodeGen/LTOBackend.h | ||
---|---|---|
67 ↗ | (On Diff #60111) | In this case the task id would be 0, and it would be split into tasks 0..N-1. Although task 0 will end up using a different module for codegen (i.e. whichever partition happened to be assigned task id 0), the module provided to the hook at this point is still significant, as any changes made to the module (e.g. adding common symbols for gold) will be reflected in the partitions. |
include/llvm/LTO/Resolution/LTOBackend.h | ||
71 ↗ | (On Diff #61155) | Yes, but I wanted to describe the intention here should we ever relax the thread safety in LLVM. (For example, if we ever made it safe for multiple code generation threads to operate on a single module, the guarantee here would be stronger than the default guarantee.) |
Looks like it is getting very close. A few comments/questions below.
Also, it is a huge patch - suggest committing in a few different pieces:
- Move existing LTO.{cpp,h} to Resolution subdirectory
- Add LTOBackend.cpp, changes to LTO.{cpp,h}, and llvm-lto2 + tests
- Changes to gold-plugin.cpp to use new interfaces
include/llvm/LTO/Resolution/LTO.h | ||
---|---|---|
321 ↗ | (On Diff #61155) | Is the ThinLTO support still TODO? It looks like this can be removed as I see it here now. |
include/llvm/LTO/Resolution/LTOBackend.h | ||
106 ↗ | (On Diff #61155) | Although it is not called for the backend process (which also has a Config object per D21545). I do know what you are trying to say here - I think the distinction is that this is always invoked via LTO::run regardless of whether the backend is in-process. Whereas the earlier hooks are only invoked from the backends themselves (regarless of whether invoked in process or in a separate backend process) |
111 ↗ | (On Diff #61155) | Add doxygen comment |
lib/LTO/Resolution/LTO.cpp | ||
167 ↗ | (On Diff #61155) | Can we avoid creating this if we are not going to perform ThinLTO? E.g. Create lazily in runThinLto if it is still null? |
370 ↗ | (On Diff #61155) | Could move this to parent class since it is in both derived classes |
561 ↗ | (On Diff #61155) | Can you document why we are initializing Task to ParallelCodeGenParallelismLevel? I thought earlier it said somewhere that regular LTO (parallel or not) was TaskId 0 and ThinLTO started at TaskId 1. |
lib/LTO/Resolution/LTOBackend.cpp | ||
32 ↗ | (On Diff #61155) | I'm confused by what is going on here with OldHook. |
test/LTO/Resolution/X86/comdat.ll | ||
14 | Does this need to be specified here? It looks like a suitable default (with none of the flags) will be constructed by default if not. | |
73 | What causes the difference w.r.t. gold here (protected vs not)? | |
test/tools/gold/X86/coff.ll | ||
19 | Why the loss of local_unnamed_addr? | |
test/tools/gold/X86/thinlto_alias.ll | ||
24 | Since thinBackend() now invokes thinLTOInternalizeModule, why is this failing? | |
tools/gold/gold-plugin.cpp | ||
739 | Can we skip this if options::thinlto_index_only? Otherwise Backend is simply overwritten just below. | |
tools/llvm-lto2/llvm-lto2.cpp | ||
3 | Should probably add a comment at the top of the file about why this exists then. | |
21 | Need to describe format of "resolution" (e.g. 'p','l','x' and meanings). Also, it would be good to mention the default resolution for anything not specified here (which from the code it appears to be non-p, non-l, and non-x). |
Also, it is a huge patch - suggest committing in a few different pieces:
Makes sense. Could also split it up into differential revisions if it makes it easier to review.
include/llvm/LTO/Resolution/LTOBackend.h | ||
---|---|---|
106 ↗ | (On Diff #61155) | Clarified. |
lib/LTO/Resolution/LTO.cpp | ||
167 ↗ | (On Diff #61155) | Maybe, but that sounds like a premature optimization to me. |
lib/LTO/Resolution/LTOBackend.cpp | ||
32 ↗ | (On Diff #61155) | This is calling the hook provided by the linker, which still needs to run (e.g. in order to add common symbols). If the linker's hook returned false, we need to pass that result through. |
test/LTO/Resolution/X86/comdat.ll | ||
14 | After thinking about it some more, I reckon that in order to reduce the possibility of mistakes, we probably don't want to implement a default resolution in llvm-lto2, nor should we accept resolutions for non-existent symbols. I have made that change to the test harness. | |
73 | Unlike the existing gold plugin, we do not resolve the visibility of each symbol and apply it to the combined module. It is unnecessary to do so because gold has already resolved the symbol's visibility using information provided by the plugin (i.e. LDPV_*) and will apply it to the final output file. The corresponding gold test (test/LTO/Resolution/X86/comdat.ll) shows that the symbol receives the correct visibility. | |
test/tools/gold/X86/coff.ll | ||
19 | LTO only tracks whether the symbol is global unnamed_addr. We do not track local_unnamed_addr for similar reasons as for visibility: it is the linker's job to keep track of whether the presence of that attribute can permit internalization (or, for Mach-O, auto-hide), and its presence in the IR shouldn't affect the generated code. | |
test/tools/gold/X86/thinlto_alias.ll | ||
24 | The internalization done in thinLTOInternalizeModule for symbols that are not externally visible is separate from the internalization done for preempted alias targets (search for InternalLinkage in IRMover.cpp), which is not yet implemented for ThinLTO. I believe the practical effect here is that the linker will end up re-resolving the weak symbol from the resulting native object files. (This is similar to how we currently handle (non-ODR) weak or linkonce symbols, as we currently don't have a summary resolution that means "discard this symbol".) | |
tools/gold/gold-plugin.cpp | ||
739 | It seems more straightforward to just let it be overwritten. |
LGTM, with one nit below.
However, I'd like to wait to commit this until:
- Mehdi signs off
- D22302 goes in, enabling weak/linkonce resolution for gold. Your patch has the side effect of doing this enabling, and I think it is cleaner to have that functionality added separately from this patch adding the new API.
Mehdi, can you take a look at this patch and D22302 asap - this is getting on the critical path of some fixes and follow on patches I need to add for the distributed backend case (will send a separate update on that).
lib/LTO/Resolution/LTOBackend.cpp | ||
---|---|---|
32 ↗ | (On Diff #61155) | Ok, maybe rename to LinkerHook and/or add a comment. |
Can you update description / title (it is still marked as "wip" which make me think the revision is not ready).
Can you update description / title (it is still marked as "wip" which make me think the revision is not ready).
Done.
Can you add one test for the .resolution.txt file with llvm-lto2?
include/llvm/LTO/Config.h | ||
---|---|---|
83 | "will never be called *concurrently* from multiple threads" -> I don't think we want to promise to pin a Module or a task to a given thread. | |
147 | You may add that the real reason to have a class here is that you don't want to tie the lifetime of the context to the configuration object and you need to keep a copy of the diagnostic handler alive. | |
include/llvm/LTO/LTO.h | ||
103 | IIUC the reason we need a context is because we can't get the information we want on an InputFile without a context for now.
| |
321 | At least, can we wrap them in struct to separate this clearly: struct { unsigned ParallelCodeGenParallelismLevel; LTOLLVMContext Ctx; std::unique_ptr<Module> CombinedModule; IRMover Mover; } LTOState; // These fields are for ThinLTO. struct { ThinBackend Backend; ModuleSummaryIndex CombinedIndex; MapVector<StringRef, MemoryBufferRef> ModuleMap; DenseMap<GlobalValue::GUID, StringRef> PrevailingModuleForGUID; } ThinLTOState; It'll make the code more clear on violation (ThinLTO code accessing LTO state or vice-versa) | |
lib/LTO/LTO.cpp | ||
281 | This isn't clear to me "the client may want to add symbol definitions to it". Why doesn't the client add a regular LTO module if he needs to add its stuff? | |
lib/LTO/LTOBackend.cpp | ||
59 | How is this supposed to work with static archive? It'll write a bunch of file next to the input shared library? | |
150 | Nothing critical but we just already created a TM for the optimization pipeline. | |
tools/llvm-lto2/llvm-lto2.cpp | ||
3 | Agree | |
35 | "r" is really not very explicit for an option, I think we usually have more self-explaining names. | |
45 | Are there incompatible combination? (I don't think so, but just checking). |
lib/LTO/LTOBackend.cpp | ||
---|---|---|
59 | Yes, it will write the temp files next to the archive library - each constituent bitcode in the archive is given a unique module ID by the gold-plugin (see D20559 claim_file_hook()). I find it more intuitive to put the saved temp files next to the input object or archive, but open to other suggestions - that should probably change in a separate patch though as this is maintaining the current gold-plugin behavior. |
- Address review comments
- Add resolution test based on llvm-lto2
include/llvm/LTO/LTO.h | ||
---|---|---|
103 | Sorry, but I'd prefer not to do this.
Let's not pretend that we can parse input files in a multithreaded way that allows them to be used later for linking until we can actually do it. As things stand, doing this will probably regress gold plugin performance as we will need to parse the input file an additional time.
It doesn't seem like it will be particularly difficult to remove this argument in clients later, it would probably be on the order of a trivial mechanical change. | |
321 | Good idea, done. | |
lib/LTO/LTO.cpp | ||
281 | Because there may be no regular LTO module available as part of the link. For the gold plugin for example if some of the ThinLTO modules define common symbols then we need to make sure that one of the object files contains the resolved common symbols. In the case where there are no regular LTO modules that can be an object file containing just the common symbols. | |
lib/LTO/LTOBackend.cpp | ||
150 | Sure, but it's a little simpler to create this where needed, especially when using parallel code gen. Eventually we might want to just create this once in backend() and thinBackend(), but that would require TargetMachine to be thread safe. | |
tools/llvm-lto2/llvm-lto2.cpp | ||
35 | This flag will need to be passed multiple times, and should be familiar to any user of this tool, so it seems appropriate to give it a short name. | |
45 | I don't think so either. |
include/llvm/LTO/LTO.h | ||
---|---|---|
94 | I missed how this friend allows access Obj earlier, and it is not great. It is used to create a dependency on having a Module available and on having "GlobalValue *", while we want to be able to make all the resolution without loading IR. As soon as we have a proper symbol table, InputFile should not even expose access to a module or IR at all. | |
103 |
I'm not sure what you mean: this is *already* what we do with libLTO: The file is lazily loaded in its own context, this is cheap: the IR is not fully parsed but only the list of symbols. It is then reloaded separately, and non lazily, in the LTO context later for the purpose of LTO merging. Right now this choice is penalizing ThinLTO that won't be able to load symbols in parallel and still have to load two times anyway. | |
lib/LTO/LTO.cpp | ||
281 |
Client can just create one then... Module M; Lto.add(M); // done
I don't know what you're referring to here? | |
lib/LTO/LTOBackend.cpp | ||
151 | For ThinLTO at least, we don't support parallel codegen, so we don't need TM to be thread safe to be able to have only one TM created in thinBackend() and reused for both opt and codegen. I even think that it should be already achievable by having:
This way you only recreate an extra one in splitcodegen() when parallel codegen is enabled. Unless I missed something this is a net win. |
- Avoid creating another TargetMachine
- Add FIXME
include/llvm/LTO/LTO.h | ||
---|---|---|
94 | I agree with you, but until we have a symbol table, there needs to be some way for LTO to access IR entities in the input file. Added a FIXME for the removal of these friends. | |
103 |
Please let me know what you find. Bear in mind that you are the one advocating for a change here (the status quo in the only current client of this interface, the gold plugin, is a single context) so the burden is on you to justify it. If you wish to make a change, you are welcome to propose it separately from this patch. | |
lib/LTO/LTO.cpp | ||
281 |
Not as easily as that. Roughly: if (/* there are no regular LTO objects, oh, looks like we'll need more API surface */) { LLVMContext Ctx; Module M(Ctx); M.setTargetTriple(??? /* yet more API surface */); // add common symbols std::string BC; raw_str_ostream OS(BC); WriteBitcodeToFile(M, OS); ErrorOr<unique_ptr<InputFile>> F = InputFile::create(BC); if (!F) { /* error handling */ } Lto.add(std::move(*F)); } Seems much simpler to always have a module with index 0.
As I previously mentioned on this code review:
|
include/llvm/LTO/LTO.h | ||
---|---|---|
94 |
It is not clear to me why LTO needs to access any IR entity? If you need access an *IR entity* now, how could we don't need access to them tomorrow (<- read: with a symbol table)? It seems to me that anywhere you're accessing an IR entity directly through Obj right now is a place that should have an API on the InputFile or the Symbol class (like the one that exists: InputFile::getSourceFileName()). Assuming "Symbol" is abstracting the entries in the symbol table, for instance:
for (GlobalVariable &GV : M.globals()) if (GV.hasAppendingLinkage()) Keep.push_back(&GV); which should iterate on Symbols on the InputFile.
GlobalValue *GV = Input->Obj->getSymbolGV(Sym.I->getRawDataRefImpl()); if (Res.Prevailing && GV) ThinLto.PrevailingModuleForGUID[GV->getGUID()] = MBRef.getBufferIdentifier();
| |
103 | I'm not sure, it seems easy to turn it around: "you are the one proposing an API that is targeted for performance reason with a specific use case (and client) in mind, so the burden should be on you to justify it if you want this to get it in.", otherwise write a cleaner API. | |
lib/LTO/LTO.cpp | ||
256 | Is is possible for GV to be null here? | |
281 |
"Simpler" from the gold point of view, my point of view is that when adding 1000 ThinLTO modules we'll copy 1000 DataLayout around for no reason. | |
306 | (Same question here, can GV be null?) |
include/llvm/LTO/LTO.h | ||
---|---|---|
103 | Running ld64 on the 701 bitcode files that make an LTO build of llc (X86 only), and exiting right before the LTO optimizer starts, i.e. after the LTO merge happens (best of 10 times, on a 4-cores laptop):
|
include/llvm/LTO/LTO.h | ||
---|---|---|
94 |
The way I see this evolving is that any uses of IR constructs here will be replaced with uses of APIs for accessing the bitcode symbol table (except in places such as LTO::addRegularLto which actually need a Module). The functionality exposed via InputFile and Symbol will remain the primary way by which clients will access the object (i.e. they wouldn't access the symbol table directly). I would prefer not to expose the properties you propose to expose via InputFile/Symbol because they aren't strictly needed by clients for symbol resolution. | |
103 | Thanks. To me it isn't clear how much of that is due to parsing as opposed to parallel resolution, but I can accept that this would simplify clients. So I suppose I can try to make the change you suggested. | |
lib/LTO/LTO.cpp | ||
256 | Yes, if the symbol is defined by inline asm. | |
281 | I suppose we could avoid creating an empty regular LTO object unless a hook is set. | |
306 | Ditto |
include/llvm/LTO/LTO.h | ||
---|---|---|
103 | I chatted with Duncan and thought you may be interested by some history: his recollections of the original motivation for parsing in separate context was not parallelism but the amount of things that leaks to the context, while the modules wouldn't be used if they come from static archives anyway. |
include/llvm/LTO/LTO.h | ||
---|---|---|
103 | Interesting. LLD originally used multiple contexts, but now uses a single context because of the reversed tradeoff.
Huh, I would expect the linker to use the archive symbol table to load only the needed modules. |
include/llvm/LTO/LTO.h | ||
---|---|---|
103 |
Going from 29.86187751s to 29.814533787s does not justify clobbering any API IMO. |
include/llvm/LTO/LTO.h | ||
---|---|---|
103 | Sure. I previously wasn't aware that Rafael had previously measured this. |
- Remove LTO argument from InputFile::create
- Only generate code for the regular LTO module when necessary
Ping - Mehdi, any more comments? I am relying on the follow-on patch D21545 adding support for weak resolution and internalization in the ThinLTO backends.
include/llvm/LTO/LTO.h | ||
---|---|---|
126 | What about return Flags & object::BasicSymbolRef::SF_Global || Flags & object::BasicSymbolRef::SF_FormatSpecific; | |
lib/LTO/LTO.cpp | ||
281 | This is marked done, but doesn't seem to be? | |
lib/LTO/LTOBackend.cpp | ||
60 | Deriving an output temp file from a user-supplied *output* is fine, what does not make sense to me is to write stuff where the *input* files are. This is unlike what we usually do (the static archives could be in a read-only system directory for instance, this is quite common on our setup). You can maintain whatever behavior Gold currently has by passing an option to addSaveTemps maybe. Right now the API being addSaveTemps(std::string OutputFileName) I don't expect files to be written anywhere unexpected. Also the doxygen is pretty clear about what should be expected from this API: /// This is a convenience function that configures this Config object to write /// temporary files named after the given OutputFileName for each of the LTO /// phases to disk. A client can use this function to implement -save-temps. |
- Refresh
- Address review comments
- Implement linked objects file feature
lib/LTO/LTO.cpp | ||
---|---|---|
281 | See the first three lines of LTO::run. | |
lib/LTO/LTOBackend.cpp | ||
60 | I agree with you that we should move to a temp file naming scheme based on output files, however like Teresa I think that should be discussed separately from this patch. I don't think we need a flag either, whatever we decide to do here should apply to all linkers. Added a FIXME here to change the naming scheme. |
Last comments, we should be able to iterate in-tree after that.
include/llvm/LTO/LTO.h | ||
---|---|---|
310 | Should be RegularLTO I think. | |
319 | Similarly, should be ThinLTO | |
368 | addRegularLTO | |
370 | addThinLTO | |
373 | runRegularLTO | |
374 | runThinLTO | |
lib/LTO/LTO.cpp | ||
243 | HasThinLTOSummary | |
281 | I see, I was looking for an empty LTO module (RegularLto.CombinedModule) instead of object (as in produced object). | |
316 | Is this test what you intend? It seems reversed (if so, then it's probably not tested). | |
318 | This is still gonna copy a few SmallVector + a couple of string + a few other fields. And it's gonna happens for each and every ThinLTO module. This is just not the right place for that. Adding a triple/datalayout for LTO should not be handled here, it is just not the right place. LTO::run could handle it by poking at ThinLto.ModuleMap if needed. Similarly, RegularLto.CombinedModule should not be created unless explicitly added/requested by the linker (i.e. not call to make_unique<Module>("ld-temp.o", Ctx); in the ctor). | |
358 | Doc the above tests. | |
lib/LTO/LTOBackend.cpp | ||
61 | It seems we are not looking at this from the same angle: I was looking at it as a *new* API, not a refactoring of whatever the gold plugin was doing. So I can't see "this is gold's behavior" as a valid motivation here (if you want this to be NFC from the gold point of view, the gold-plugin can be patched in the first place). |
pcc asked me to take over the patch since he is out on vacation this month. Responses below. I have made most of the suggested changes, but not a couple for the reasons stated. Let me see if I can upload the new patch even though pcc created this one.
include/llvm/LTO/LTO.h | ||
---|---|---|
310 | Done | |
319 | Done | |
368 | Done | |
370 | Done | |
373 | Done | |
374 | Done | |
lib/LTO/LTO.cpp | ||
243 | Done | |
316 | I think what was meant here was to check if the RegularLTO.CombinedModule's target triple was still empty, and if so set it. I have changed it to that. | |
318 | It only does the copy once with the change I made, so there isn't any duplication in the copying anymore. This is the simplest place to set it based on an added ThinLTO module. I looked at delaying the creation of the CombinedModule, however, it turns out we frequently need it since even if we don't add any regular LTO files, we typically have callback hooks into the linker defined where the linker could add its own resolved symbols to the combined module, in which case it needs to be valid. I don't think we gain much by lazily creating the empty Module. | |
358 | Done | |
lib/LTO/LTOBackend.cpp | ||
61 | I removed this path so that the OutputFileName is always used, so it now matches the doxygen comment for the API. |
lib/LTO/LTO.cpp | ||
---|---|---|
318 | The ", we typically have callback hooks into the linker defined where the linker could add its own resolved symbols to the combined module" is not clear to me at all (LTOCodeGenerator will never do that AFAIK for example). |
(Note: I already approved this patch as good enough to be iterated on in-tree, so feel free to commit)
lib/LTO/LTO.cpp | ||
---|---|---|
318 | To be more explicit: I don't like this behavior because it creates a relatively strong coupling between the linker and the plugin. The linker changes the module in an unpredictable way that can conflict with assumption that the plugin implementation could make. It makes it harder to follow invariant in the plugin implementation, and makes it easy to break the client of the API (the linker). |
lib/LTO/LTO.cpp | ||
---|---|---|
318 | True, this is the case now because gold is the only user. I went ahead and committed (temporarily, as it turned out, will recommit after fixing some bot failures) so that I can also get pcc's follow-on patch in that I need. But will also work up a patch to make the creation of the combined module lazy. I'm not sure how to address your other concern about the linker changing the module in unpredictable ways, but the API does document that the linker may add symbols to the combined module via the callback hooks. | |
lib/LTO/LTOBackend.cpp | ||
61 | I didn't notice that this change was causing an issue since I had old output files around that caused tests to pass, but this caused a bot failure due to missing output files. The problem is not just that the names are different, but that they become difficult to correlate to the corresponding input source name. E.g. in the gold/X86/thinlto_linkonceresolution.ll test, where we have "%gold ... -o %t3.o %t2.o %t.o", the old .opt.bc temp files were named %t.o.4.opt.bc and %t2.o.4.opt.bc, but with the change I made become %t3.o.1.4.opt.bc and %t3.o.2.4.opt.bc. It isn't at all obvious which output file corresponds to which input file (numbering will depend on the ModuleMap iteration order). What I did was to revert the change I had made, but add a new bool flag parameter on addSaveTemps (UseInputModulePath) that is set to true by gold and will provoke the old behavior. We can iterate on a better solution. Probably pass in a (temp) directory name and output all the files in a tree rooted there (note that you could have same named module identifiers at different paths, so you can't just disambiguate by appending the basename of the input module). |
llvm/trunk/tools/gold/gold-plugin.cpp | ||
---|---|---|
95 ↗ | (On Diff #67690) | I think this one should be size_t as lto::InputFile::Symbol::getCommonSize returns size_t and then the std::max in line 629 fails to build in targets where size_t is unsigned int. Can you confirm if changing to size_t makes sense here? It does fix the build indeed. If this is OK I can commit the change. |
"will never be called *concurrently* from multiple threads" -> I don't think we want to promise to pin a Module or a task to a given thread.