Details
Diff Detail
Event Timeline
Sorry, but I think it is a bit too early for this.
There are just too many moving parts right now both in lld and thin lto. Even regular LTO is not complete. We are still not internalizing when producing .so files and not all tests pass on a clang bootstrap.
I still need to finish refactoring how we handle relocations and recover the performance hit we took when handling debug sections.
Can you put this on hold for 2 weeks?
I certainly agree that regular LTO should reach a certain level of maturity before this goes in. That's part of why I've been spending time improving LLD to the point that I can link and run Chromium with LTO before working on this change.
Anyway, happy to continue looking at regular LTO bugs for a while first if others agree that we should hold off for now.
Thanks for working on this. Saw the follow up discussion about whether it is too early, but I had a couple comments that I thought I would share now regardless. Overall looks good and pretty straightforward.
ELF/InputFiles.cpp | ||
---|---|---|
605 | The comment sounds like the opposite of what is being done here, but I am probably misunderstanding something...if ThinLto is true then this is a ThinLTO object not a regular LTO object, and it sounds like we are making it visible to regular LTO object files by setting this flag. | |
ELF/LTO.cpp | ||
295–312 | add SaveTemps handling? | |
299 | Maybe exit early above here if there are no ThinModules, so that the ThreadPool setup isn't done unnecessarily. |
I like the patch. Please hold on until Rafael finishes with relocations or general code churn and I'll take a closer look.
ELF/Driver.cpp | ||
---|---|---|
22 ↗ | (On Diff #55939) | Ditto. |
ELF/Driver.h | ||
17 | Do you need this? | |
ELF/Error.cpp | ||
55 | Even though this function is passed as std::function<> object, it should be verb rather than noun because it is a function. | |
ELF/Error.h | ||
16–18 | Remove blank lines. | |
ELF/LTO.cpp | ||
315–319 | I can see the reason why you chose to use auto here, but still I'd use real types for consistency even though it's a bit too verbose. | |
ELF/LTO.h | ||
33–35 | Remove blank lines. | |
ELF/SymbolTable.cpp | ||
49–53 | Is this what clang-format formatted? | |
231–234 | You can write if (auto *BC = dyn_cast<BitcodeFile>(File)) return !BC->ThinLto; return false; |
- Address review comments
ELF/Driver.cpp | ||
---|---|---|
22 ↗ | (On Diff #55939) | Removed. |
ELF/Driver.h | ||
17 | Yes. A definition of LLVMContext is required for the field LinkerDriver::Context. We were previously getting the definition from LTO.h via SymbolTable.h, until I removed the #include of LTO.h to SymbolTable.cpp. | |
ELF/InputFiles.cpp | ||
605 | The idea with this comment was that if the regular LTO object defines a symbol, we should mark it as used if we see a ThinLTO object with an undefined reference to that symbol. But yes, it works the other way as well. I have moved this comment to isRegularLtoInputFile with a slightly better explanation. | |
ELF/LTO.cpp | ||
295–312 | We can most likely add that separately. | |
ELF/SymbolTable.cpp | ||
49–53 | Reformatted. |
ELF/InputFiles.h | ||
---|---|---|
229 | I would make the name a bit more descriptive. How about hasThinLtoSummary? | |
ELF/LTO.cpp | ||
28 | Just to make this easier to review, can the first version be single threaded? | |
151 | Add a comment on why thin-lto requires initializing these bits early. | |
168 | Making this a static helper is a nice independent cleanup. Please commit and rebase. | |
228 | This logic is mostly duplicated with the regular lto case. Can you add a helper that return S or null? It should basically return S when we need to call undefine. | |
294 | This is duplicated with what splitCodeGen does. We should populate the passes in one place. | |
ELF/LTO.h | ||
70 | A more descriptive name would be nice. | |
ELF/SymbolTable.cpp | ||
50 | Why? |
ELF/LTO.cpp | ||
---|---|---|
74 | This cause the new lld to depend on Linker::linkModules, which should really not happen. Mehdi, you are working on having the function import pass use just the ir mover, correct? What is the next step in that? |
Note: I haven't looked at the patch, or the LTO implementation in lld, but my take on the current state of the gold plugin is that we expose far too much in the linker. I understand that the libLTO API was not nice, but I think the solution implemented in the gold plugin is terrible because it does not expose any generic API. The proper solution should be in my opinion to have a "LTOCodeGenerator" (and/or a "ThinLTOCodeGenerator") that exposes an appropriate interface to the linker, that would be shared by lld, gold, etc.
For instance any call to "setLinkage" should never happen in a linker-specific code.
Feel free to implement it as you want in lld, but I don't want to be bound by any "external" logic as I have been in the past with the gold plugin (i.e. I'll break lld and won't try to fix it).
ELF/LTO.cpp | ||
---|---|---|
74 | I believe the only reason I didn't go all the way was because the "Linker" handles some specific things about comdat that are not implemented in ThinLTO (we don't have comdat information in the summary). And since we don't have comdat on Darwin, I'm not use to work with them. |
The proper solution should be in my opinion to have a "LTOCodeGenerator" (and/or a "ThinLTOCodeGenerator") that exposes an appropriate interface to the linker, that would be shared by lld, gold, etc.
Maybe, but I'm not sure that we can know what the right interface should be without working implementations in multiple linkers.
I'll break lld and won't try to fix it
That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.
That said, I'm hoping to keep the API surface area used here relatively small, which should make it easier to update. One example of this is that as I mentioned before I'd like to move promotion into the compiler so that the linker doesn't need to do anything about it.
There *is* an LTO implementation that is working in multiple linker through libLTO (and gold that went with a dedicated plugin).
I'll break lld and won't try to fix it
That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.
That seems reasonable, but the conclusion on my side is that I then disagree with implementing ThinLTO in lld at all at this point as it is adding duplicating code that should not be there in the first place, and will just prevent progress in LLVM (the gold-plugin is already "technical debt")
Just to clarify my position: ThinLTO needs some coupling with the linker on multiple aspects, if the interface is not clearly defined and the logic is not shared, it means that evolving the function importer requires to change each linker plugin implementation, duplicating the same logic, which is a no-go IMO (this is the current state of ThinLTO in LLVM with some logic unfortunately duplicated between ThinLTOCodeGenerator and the gold-plugin).
I'm mostly talking about ThinLTO. ld64 is the only linker that uses ThinLTOCodeGenerator, to my knowledge.
I'll break lld and won't try to fix it
That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.
That seems reasonable, but the conclusion on my side is that I then disagree with implementing ThinLTO in lld at all at this point as it is adding duplicating code that should not be there in the first place, and will just prevent progress in LLVM (the gold-plugin is already "technical debt")
Not having ThinLTO in lld is already preventing progress in LLVM for me. It prevents me from being able to implement the extensions I need for CFI, for example.
If you're saying that you think that it would be better to extend ThinLTOCodeGenerator with what lld/gold/etc requires, I disagree, as I think it's premature to do that, especially since it's tied to the stable libLTO interface.
Just to clarify my position: ThinLTO needs some coupling with the linker on multiple aspects, if the interface is not clearly defined and the logic is not shared, it means that evolving the function importer requires to change each linker plugin implementation, duplicating the same logic, which is a no-go IMO (this is the current state of ThinLTO in LLVM with some logic unfortunately duplicated between ThinLTOCodeGenerator and the gold-plugin).
I agree that in the long term each linker should be using some well defined interface to ThinLTO, but I think we don't know what that interface should be yet. That's part of the reason why I'm implementing this directly first.
Besides, it's not a lot of logic, and as I already mentioned, I want to keep the API surface here small.
I'll break lld and won't try to fix it
That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.
That seems reasonable, but the conclusion on my side is that I then disagree with implementing ThinLTO in lld at all at this point as it is adding duplicating code that should not be there in the first place, and will just prevent progress in LLVM (the gold-plugin is already "technical debt")
Not having ThinLTO in lld is already preventing progress in LLVM for me. It prevents me from being able to implement the extensions I need for CFI, for example.
It is not clear to me: why can't you implement CFI in ThinLTO without ThinLTO in lld?
Is CFI is tied to lld in particular? I thought it is in production with Gold, even though admittedly I haven't followed closely all that stuff.
If you're saying that you think that it would be better to extend ThinLTOCodeGenerator with what lld/gold/etc requires, I disagree, as I think it's premature to do that, especially since it's tied to the stable libLTO interface.
The stable libLTO interface is not going away soon AFAIK, and we'll have to live with this for some time. The LTOCodeGenerator is terrible, and the ThinLTOCodeGenerator is not any better! Except maybe that it exposes a smaller API surface, and less constrained (for instance the number of object files produced does not have to match the number of inputs).
So do I think the ThinLTOCodeGenerator should be extended? No, on the contrary, it should be made a lot thinner by refactoring it, leaving in the end only the minimum possible that I'd expect a linker plugin to be: i.e. a bridge between the information needed by the ThinLTO logic and the linker itself.
The fact that the gold-plugin was developed as a monolithic blob is unfortunate, the fact that lld didn't bother refactoring any API with the gold-plugin seems wrong to me.
Just to clarify my position: ThinLTO needs some coupling with the linker on multiple aspects, if the interface is not clearly defined and the logic is not shared, it means that evolving the function importer requires to change each linker plugin implementation, duplicating the same logic, which is a no-go IMO (this is the current state of ThinLTO in LLVM with some logic unfortunately duplicated between ThinLTOCodeGenerator and the gold-plugin).
I agree that in the long term each linker should be using some well defined interface to ThinLTO, but I think we don't know what that interface should be yet. That's part of the reason why I'm implementing this directly first.
Besides, it's not a lot of logic, and as I already mentioned, I want to keep the API surface here small.
This will necessarily put some constrains on the FunctionImporter for instance, I already hit this with the GoldPlugin. This is what I mean by "I don't want to signup for maintaining this".
Of course I could go the way the gold-plugin and lld went: just for the FunctionImporter into ThinLTOCodeGenerator, but I don't think that how LLVM is usually developed.
CFI with regular LTO works in all linkers, but the ThinLTO version will be based on lld because of some missing features in the gold plugin interface. See this thread: https://groups.google.com/forum/#!topic/llvm-dev/OWmVaxNrIxo
If you're saying that you think that it would be better to extend ThinLTOCodeGenerator with what lld/gold/etc requires, I disagree, as I think it's premature to do that, especially since it's tied to the stable libLTO interface.
The stable libLTO interface is not going away soon AFAIK, and we'll have to live with this for some time. The LTOCodeGenerator is terrible, and the ThinLTOCodeGenerator is not any better! Except maybe that it exposes a smaller API surface, and less constrained (for instance the number of object files produced does not have to match the number of inputs).
So do I think the ThinLTOCodeGenerator should be extended? No, on the contrary, it should be made a lot thinner by refactoring it, leaving in the end only the minimum possible that I'd expect a linker plugin to be: i.e. a bridge between the information needed by the ThinLTO logic and the linker itself.
The fact that the gold-plugin was developed as a monolithic blob is unfortunate, the fact that lld didn't bother refactoring any API with the gold-plugin seems wrong to me.
It's not so much a matter of "not bothering" as the fact that it didn't really seem worth it due to the small amount of code needed, and it would just introduce another abstraction layer that people would need to understand.
But maybe there's some simple interface that we could implement now that could be shared at least between the gold plugin and lld. I suppose in the future in order to allow the interface to be used by ThinLTOCodeGenerator we may want to add caching or other features there, so it doesn't seem too unreasonable to introduce a place for those features to be added for all linkers. Let me see if I can come up with something.
Currently the support in gold-plugin for ThinLTO is pretty minimal, as it is here with the lld patch: read the per-module indexes, combine them, launch backend threads (which do the renameModuleForThinLTO and pass down the combined index). Most of the code in the gold-plugin is the thread launching/management.
What I would like to do is to refactor some of the handling out of libLTO, as noted earlier, but want to wait until the distributed backend communication is in place.
ELF/LTO.cpp | ||
---|---|---|
74 | For ld64/libLTO Mehdi has the support in to determine which linkonce values are needed so the lazy symbol linking is not used. However, we need that on the gold path, along with comdat handling which will need more info in the summary (added to my TODO list). I only want to do this once I have the distributed backend communication in place (D19636) so that there can be a single mechanism to communicate this and any linkage changes to the backends/importer. Then some of the handling that is in libLTO can be refactored out and used on all paths. Eventually this will lead to the function importer being able to use IRMover directly. |
Okay, I'm putting this on hold again until more bits are ready on the thinlto side of things.
I have updated the patch. I just want to make sure that people are happy with the situation around the ThinLTO API. I've been working on an LTO/ThinLTO API that could be used by both the gold plugin and lld which should address Mehdi's concerns, but it will take some time until that's ready.
Should we close this revision? Now that the new API is in LLVM and it is targeted by D24492 ; this seems obsolete, isn't it?
Do you need this?