This is an archive of the discontinued LLVM Phabricator instance.

ELF: Add initial ThinLTO support.
AbandonedPublic

Authored by pcc on Apr 20 2016, 6:49 PM.

Diff Detail

Event Timeline

pcc updated this revision to Diff 54451.Apr 20 2016, 6:49 PM
pcc retitled this revision from to ELF: Add initial ThinLTO support..
pcc updated this object.
pcc added a subscriber: llvm-commits.
rafael edited edge metadata.Apr 20 2016, 7:12 PM

Sorry, but I think it is a bit too early for this.

There are just too many moving parts right now both in lld and thin lto. Even regular LTO is not complete. We are still not internalizing when producing .so files and not all tests pass on a clang bootstrap.

I still need to finish refactoring how we handle relocations and recover the performance hit we took when handling debug sections.

Can you put this on hold for 2 weeks?

pcc added a comment.Apr 20 2016, 7:29 PM

I certainly agree that regular LTO should reach a certain level of maturity before this goes in. That's part of why I've been spending time improving LLD to the point that I can link and run Chromium with LTO before working on this change.

Anyway, happy to continue looking at regular LTO bugs for a while first if others agree that we should hold off for now.

tejohnson edited edge metadata.Apr 20 2016, 10:16 PM

Thanks for working on this. Saw the follow up discussion about whether it is too early, but I had a couple comments that I thought I would share now regardless. Overall looks good and pretty straightforward.

ELF/InputFiles.cpp
605

The comment sounds like the opposite of what is being done here, but I am probably misunderstanding something...if ThinLto is true then this is a ThinLTO object not a regular LTO object, and it sounds like we are making it visible to regular LTO object files by setting this flag.

ELF/LTO.cpp
295–312

add SaveTemps handling?

299

Maybe exit early above here if there are no ThinModules, so that the ThreadPool setup isn't done unnecessarily.

grimar added a subscriber: grimar.Apr 21 2016, 2:02 AM
davide edited edge metadata.Apr 23 2016, 3:28 PM

I like the patch. Please hold on until Rafael finishes with relocations or general code churn and I'll take a closer look.

pcc updated this revision to Diff 55939.May 2 2016, 8:13 PM
pcc edited edge metadata.

Refresh

ruiu added inline comments.May 2 2016, 8:41 PM
ELF/Driver.cpp
22 ↗(On Diff #55939)

Ditto.

ELF/Driver.h
17

Do you need this?

ELF/Error.cpp
55

Even though this function is passed as std::function<> object, it should be verb rather than noun because it is a function.

ELF/Error.h
16–18

Remove blank lines.

ELF/LTO.cpp
315–319

I can see the reason why you chose to use auto here, but still I'd use real types for consistency even though it's a bit too verbose.

ELF/LTO.h
33–35

Remove blank lines.

ELF/SymbolTable.cpp
49–53

Is this what clang-format formatted?

231–234

You can write

if (auto *BC = dyn_cast<BitcodeFile>(File))
  return !BC->ThinLto;
return false;
pcc updated this revision to Diff 56043.May 3 2016, 12:15 PM
pcc marked 6 inline comments as done.
  • Address review comments
ELF/Driver.cpp
22 ↗(On Diff #55939)

Removed.

ELF/Driver.h
17

Yes. A definition of LLVMContext is required for the field LinkerDriver::Context. We were previously getting the definition from LTO.h via SymbolTable.h, until I removed the #include of LTO.h to SymbolTable.cpp.

ELF/InputFiles.cpp
605

The idea with this comment was that if the regular LTO object defines a symbol, we should mark it as used if we see a ThinLTO object with an undefined reference to that symbol. But yes, it works the other way as well. I have moved this comment to isRegularLtoInputFile with a slightly better explanation.

ELF/LTO.cpp
295–312

We can most likely add that separately.

ELF/SymbolTable.cpp
49–53

Reformatted.

rafael added inline comments.May 3 2016, 2:29 PM
ELF/InputFiles.h
229

I would make the name a bit more descriptive. How about hasThinLtoSummary?

ELF/LTO.cpp
28

Just to make this easier to review, can the first version be single threaded?

151

Add a comment on why thin-lto requires initializing these bits early.

168

Making this a static helper is a nice independent cleanup. Please commit and rebase.

228

This logic is mostly duplicated with the regular lto case. Can you add a helper that return S or null? It should basically return S when we need to call undefine.

294

This is duplicated with what splitCodeGen does. We should populate the passes in one place.

ELF/LTO.h
70

A more descriptive name would be nice.

ELF/SymbolTable.cpp
50

Why?

rafael added inline comments.May 3 2016, 2:32 PM
ELF/LTO.cpp
74

This cause the new lld to depend on Linker::linkModules, which should really not happen.

Mehdi, you are working on having the function import pass use just the ir mover, correct? What is the next step in that?

Note: I haven't looked at the patch, or the LTO implementation in lld, but my take on the current state of the gold plugin is that we expose far too much in the linker. I understand that the libLTO API was not nice, but I think the solution implemented in the gold plugin is terrible because it does not expose any generic API. The proper solution should be in my opinion to have a "LTOCodeGenerator" (and/or a "ThinLTOCodeGenerator") that exposes an appropriate interface to the linker, that would be shared by lld, gold, etc.

For instance any call to "setLinkage" should never happen in a linker-specific code.

Feel free to implement it as you want in lld, but I don't want to be bound by any "external" logic as I have been in the past with the gold plugin (i.e. I'll break lld and won't try to fix it).

ELF/LTO.cpp
74

I believe the only reason I didn't go all the way was because the "Linker" handles some specific things about comdat that are not implemented in ThinLTO (we don't have comdat information in the summary). And since we don't have comdat on Darwin, I'm not use to work with them.
I don't know if Teresa has this in her TODO list.

pcc added a comment.May 3 2016, 4:40 PM

The proper solution should be in my opinion to have a "LTOCodeGenerator" (and/or a "ThinLTOCodeGenerator") that exposes an appropriate interface to the linker, that would be shared by lld, gold, etc.

Maybe, but I'm not sure that we can know what the right interface should be without working implementations in multiple linkers.

I'll break lld and won't try to fix it

That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.

That said, I'm hoping to keep the API surface area used here relatively small, which should make it easier to update. One example of this is that as I mentioned before I'd like to move promotion into the compiler so that the linker doesn't need to do anything about it.

In D19351#420587, @pcc wrote:

The proper solution should be in my opinion to have a "LTOCodeGenerator" (and/or a "ThinLTOCodeGenerator") that exposes an appropriate interface to the linker, that would be shared by lld, gold, etc.

Maybe, but I'm not sure that we can know what the right interface should be without working implementations in multiple linkers.

There *is* an LTO implementation that is working in multiple linker through libLTO (and gold that went with a dedicated plugin).

I'll break lld and won't try to fix it

That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.

That seems reasonable, but the conclusion on my side is that I then disagree with implementing ThinLTO in lld at all at this point as it is adding duplicating code that should not be there in the first place, and will just prevent progress in LLVM (the gold-plugin is already "technical debt")

Just to clarify my position: ThinLTO needs some coupling with the linker on multiple aspects, if the interface is not clearly defined and the logic is not shared, it means that evolving the function importer requires to change each linker plugin implementation, duplicating the same logic, which is a no-go IMO (this is the current state of ThinLTO in LLVM with some logic unfortunately duplicated between ThinLTOCodeGenerator and the gold-plugin).

pcc added a comment.May 3 2016, 5:23 PM
In D19351#420588, @joker.eph wrote:
In D19351#420587, @pcc wrote:

The proper solution should be in my opinion to have a "LTOCodeGenerator" (and/or a "ThinLTOCodeGenerator") that exposes an appropriate interface to the linker, that would be shared by lld, gold, etc.

Maybe, but I'm not sure that we can know what the right interface should be without working implementations in multiple linkers.

There *is* an LTO implementation that is working in multiple linker through libLTO (and gold that went with a dedicated plugin).

I'm mostly talking about ThinLTO. ld64 is the only linker that uses ThinLTOCodeGenerator, to my knowledge.

I'll break lld and won't try to fix it

That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.

That seems reasonable, but the conclusion on my side is that I then disagree with implementing ThinLTO in lld at all at this point as it is adding duplicating code that should not be there in the first place, and will just prevent progress in LLVM (the gold-plugin is already "technical debt")

Not having ThinLTO in lld is already preventing progress in LLVM for me. It prevents me from being able to implement the extensions I need for CFI, for example.

If you're saying that you think that it would be better to extend ThinLTOCodeGenerator with what lld/gold/etc requires, I disagree, as I think it's premature to do that, especially since it's tied to the stable libLTO interface.

Just to clarify my position: ThinLTO needs some coupling with the linker on multiple aspects, if the interface is not clearly defined and the logic is not shared, it means that evolving the function importer requires to change each linker plugin implementation, duplicating the same logic, which is a no-go IMO (this is the current state of ThinLTO in LLVM with some logic unfortunately duplicated between ThinLTOCodeGenerator and the gold-plugin).

I agree that in the long term each linker should be using some well defined interface to ThinLTO, but I think we don't know what that interface should be yet. That's part of the reason why I'm implementing this directly first.

Besides, it's not a lot of logic, and as I already mentioned, I want to keep the API surface here small.

I'll break lld and won't try to fix it

That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.

That seems reasonable, but the conclusion on my side is that I then disagree with implementing ThinLTO in lld at all at this point as it is adding duplicating code that should not be there in the first place, and will just prevent progress in LLVM (the gold-plugin is already "technical debt")

Not having ThinLTO in lld is already preventing progress in LLVM for me. It prevents me from being able to implement the extensions I need for CFI, for example.

It is not clear to me: why can't you implement CFI in ThinLTO without ThinLTO in lld?
Is CFI is tied to lld in particular? I thought it is in production with Gold, even though admittedly I haven't followed closely all that stuff.

If you're saying that you think that it would be better to extend ThinLTOCodeGenerator with what lld/gold/etc requires, I disagree, as I think it's premature to do that, especially since it's tied to the stable libLTO interface.

The stable libLTO interface is not going away soon AFAIK, and we'll have to live with this for some time. The LTOCodeGenerator is terrible, and the ThinLTOCodeGenerator is not any better! Except maybe that it exposes a smaller API surface, and less constrained (for instance the number of object files produced does not have to match the number of inputs).
So do I think the ThinLTOCodeGenerator should be extended? No, on the contrary, it should be made a lot thinner by refactoring it, leaving in the end only the minimum possible that I'd expect a linker plugin to be: i.e. a bridge between the information needed by the ThinLTO logic and the linker itself.
The fact that the gold-plugin was developed as a monolithic blob is unfortunate, the fact that lld didn't bother refactoring any API with the gold-plugin seems wrong to me.

Just to clarify my position: ThinLTO needs some coupling with the linker on multiple aspects, if the interface is not clearly defined and the logic is not shared, it means that evolving the function importer requires to change each linker plugin implementation, duplicating the same logic, which is a no-go IMO (this is the current state of ThinLTO in LLVM with some logic unfortunately duplicated between ThinLTOCodeGenerator and the gold-plugin).

I agree that in the long term each linker should be using some well defined interface to ThinLTO, but I think we don't know what that interface should be yet. That's part of the reason why I'm implementing this directly first.

Besides, it's not a lot of logic, and as I already mentioned, I want to keep the API surface here small.

This will necessarily put some constrains on the FunctionImporter for instance, I already hit this with the GoldPlugin. This is what I mean by "I don't want to signup for maintaining this".
Of course I could go the way the gold-plugin and lld went: just for the FunctionImporter into ThinLTOCodeGenerator, but I don't think that how LLVM is usually developed.

pcc added a comment.May 3 2016, 7:04 PM
In D19351#420615, @joker.eph wrote:

I'll break lld and won't try to fix it

That doesn't really seem reasonable to me. We all have a responsibility to keep the bots green.

That seems reasonable, but the conclusion on my side is that I then disagree with implementing ThinLTO in lld at all at this point as it is adding duplicating code that should not be there in the first place, and will just prevent progress in LLVM (the gold-plugin is already "technical debt")

Not having ThinLTO in lld is already preventing progress in LLVM for me. It prevents me from being able to implement the extensions I need for CFI, for example.

It is not clear to me: why can't you implement CFI in ThinLTO without ThinLTO in lld?
Is CFI is tied to lld in particular? I thought it is in production with Gold, even though admittedly I haven't followed closely all that stuff.

CFI with regular LTO works in all linkers, but the ThinLTO version will be based on lld because of some missing features in the gold plugin interface. See this thread: https://groups.google.com/forum/#!topic/llvm-dev/OWmVaxNrIxo

If you're saying that you think that it would be better to extend ThinLTOCodeGenerator with what lld/gold/etc requires, I disagree, as I think it's premature to do that, especially since it's tied to the stable libLTO interface.

The stable libLTO interface is not going away soon AFAIK, and we'll have to live with this for some time. The LTOCodeGenerator is terrible, and the ThinLTOCodeGenerator is not any better! Except maybe that it exposes a smaller API surface, and less constrained (for instance the number of object files produced does not have to match the number of inputs).
So do I think the ThinLTOCodeGenerator should be extended? No, on the contrary, it should be made a lot thinner by refactoring it, leaving in the end only the minimum possible that I'd expect a linker plugin to be: i.e. a bridge between the information needed by the ThinLTO logic and the linker itself.
The fact that the gold-plugin was developed as a monolithic blob is unfortunate, the fact that lld didn't bother refactoring any API with the gold-plugin seems wrong to me.

It's not so much a matter of "not bothering" as the fact that it didn't really seem worth it due to the small amount of code needed, and it would just introduce another abstraction layer that people would need to understand.

But maybe there's some simple interface that we could implement now that could be shared at least between the gold plugin and lld. I suppose in the future in order to allow the interface to be used by ThinLTOCodeGenerator we may want to add caching or other features there, so it doesn't seem too unreasonable to introduce a place for those features to be added for all linkers. Let me see if I can come up with something.

In D19351#420655, @pcc wrote:

If you're saying that you think that it would be better to extend ThinLTOCodeGenerator with what lld/gold/etc requires, I disagree, as I think it's premature to do that, especially since it's tied to the stable libLTO interface.

The stable libLTO interface is not going away soon AFAIK, and we'll have to live with this for some time. The LTOCodeGenerator is terrible, and the ThinLTOCodeGenerator is not any better! Except maybe that it exposes a smaller API surface, and less constrained (for instance the number of object files produced does not have to match the number of inputs).
So do I think the ThinLTOCodeGenerator should be extended? No, on the contrary, it should be made a lot thinner by refactoring it, leaving in the end only the minimum possible that I'd expect a linker plugin to be: i.e. a bridge between the information needed by the ThinLTO logic and the linker itself.
The fact that the gold-plugin was developed as a monolithic blob is unfortunate, the fact that lld didn't bother refactoring any API with the gold-plugin seems wrong to me.

It's not so much a matter of "not bothering" as the fact that it didn't really seem worth it due to the small amount of code needed, and it would just introduce another abstraction layer that people would need to understand.

But maybe there's some simple interface that we could implement now that could be shared at least between the gold plugin and lld. I suppose in the future in order to allow the interface to be used by ThinLTOCodeGenerator we may want to add caching or other features there, so it doesn't seem too unreasonable to introduce a place for those features to be added for all linkers. Let me see if I can come up with something.

Currently the support in gold-plugin for ThinLTO is pretty minimal, as it is here with the lld patch: read the per-module indexes, combine them, launch backend threads (which do the renameModuleForThinLTO and pass down the combined index). Most of the code in the gold-plugin is the thread launching/management.

What I would like to do is to refactor some of the handling out of libLTO, as noted earlier, but want to wait until the distributed backend communication is in place.

ELF/LTO.cpp
74

For ld64/libLTO Mehdi has the support in to determine which linkonce values are needed so the lazy symbol linking is not used. However, we need that on the gold path, along with comdat handling which will need more info in the summary (added to my TODO list). I only want to do this once I have the distributed backend communication in place (D19636) so that there can be a single mechanism to communicate this and any linkage changes to the backends/importer. Then some of the handling that is in libLTO can be refactored out and used on all paths. Eventually this will lead to the function importer being able to use IRMover directly.

pcc planned changes to this revision.May 5 2016, 11:59 AM

Okay, I'm putting this on hold again until more bits are ready on the thinlto side of things.

pcc updated this revision to Diff 57094.May 12 2016, 1:07 PM
pcc marked 6 inline comments as done.
  • Refresh, address comments
pcc added a comment.May 12 2016, 1:13 PM

I have updated the patch. I just want to make sure that people are happy with the situation around the ThinLTO API. I've been working on an LTO/ThinLTO API that could be used by both the gold plugin and lld which should address Mehdi's concerns, but it will take some time until that's ready.

Should we close this revision? Now that the new API is in LLVM and it is targeted by D24492 ; this seems obsolete, isn't it?

pcc abandoned this revision.Sep 15 2016, 1:18 PM