This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Out-of-process CodeGenerator for legacy C API
Needs ReviewPublic

Authored by steven_wu on Nov 29 2018, 2:55 PM.

Details

Summary

Add an out-of-process ThinLTOCodeGenerator for legacy C API. Instead of
spawning threads and run codegen inside libLTO, OutOfProcess
CodeGenerator constructs clang invocations and uses "clang -fthinlto-index="
option to generate object files. The goal of this design is to enable
libLTO ThinLTOCodeGenerator to expose more information to the build
system so the build system can manage the system resource better.

The other goals/benefits for this design are:

  • No linker change is needed using this model. libLTO is in charge of

constructing the job, waiting for the completion and handing back the results
to linker with the exact same API.

  • Although this patch only comes with a proof-of-concept out-of-process

code generator which invokes clang locally, it can be easily extended. It
provides a ThinLTOCodegenManager abstraction, which provides customization
points to adapt to any protocol or system.

  • It is currenlty using a cl::opt to switch between different ThinLTO

codegen mode which means user can easily toggle between different codegen
models.

Based on patch by Daniel Grumberg

Diff Detail

Event Timeline

steven_wu created this revision.Nov 29 2018, 2:55 PM

Ping. Is there any comments of implementing C API in this approach?

Ping. Is there any comments of implementing C API in this approach?

Sorry for not responding earlier. I'll take a closer look this week, but a couple of high level questions below. I also added @pcc for thoughts.

Can you give a description of how this interacts with your distributed build system? I see references to something called XPC, but I don't know what that is. Is this something that is Apple build system specific? AFAICT btw this approach would not work with our build system, although we are using the new LTO API so not affected directly. But I have a concern that this may make it harder to migrate the old to the new LTO API, but hopefully it can be designed so that the two approaches could both be supported in a single LTO API.

Regarding linker changes, the only linker changes required by our distributed build approach is to exit the linker early (after the thin link), and to write out the individual index files (which is actually handled within the new LTO API so not much needs to be done in the linker for that). Presumably you could do that latter part inside the old LTO API as well (as you are doing here in this patch).

Thanks for taking a look. This patch is adding the customization points for thinLTO legacy API, which the code generator constructs clang invocations to do code generation. There is no dependency on any build system here and it only has a prove of concept codegen manager which invokes clang directly and collect the result back. You can replace this codegen manager with any protocol that is needed to talk to build system to run clang codegen.
XPC is the way to send information between process on Darwin, which is probably what we are going to use to talk to build system. If interested, I can post a patch which have example how to construct XPC communications, but there isn’t a build system you can use to listen on the other side to run the job yet.
When I say there are no code change for linker, I really mean there is no need to change a single line of code (maybe we need to add an API to select codegen manager in the future). ld64 really has a different approach using C API, which it tries to map the object file output back to the bitcode it gets as input. Terminating and relaunching the linker might has unexpected semantic changes for LTO. In the long run, maybe ld64 needs to design a new set of APIs to use the new C++ APIs but this is out of scope of this patch.

dang added a comment.Dec 13 2018, 4:41 AM

Thanks for taking a look. This patch is adding the customization points for thinLTO legacy API, which the code generator constructs clang invocations to do code generation. There is no dependency on any build system here and it only has a prove of concept codegen manager which invokes clang directly and collect the result back. You can replace this codegen manager with any protocol that is needed to talk to build system to run clang codegen.
XPC is the way to send information between process on Darwin, which is probably what we are going to use to talk to build system. If interested, I can post a patch which have example how to construct XPC communications, but there isn’t a build system you can use to listen on the other side to run the job yet.
When I say there are no code change for linker, I really mean there is no need to change a single line of code (maybe we need to add an API to select codegen manager in the future). ld64 really has a different approach using C API, which it tries to map the object file output back to the bitcode it gets as input. Terminating and relaunching the linker might has unexpected semantic changes for LTO. In the long run, maybe ld64 needs to design a new set of APIs to use the new C++ APIs but this is out of scope of this patch.

To reiterate @steven_wu 's point the aim of this patch is to hide the mechanics of out-of-process code generation from ld64, because we can't exit the linker early-like with gold. This is because ld64 tries to remap symbol information deduced by looking at the bitcode to whatever it gets back after thinlto code generation. These customisation point are very similar to what a ThinBackendProc is in the new LTO API. In this case LocalProcessCodeGenManager is like a ThinBackendProc, that emits sliced indices and constructs a clang invocation that performs the codegen and then manages communicating the outputs to the linker via a callback.

Thanks for taking a look. This patch is adding the customization points for thinLTO legacy API, which the code generator constructs clang invocations to do code generation. There is no dependency on any build system here and it only has a prove of concept codegen manager which invokes clang directly and collect the result back. You can replace this codegen manager with any protocol that is needed to talk to build system to run clang codegen.
XPC is the way to send information between process on Darwin, which is probably what we are going to use to talk to build system. If interested, I can post a patch which have example how to construct XPC communications, but there isn’t a build system you can use to listen on the other side to run the job yet.
When I say there are no code change for linker, I really mean there is no need to change a single line of code (maybe we need to add an API to select codegen manager in the future). ld64 really has a different approach using C API, which it tries to map the object file output back to the bitcode it gets as input. Terminating and relaunching the linker might has unexpected semantic changes for LTO. In the long run, maybe ld64 needs to design a new set of APIs to use the new C++ APIs but this is out of scope of this patch.

To reiterate @steven_wu 's point the aim of this patch is to hide the mechanics of out-of-process code generation from ld64, because we can't exit the linker early-like with gold. This is because ld64 tries to remap symbol information deduced by looking at the bitcode to whatever it gets back after thinlto code generation. These customisation point are very similar to what a ThinBackendProc is in the new LTO API. In this case LocalProcessCodeGenManager is like a ThinBackendProc, that emits sliced indices and constructs a clang invocation that performs the codegen and then manages communicating the outputs to the linker via a callback.

Thanks for the clarifications. Would it be possible to utilize ThinBackendProc for this instead of a new CodeGenManager class? I.e. make a new derived version that does the index file write and spawns the local processes? The advantage is that it would start converging the implementations. And I think this could aid in refactoring suggested below to avoid duplication. Another advantage is that both LTO API's would have access to all backend implementations (in process, write indexes and exit, write indexes and use local processes, etc).

lib/LTO/ThinLTOOutOfProcessCodeGenerator.cpp
182

There's a huge amount of code duplication between this and the base ThinLTOCodeGenerator::run(). Perhaps ThinLTOCodeGenerator can be refactored to use a CodegenManager, and have an in-process thread version of CodegenManager so that both can use the same base run() method but the customization points would be in the CodegenManager virtual methods. Or even better, refactor to use ThinBackendProc (see comment above)?

Thanks for the clarifications. Would it be possible to utilize ThinBackendProc for this instead of a new CodeGenManager class? I.e. make a new derived version that does the index file write and spawns the local processes? The advantage is that it would start converging the implementations. And I think this could aid in refactoring suggested below to avoid duplication. Another advantage is that both LTO API's would have access to all backend implementations (in process, write indexes and exit, write indexes and use local processes, etc).

It think that is definitely possible, but as you mentioned above, most of the code duplication is coming from ThinLTOCodeGenerator. Most of the duplication is to help me maintain is downstream so the patch can be able to constantly rebasing it to master. I want to see how much value does this have to upstream for other legacy C API users before I refactor both new and old API to share more interfaces. I will ping Sony linker team offline to see if they have any comments on this.

The other possibility is to refactor the ThinLTOCodegenator to use ThinBackendProc as a starting point. I haven't done any research to see how easy it is to adapt the C APIs to the new ones but I think that should be doable as well.