Page MenuHomePhabricator

[LTO] Allow client to skip code gen
AbandonedPublic

Authored by tejohnson on Mar 17 2017, 2:01 PM.

Details

Reviewers
pcc
Summary

This is useful when invoking the ThinLTO backend via clang, which
already has handling (immediately after the existing early return
when invoking the ThinLTO backend) for setting up the requested code
generation.

We noticed that when invoking the thinBackend via clang (for the
distributed build case) that flags like -ffunction-sections and
-emit-llvm were not having the intended effect. This could have been
fixed by setting up the TargetOptions and the CodeGenFileType in the LTO
Config, but since clang already has handling for all of this, it is
straightforward to just let it do the handling.

Event Timeline

tejohnson created this revision.Mar 17 2017, 2:01 PM

We noticed that when invoking the thinBackend via clang (for the distributed build case) that flags like -ffunction-sections and -emit-llvm were not having the intended effect. This could have been fixed by setting up the TargetOptions and the CodeGenFileType in the LTO Config, but since clang already has handling for all of this, it is straightforward to just let it do the handling.

Well, I'm not convinced it is the right thing to do. It is the most straightforward thing to do for the distributed case, but it creates a discrepancy with the non-distributed case, and it won't allow to honor -ffunction-sections this way.
I also believe that it won't help to guarantee that wether you're using a distributed build or not you get the same binary.

We noticed that when invoking the thinBackend via clang (for the distributed build case) that flags like -ffunction-sections and -emit-llvm were not having the intended effect. This could have been fixed by setting up the TargetOptions and the CodeGenFileType in the LTO Config, but since clang already has handling for all of this, it is straightforward to just let it do the handling.

Well, I'm not convinced it is the right thing to do. It is the most straightforward thing to do for the distributed case, but it creates a discrepancy with the non-distributed case, and it won't allow to honor -ffunction-sections this way.

Can you clarify the concern? We do pass down -ffunction-sections correctly in the non-distributed case, e.g. via the gold plugin. The gold plugin which is only handling LTO doesn't have any built in code gen handling (anymore with the new LTO API).

I also believe that it won't help to guarantee that wether you're using a distributed build or not you get the same binary.

How so?

We noticed that when invoking the thinBackend via clang (for the distributed build case) that flags like -ffunction-sections and -emit-llvm were not having the intended effect. This could have been fixed by setting up the TargetOptions and the CodeGenFileType in the LTO Config, but since clang already has handling for all of this, it is straightforward to just let it do the handling.

Well, I'm not convinced it is the right thing to do. It is the most straightforward thing to do for the distributed case, but it creates a discrepancy with the non-distributed case, and it won't allow to honor -ffunction-sections this way.

Can you clarify the concern? We do pass down -ffunction-sections correctly in the non-distributed case, e.g. via the gold plugin. The gold plugin which is only handling LTO doesn't have any built in code gen handling (anymore with the new LTO API).

OK, then how do we pass down -ffunction-sections to the LTO API in the non-distributed case? I expect the same flow to happen here.

I also believe that it won't help to guarantee that wether you're using a distributed build or not you get the same binary.

How so?

The Target setup can take many options in the way it is created. Having the exact same flow and API used in both cases helps figuring out where any discrepancy comes from (it is also less moving pieces).

We noticed that when invoking the thinBackend via clang (for the distributed build case) that flags like -ffunction-sections and -emit-llvm were not having the intended effect. This could have been fixed by setting up the TargetOptions and the CodeGenFileType in the LTO Config, but since clang already has handling for all of this, it is straightforward to just let it do the handling.

Well, I'm not convinced it is the right thing to do. It is the most straightforward thing to do for the distributed case, but it creates a discrepancy with the non-distributed case, and it won't allow to honor -ffunction-sections this way.

Can you clarify the concern? We do pass down -ffunction-sections correctly in the non-distributed case, e.g. via the gold plugin. The gold plugin which is only handling LTO doesn't have any built in code gen handling (anymore with the new LTO API).

OK, then how do we pass down -ffunction-sections to the LTO API in the non-distributed case? I expect the same flow to happen here.

See AddGoldPlugin in clang for where we translate it into the equivalent llvm internal option passed to the plugin. It eventually gets used to init the target options in the Config in the plugin.

I also believe that it won't help to guarantee that wether you're using a distributed build or not you get the same binary.

How so?

The Target setup can take many options in the way it is created. Having the exact same flow and API used in both cases helps figuring out where any discrepancy comes from (it is also less moving pieces).

We're more limited by what gets passed to the plugin and how, e.g. in the function sections case.

We noticed that when invoking the thinBackend via clang (for the distributed build case) that flags like -ffunction-sections and -emit-llvm were not having the intended effect. This could have been fixed by setting up the TargetOptions and the CodeGenFileType in the LTO Config, but since clang already has handling for all of this, it is straightforward to just let it do the handling.

Well, I'm not convinced it is the right thing to do. It is the most straightforward thing to do for the distributed case, but it creates a discrepancy with the non-distributed case, and it won't allow to honor -ffunction-sections this way.

Can you clarify the concern? We do pass down -ffunction-sections correctly in the non-distributed case, e.g. via the gold plugin. The gold plugin which is only handling LTO doesn't have any built in code gen handling (anymore with the new LTO API).

OK, then how do we pass down -ffunction-sections to the LTO API in the non-distributed case? I expect the same flow to happen here.

See AddGoldPlugin in clang for where we translate it into the equivalent llvm internal option passed to the plugin. It eventually gets used to init the target options in the Config in the plugin.

Yes saw that, this is what I was expecting "somehow".

We're setting this up as a side-channel using cl::ParseCommandLineOptions(NumOpts, &options::extra[0]); in gold-plugin.cpp (oh boy I don't like the backend in general, and this is no exception).

And then: Conf.Options = InitTargetOptionsFromCodeGenFlags(); (Conf is lto::Config here, and Conf.Options is TargetOptions)

So I rather get clang to initialize the TargetOptions in the LTO config. I think it should be easy to extract EmitAssemblyHelper::CreateTargetOptions out of EmitAssemblyHelper::CreateTargetMachine and reuse it for this purpose.

I also believe that it won't help to guarantee that wether you're using a distributed build or not you get the same binary.

How so?

The Target setup can take many options in the way it is created. Having the exact same flow and API used in both cases helps figuring out where any discrepancy comes from (it is also less moving pieces).

We're more limited by what gets passed to the plugin and how, e.g. in the function sections case.

Not sure what you mean here, I suspect this is why I'm against using a different flow: we're not supposed to be passing flags differently one case from the other.

We noticed that when invoking the thinBackend via clang (for the distributed build case) that flags like -ffunction-sections and -emit-llvm were not having the intended effect. This could have been fixed by setting up the TargetOptions and the CodeGenFileType in the LTO Config, but since clang already has handling for all of this, it is straightforward to just let it do the handling.

Well, I'm not convinced it is the right thing to do. It is the most straightforward thing to do for the distributed case, but it creates a discrepancy with the non-distributed case, and it won't allow to honor -ffunction-sections this way.

Can you clarify the concern? We do pass down -ffunction-sections correctly in the non-distributed case, e.g. via the gold plugin. The gold plugin which is only handling LTO doesn't have any built in code gen handling (anymore with the new LTO API).

OK, then how do we pass down -ffunction-sections to the LTO API in the non-distributed case? I expect the same flow to happen here.

See AddGoldPlugin in clang for where we translate it into the equivalent llvm internal option passed to the plugin. It eventually gets used to init the target options in the Config in the plugin.

Yes saw that, this is what I was expecting "somehow".

We're setting this up as a side-channel using cl::ParseCommandLineOptions(NumOpts, &options::extra[0]); in gold-plugin.cpp (oh boy I don't like the backend in general, and this is no exception).

And then: Conf.Options = InitTargetOptionsFromCodeGenFlags(); (Conf is lto::Config here, and Conf.Options is TargetOptions)

So I rather get clang to initialize the TargetOptions in the LTO config. I think it should be easy to extract EmitAssemblyHelper::CreateTargetOptions out of EmitAssemblyHelper::CreateTargetMachine and reuse it for this purpose.

That is doable, but I don't see the advantage of duplicating the logic in EmitAssemblyHelper (e.g. CreateTargetMachine), just so we can invoke the codegen through LTO, since we have to be able to do this outside LTO in EmitAssemblyHelper in the non-LTO case in clang. We have to set up TargetOptions in either case, see below.

I also believe that it won't help to guarantee that wether you're using a distributed build or not you get the same binary.

How so?

The Target setup can take many options in the way it is created. Having the exact same flow and API used in both cases helps figuring out where any discrepancy comes from (it is also less moving pieces).

We're more limited by what gets passed to the plugin and how, e.g. in the function sections case.

Not sure what you mean here, I suspect this is why I'm against using a different flow: we're not supposed to be passing flags differently one case from the other.

There's not much choice here I think. In the case of doing a ThinLTO backend via clang, this information is passed in memory in the CodeGenOptions data structure. That is used to fill in the TargetOptions data structure in EmitAssemblyHelper::CreateTargetMachine.

In the in-process gold case, it needs to be passed from clang into the gold plugin somehow, which can't be done directly via the CodeGenOptions data structure, so we pass internal options. Then the gold plugin is responsible for getting those to init the TargetOptions (as you note above it does that via InitTargetOptionsFromCodeGenFlags).

So regardless of whether the clang invocation of the ThinLTO backends uses LTO to do the code gen, and sets up the TargetOptions struct within lto::Config, or just uses the existing EmitAssemblyHelper which sets up the TargetOptions there, the setup of TargetOptions is up to the LTO client.

So I rather get clang to initialize the TargetOptions in the LTO config. I think it should be easy to extract EmitAssemblyHelper::CreateTargetOptions out of EmitAssemblyHelper::CreateTargetMachine and reuse it for this purpose.

That is doable, but I don't see the advantage of duplicating the logic in EmitAssemblyHelper (e.g. CreateTargetMachine), just so we can invoke the codegen through LTO, since we have to be able to do this outside LTO in EmitAssemblyHelper in the non-LTO case in clang. We have to set up TargetOptions in either case, see below.

Not clear which duplicating logic you're referring to right now? I'm talking about *reusing* the logic (OK I may miss something because I haven't looked deep enough).

Not sure what you mean here, I suspect this is why I'm against using a different flow: we're not supposed to be passing flags differently one case from the other.

There's not much choice here I think. In the case of doing a ThinLTO backend via clang, this information is passed in memory in the CodeGenOptions data structure. That is used to fill in the TargetOptions data structure in EmitAssemblyHelper::CreateTargetMachine.

In the in-process gold case, it needs to be passed from clang into the gold plugin somehow, which can't be done directly via the CodeGenOptions data structure, so we pass internal options. Then the gold plugin is responsible for getting those to init the TargetOptions (as you note above it does that via InitTargetOptionsFromCodeGenFlags).

So regardless of whether the clang invocation of the ThinLTO backends uses LTO to do the code gen, and sets up the TargetOptions struct within lto::Config, or just uses the existing EmitAssemblyHelper which sets up the TargetOptions there, the setup of TargetOptions is up to the LTO client.

What I'm loooking at is to have the guarantee that by dumping (serializing) the lto::Config, you get all the information to reproduce. Ultimately I'd like to include this as part of "save-temps" and be able to reload the config to replay a backend. Having a single API all the way helps to setup something like this.

So I rather get clang to initialize the TargetOptions in the LTO config. I think it should be easy to extract EmitAssemblyHelper::CreateTargetOptions out of EmitAssemblyHelper::CreateTargetMachine and reuse it for this purpose.

That is doable, but I don't see the advantage of duplicating the logic in EmitAssemblyHelper (e.g. CreateTargetMachine), just so we can invoke the codegen through LTO, since we have to be able to do this outside LTO in EmitAssemblyHelper in the non-LTO case in clang. We have to set up TargetOptions in either case, see below.

Not clear which duplicating logic you're referring to right now? I'm talking about *reusing* the logic (OK I may miss something because I haven't looked deep enough).

I figured it'll be easier to figure what I had in mind by implementing it: https://reviews.llvm.org/D31114

tejohnson updated this revision to Diff 93534.Mar 30 2017, 1:38 PM

Add support for allowing LTO client to emit LLVM IR assembly or bitcode.

tejohnson abandoned this revision.Mar 31 2017, 6:55 AM

Subsumed by D31534