This is an archive of the discontinued LLVM Phabricator instance.

[LTO][Legacy] Decouple option parsing from LTOCodeGenerator
ClosedPublic

Authored by w2yehia on Dec 3 2020, 2:43 PM.

Details

Summary

in this patch we add a new libLTO API to specify debug options independent of an lto_code_gen_t.
This allows clients to pass codegen flags (through libLTO) which otherwise today are ignored.

Diff Detail

Event Timeline

w2yehia created this revision.Dec 3 2020, 2:43 PM
w2yehia requested review of this revision.Dec 3 2020, 2:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 3 2020, 2:43 PM
steven_wu requested changes to this revision.Dec 7 2020, 9:50 AM

Feedback has been provided on the mailing list. Using function attributes is preferred and changing legacy C API is not allowed.

This revision now requires changes to proceed.Dec 7 2020, 9:50 AM
w2yehia updated this revision to Diff 331729.Mar 18 2021, 5:21 PM

in this patch we add a new libLTO API to specify debug options independent of an lto_code_gen_t.
This allows clients to pass codegen flags (through libLTO) which otherwise today are ignored.

w2yehia updated this revision to Diff 331735.Mar 18 2021, 5:49 PM
w2yehia added a reviewer: fhahn.
steven_wu requested changes to this revision.Mar 19 2021, 9:41 AM

Like replied in email. The non-scaleable option is very intentional and debug_option is by its name for compiler developer only and should not be used in production. The preferred option is to create API to set every single attributes you want to set.

llvm/tools/lto/lto.cpp
468

If you have to pick an interface, use the new _array interface.

This is also not correct since you skipped lto_add_attrs function when you called your version.

This revision now requires changes to proceed.Mar 19 2021, 9:41 AM
w2yehia added inline comments.Mar 19 2021, 1:29 PM
llvm/tools/lto/lto.cpp
468

This is also not correct since you skipped lto_add_attrs function when you called your version.

I might be missing something.
lto_add_attrs will still run where it usually runs, in maybeParseOptions.
Note that I introduced a new boolean, parsedOptionsEarly, to indicate whether early parsing happens.
Also, I kept the code in maybeParseOptions untouched, so when we call unwrap(cg)->parseCodeGenDebugOptions() no parsing would happen if we haven't called unwrap(cg)->setCodeGenDebugOptions earlier.

I think some function and variable names could be improved to reflect their intended purpose.
For example, LTOCodeGenerator::parseCodeGenDebugOptions ==> LTOCodeGenerator::parseSavedCodeGenDebugOptions.
Also I can add a LTOCodeGenerator::postOptionParsing function, to contain code such as the save-temps registration.

w2yehia updated this revision to Diff 332043.Mar 19 2021, 6:01 PM

changes made:

  • use _array interface in lto_codegen_debug_options_early
  • replace parsedOptionsEarly and parsedOptions with a single enum variable.
w2yehia edited the summary of this revision. (Show Details)Mar 22 2021, 1:04 PM

@steven_wu I've addressed your comment. Please review when you have a chance.

I still don't like this implementation, but fine if you have to do this.

But please rename the function, this is no longer a lto_codegen function. Just named it something like lto_set_debug_options, which is also cleaner.

llvm/tools/lto/lto.cpp
426

Shouldn't this be:

if (optionParsingState ==  OptParsingState::NotParsed)
  unwrap(cg)->parseCodeGenDebugOptions();

if (optionParsingState !=  OptParsingState::Done)
    lto_add_attrs(cg);

optionParsingState = OptParsingState::Done;
w2yehia added inline comments.Mar 24 2021, 5:46 AM
llvm/tools/lto/lto.cpp
426

I considered that... the only disadvantage with guarding unwrap(cg)->parseCodeGenDebugOptions(); is that now it's not going to run when early option parsing occurs and we would need an additional function in LTOCodeGenerator to contain code that we want to execute after debug option parsing has occurred. For example the code to register save-temps (that D98183 is planning to add).

Other than that I think your suggestion is abit more intuitive when reading the code.
I'll let you decide which way we go.
Thanks

w2yehia updated this revision to Diff 332996.EditedMar 24 2021, 8:09 AM

renamed lto_codegen_debug_options_early to lto_set_debug_options, and added an entry to lto.exports

@steven_wu please take a look when you have a chance. Thanks

steven_wu added inline comments.Mar 30 2021, 9:31 AM
llvm/tools/lto/lto.cpp
426

D98183 is really a debug option which is not used for production and it doesn't matter when you parse that option.

Actually thinking about this again, maybe keep this as it was if the command line can just be parsed twice. The way you implement it makes two options exclusive. If you use your parsing function, lto_add_attrs cannot be reached so you lost some codegen options.

@steven_wu thanks for your review. Is this good to be merged then?

@steven_wu thanks for your review. Is this good to be merged then?

Can you just revert to use parsedOptions as boolean? The state you used right now doesn't add anything. You can just leave lto_set_debug_options not guarded right?

@steven_wu thanks for your review. Is this good to be merged then?

Can you just revert to use parsedOptions as boolean? The state you used right now doesn't add anything. You can just leave lto_set_debug_options not guarded right?

I consider the two APIs lto_set_debug_options and lto_codegen_debug_options to be mutually exclusive, so I need to keep track whether lto_set_debug_options was called.
Before creating the enum state, I had two booleans parsedOptions and parsedOptionsEarly where parsedOptionsEarly is a boolean I added to tell that early options has been set (i.e. lto_set_debug_options called).
However, I cannot set parsedOptions when my new API function is called because I want maybeParseOptions to still do it's thing. So I needed that second boolean.

So I then created the state variable in place of the two booleans (cleaner in my opinion and avoids having to check combinations of the two booleans) to tell which state we're in:

NotParsed, // Initial state. before either lto_set_debug_options or maybeParseOptions is called.
Early,     // After lto_set_debug_options is called. This state will be skipped if user calls lto_codegen_debug_options.
Done       // After maybeParseOptions is called (i.e. unwrap(cg)->parseCodeGenDebugOptions() and lto_add_attrs(cg) have executed)

The state can either change NotParsed -> Done or NotParsed -> Early -> Done depending on which API was used to set debug options.

@steven_wu thanks for your review. Is this good to be merged then?

Can you just revert to use parsedOptions as boolean? The state you used right now doesn't add anything. You can just leave lto_set_debug_options not guarded right?

I consider the two APIs lto_set_debug_options and lto_codegen_debug_options to be mutually exclusive, so I need to keep track whether lto_set_debug_options was called.
Before creating the enum state, I had two booleans parsedOptions and parsedOptionsEarly where parsedOptionsEarly is a boolean I added to tell that early options has been set (i.e. lto_set_debug_options called).
However, I cannot set parsedOptions when my new API function is called because I want maybeParseOptions to still do it's thing. So I needed that second boolean.

So I then created the state variable in place of the two booleans (cleaner in my opinion and avoids having to check combinations of the two booleans) to tell which state we're in:

NotParsed, // Initial state. before either lto_set_debug_options or maybeParseOptions is called.
Early,     // After lto_set_debug_options is called. This state will be skipped if user calls lto_codegen_debug_options.
Done       // After maybeParseOptions is called (i.e. unwrap(cg)->parseCodeGenDebugOptions() and lto_add_attrs(cg) have executed)

The state can either change NotParsed -> Done or NotParsed -> Early -> Done depending on which API was used to set debug options.

Is there a problem if both APIs are called (providing they don't conflict with each other or duplicate arguments?)

Is there a problem if both APIs are called (providing they don't conflict with each other or duplicate arguments?)

If you call both APIs, then you run cl::ParseCommandLineOptions twice. I haven't tested it, but I'm pretty sure it's not intended to be called twice.

steven_wu accepted this revision.Mar 30 2021, 2:00 PM

Ok, fine with that.

This revision is now accepted and ready to land.Mar 30 2021, 2:00 PM
This revision was landed with ongoing or failed builds.Mar 31 2021, 9:44 AM
This revision was automatically updated to reflect the committed changes.