This is an archive of the discontinued LLVM Phabricator instance.

[LTO][ELF] Add --lto-exit-on option
AbandonedPublic

Authored by Northbadge on Jun 14 2022, 1:01 PM.

Details

Summary

Allows the linker to "exit early" after running specific LTO module hooks.

This is useful in combination with --save-temps= (D127778) as it gives the flexibility of stopping lld once the desired files are emitted. This minimizes wasted time (from performing optimizations, codegen, linking, etc) when all that is desired from the linker is the "--save-temps=import" output (which is completed before the aforementioned tasks), as in the case of extracting a training corpus for MLGO.

Diff Detail

Event Timeline

Northbadge created this revision.Jun 14 2022, 1:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2022, 1:01 PM
Northbadge requested review of this revision.Jun 14 2022, 1:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2022, 1:01 PM
Northbadge edited the summary of this revision. (Show Details)Jun 14 2022, 1:04 PM
MaskRay added inline comments.Jun 14 2022, 10:58 PM
lld/ELF/Driver.cpp
1112

"unknown --lto-exit-on: " + xxx. If unknown, config->ltoExitOn should be empty.

Northbadge marked an inline comment as done.

Resolved inline comments, synced with parent, and removed -find from the test case. Thanks for the reviews!

tejohnson added inline comments.Jun 21 2022, 7:18 PM
lld/test/ELF/lto/lto-exit-on.ll
24

Why are we getting one when exiting on postpromote?

35

Why 3 here and below?

Northbadge added inline comments.Jun 21 2022, 9:15 PM
lld/test/ELF/lto/lto-exit-on.ll
24

the hook is called from lto::thinBackend AND also from LTO::runRegularLTO. When running the test case, LTO::runRegularLTO is called before lto::thinBackend from here. With postpromote being a ThinLTO exclusive hook, the postinternalize hook (from regularLTO) manages to run first before the postpromote hook is hit, thus creating 1 file. (I believe this is unavoidable without changes to that last snippet of code)

Now that you've pointed it out though, I realize that specifying --lto-exit-on=postinternalize would be ambiguous due to the two callsites, this is also the case for the preopt hook as I've just checked. The quick fix would be to make these two hooks invalid arguments for the option. Would this be an ok solution?

35

see above (that gives 1 "internalize" file)
and ThinLTO outputs the other 2, one for main.o, one for thin1.o

Northbadge marked 2 inline comments as done and an inline comment as not done.

Removed support for --lto-exit-on={preopt,postinternalize} due to ambiguity when running ThinLTO.

Northbadge marked an inline comment as not done.Jun 22 2022, 1:03 PM
Northbadge added inline comments.
lld/test/ELF/lto/lto-exit-on.ll
24

I've removed it for now, let me know if I should try doing something else

Update test to use subdirectories.

tejohnson added inline comments.Jun 24 2022, 10:49 AM
lld/test/ELF/lto/lto-exit-on.ll
24

I think it would be ok to leave the other cases in and just document it.

llvm/lib/LTO/LTOBackend.cpp
197

Probably want to issue an error here if it is something else.

Northbadge updated this revision to Diff 441859.Jul 1 2022, 7:45 PM
Northbadge marked 2 inline comments as done.

Bring back deleted options, updated to be consistent with parent diff, added llvm-lto2 tests.

Northbadge updated this revision to Diff 441860.Jul 1 2022, 7:49 PM

Missed a word, and braces

Northbadge updated this revision to Diff 441863.Jul 1 2022, 8:47 PM

Fix test case from missing word

The summary does not mention the motivation adding --lto-exit-on. Why is useful enough to have such an option?

Northbadge edited the summary of this revision. (Show Details)Jul 7 2022, 3:55 PM

The summary does not mention the motivation adding --lto-exit-on. Why is useful enough to have such an option?

Added some detail on that, thanks for pointing it out

I can see that fine-grained --save-temps=xxx is useful, but the utility of --lto-exit-on is unclear. Why is the overhead considered so high that an option needs to be added?

I can see that fine-grained --save-temps=xxx is useful, but the utility of --lto-exit-on is unclear. Why is the overhead considered so high that an option needs to be added?

It is admittedly more of a nice-to-have development tool/utility. On a 128 thread machine (with enough memory) the difference between having it vs not having it for the "exit on import" case with Chrome (~30k modules) is 1m20s vs 2m20s; extrapolating the parallelized portion to a more conventional 16 thread cpu that'd be a ~5 minute difference instead

I can see that fine-grained --save-temps=xxx is useful, but the utility of --lto-exit-on is unclear. Why is the overhead considered so high that an option needs to be added?

It is admittedly more of a nice-to-have development tool/utility. On a 128 thread machine (with enough memory) the difference between having it vs not having it for the "exit on import" case with Chrome (~30k modules) is 1m20s vs 2m20s; extrapolating the parallelized portion to a more conventional 16 thread cpu that'd be a ~5 minute difference instead

And this seems specific to in-process ThinLTO. A distributed ThinLTO (usually they should use this if their executable is large) ThinLTO does not need this. So this feature seems more restricted.

An update on this: I found a different use for --lto-exit-on (well, the original intended use: in MLGO training cycles which run >1M times, which I thought wouldn't have worked initially). I'm trial-ing to make sure it's doing what I'm expecting so we can probably put this patch on hold for a week or two, it isn't super-critical at the moment. I'll report back then (+details) and see how it goes from there

Northbadge abandoned this revision.Aug 17 2022, 4:44 PM

It's been a bit more than 2 weeks, but the idea I had didn't exactly pan out.

For future reference, in case someone revisits this: The idea was to concatenate the postimport .bc file with its associated thinlto index file (which is also bitcode). The naive way of doing this was to use llvm-dis on both the bitcode files, append the thinlto.ll lines to the postimport.ll lines and use llvm-as to go back to bitcode. This worked on small toy examples, but crashed on larger, real, files (Something about the asm/bitcode reader/writer, iirc, didn't have time to look into it further). The motivation for doing this was so that we can run lld with a single "hermetic" module which included the necessary index entries to prevent the deletion of locally-dead(not used within own module but used in other modules) functions. This would've allowed MLGO to use lld to perfectly replicate the ThinLTO environment when it is used in a non-distributed manner, and would've necessitated this patch to prevent the linker from crashing due to missing symbols.

Abandoning the patch as I won't be working on this in the near future. Thanks for all the feedback and guidance!