Page MenuHomePhabricator

[WIP][Do NOT review] LLD related changes for -ffat-lto-objects support
Needs ReviewPublic

Authored by arda on Aug 10 2022, 2:12 PM.

Details

Summary

If -ffat-lto-objects enabled and the input files are FatLTO object
files, LTO optimized executable is generated using the LTO bitcode sections
that are embbeded in these FatLTO object files. Otherwise, regular linking is
done using the assembly section in these FatLTO object files.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
arda requested review of this revision.Aug 10 2022, 2:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2022, 2:12 PM
arda removed a reviewer: MaskRay.Aug 10 2022, 2:14 PM
arda edited reviewers, added: phosek, paulkirth; removed: MaskRay.Aug 10 2022, 2:14 PM
arda updated this revision to Diff 451904.Aug 11 2022, 10:27 AM

Reverting the magic number based design. FatLTO object files stay as ELF files
with embedded bitcode. That's also how GCC does it. Current change checks if
there is a bitcode section named .fatlto.

arda updated this revision to Diff 451933.Aug 11 2022, 11:31 AM

Fix the order of bitcode and fatLTO section check.

arda updated this revision to Diff 451985.Aug 11 2022, 2:23 PM

Add the -fat-lto-objects flag as a prerequisite to checks.

arda updated this revision to Diff 452234.Aug 12 2022, 10:48 AM

Do not aggregate the .fatlto section when generating non-LTO executable.

I know this is WIP, but I think there are some Q's worth discussing as work progresses.

  1. How do "fat" objects interact w/ existing object types? i.e., regular object files and normal LTO objects?
  2. How do we control the behavior? can I just mix them? Are there combinations that should not work?
  3. Are the plans for testing? do we need special tests for the new format? or are the differences insignificant? can they be covered by existing tests?

I haven't seen these points discussed in the RFC or in these patches. How these are intended to work should probably be spelled out somewhere. Probably both in the code and in the documentation.

lld/ELF/Driver.cpp
2635–2639

maybe it makes sense to check for pointer validity separately now that there are 2 uses?

arda marked an inline comment as done.Aug 15 2022, 10:28 AM
arda updated this revision to Diff 452735.Aug 15 2022, 10:28 AM

Check for the pointer validity separately

paulkirth added inline comments.Aug 15 2022, 10:32 AM
lld/ELF/Driver.cpp
2644

LLVM coding standard prefers early return/continue to avoid nesting/indentation.

see https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code

arda updated this revision to Diff 452741.Aug 15 2022, 10:43 AM
arda marked an inline comment as done.

Check for the pointer validity separately with early continue.

arda updated this revision to Diff 453147.Aug 16 2022, 3:51 PM

Add a draft test. Not sure if this is the best way to verify the executables
are identical. I should probably ignore sections like .comment. I would
appreciate a feedback on the approach.

Is there a clang part of this change?

arda added a comment.Aug 17 2022, 10:04 AM

Is there a clang part of this change?

Yes,
https://reviews.llvm.org/D131090

I will be merging that patch to this one soon though.

arda updated this revision to Diff 453345.Aug 17 2022, 10:50 AM

Add clang and llvm related changes

Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2022, 10:50 AM
arda updated this revision to Diff 453386.Aug 17 2022, 11:54 AM

Replace section name from .fatlto to .llvm.lto

arda updated this revision to Diff 453550.Aug 18 2022, 12:42 AM

Use obj2yaml and yaml2obj to avoid unreadable object files

arda updated this revision to Diff 453751.Aug 18 2022, 12:24 PM

Add a clang Driver test

paulkirth added inline comments.Aug 18 2022, 2:59 PM
llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
5080 ↗(On Diff #453751)

Since this has been merged into this commit, I'll restate my concerns here from D131090.

Does this just do what embed bitcode does? My understanding was that LTO generated IR is emitted into object files differently than the embed bitcode scenario. I'm basing that on the discussion at https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

What I mean is that the way EmbedBitcode worked was not TheRightThing according to that discussion. It appears as though your change just copies its logic.

So if I want to compile a C file and get a fat-LTO object, how does that work? When do the LTO passes get added to the compilation pipeline? I don't mean the big whole program optimization stuff that gets invoked at link-time, I mean the module level passes and summary generation? It's not clear to me when those will run after invoking clang on a source file, since I don't see any config for setting that up as part of codegen.

can you elaborate on how this is intended to work? I seem to be missing something. In particular, I'd like to some kind of documentation describing how all the pieces are intended to work. I'd also like to see something outlining any limitations, such as different combinations of object files/flags that are not expected to work and why?

arda updated this revision to Diff 454048.Aug 19 2022, 9:34 AM

Invoke -flto=thin in the first stage of -ffat-lto-objects

phosek added inline comments.Aug 19 2022, 11:09 AM
clang/lib/Driver/ToolChains/Clang.cpp
7120–7122

Rather than assuming ThinLTO, I think this patch should assume Unified LTO as was suggested in https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977.

clang/test/Driver/fat-lto-objects.c
3–4

I don't think this test is useful, IIUC it checks that Clang accepts the -ffat-lto-objects flag, but that's also checked by the test below.

lld/test/ELF/fatlto/Inputs/a-thinLTO.ll
1–66 ↗(On Diff #454048)

Rather than using the output of Clang which tends to include a lot of unnecessary stuff, I'd consider either writing the minimal .ll file manually, or pruning this one to drop everything that's not strictly needed for the test.

lld/test/ELF/fatlto/Inputs/a.c
2–24

This seems unnecessarily complicated. You should only include code that's absolutely necessary for the test.

lld/test/ELF/fatlto/Inputs/a.h
2–4

You can use forward declarations and avoid this header altogether.

lld/test/ELF/fatlto/Inputs/main-thinLTO.ll
1–48 ↗(On Diff #454048)

Rather than using the output of Clang which tends to include a lot of unnecessary stuff, I'd consider either writing the minimal .ll file manually, or pruning this one to drop everything that's not strictly needed for the test.

lld/test/ELF/fatlto/Inputs/main.c
2–11

This seems unnecessarily complicated. You should only include code that's absolutely necessary for the test.

lld/test/ELF/fatlto/fatlto.test
5–7

LLD tests cannot depend on clang, that's a layering violation. It also doesn't seem like the output is being used anywhere so this could be safely removed.

13

The convention is to use uppercase names for prefixes.

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
5080 ↗(On Diff #453751)

I suspect that the approach we'll want to use is going to be akin to D130777 which uses a pass to control where in the pipeline the bitcode gets emitted.

paulkirth added inline comments.Aug 19 2022, 2:00 PM
lld/ELF/InputFiles.cpp
1721

Do we just keep going if there's an error? This is only checking if there's not an error, right? or does takeError do more than I remember?

phosek added inline comments.Aug 19 2022, 2:12 PM
lld/test/ELF/fatlto/Inputs/a-fatLTO.yaml
24

I think it'd be much nicer if you inline the LLVM IR in the YAML file and have yaml2obj convert that to bitcode automatically (similarly to how yaml2obj also handles DWARF inside ELF). That's not supported now as far as I'm aware and should be implemented as a separate change. It might be worth leaving a comment here explaining how the content of this section was produced and having a TODO regarding the yaml2obj improvement.

arda updated this revision to Diff 454585.Aug 22 2022, 11:58 AM
arda marked 9 inline comments as done.

Reimplement embed-bitcode part as a separate pass

arda added inline comments.Aug 22 2022, 12:00 PM
lld/test/ELF/fatlto/fatlto.test
5–7

Sorry for the confusion. These ones are just comments and do not run. I will remove them.

paulkirth added a comment.EditedAug 22 2022, 1:59 PM

I think the test strategy here needs to be:

  • Flags
    • clang, llvm, lld can parse the fat-object flags and configure themselves correctly
  • Format checking
    • llvm generates the correct object file sections when configured for fatlto
    • clang generates fatlto objects correctly (i.e. correctly sets up the codegen pipeline in llvm)
    • LLD can correctly link fat objects
  • Optimizations
    • the LTO pipeline is configured correctly
    • LLD can optimize fat objects the same way as normal bitcode files
clang/lib/Driver/ToolChains/Clang.cpp
7120–7122

Do we always do this under -flto=full, now? Doesn't that just essentially disable full LTO, since you set it to 'thin' when compiling for fat objects?

clang/test/CodeGen/embed-lto-fatlto.c
4–5

What property are you trying to check here? that the full LTO mode is replaced with thin?

lld/test/ELF/fatlto/fatlto.test
28–30

Can this test fail given how you generate the linker's inputs and what you're comparing?

So, does the diff fail if you don't pass the -fat-lto-objects flag on Line 13?

arda marked 2 inline comments as done.Aug 22 2022, 9:36 PM
arda added inline comments.
lld/ELF/InputFiles.cpp
1721

We just keep going if there is an error. So we end up using the assembly and not the embedded LTO module when generating the object file if there is no embedded bitcode in the input files.

lld/test/ELF/fatlto/fatlto.test
28–30

Yes, it fails.

sfertile added inline comments.
llvm/lib/Bitcode/Writer/EmbedBitcodePass.cpp
26

From the discourse discussion:

  1. it was suggested that we remove the existing -fembed-bitcode functionality as Apple has stop supporting it.
  2. mentioned that MLGO uses the option to embed the bitcode at various points in the pipeline depending on if its using LTO our not.

Do we want the pass to be a bit more generic and be able to specify the global to use for embedding, and the section name to use as arguments? That way MLGO can keep using the section name it uses now . It also helps consuming tools to disambiguate between bitcode embedded for lto purpose from bitcode embedded for other purposes.

arda marked 2 inline comments as done.Aug 23 2022, 5:43 PM
arda added inline comments.
llvm/lib/Bitcode/Writer/EmbedBitcodePass.cpp
26

I should have an answer in a couple of days. I will keep you updated.

arda marked an inline comment as done.Aug 24 2022, 10:53 PM
arda marked an inline comment as done.Aug 29 2022, 10:58 AM
arda added inline comments.
llvm/lib/Bitcode/Writer/EmbedBitcodePass.cpp
26

We have talked to folks at MLGO and reached a decision that their use case is sufficiently different from ours that it should be handled separately. More detailed discussion here:

[[ https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977/15?u=arda | [RFC] -ffat-lto-objects support on Discours ]]