This is an archive of the discontinued LLVM Phabricator instance.

[LTO] Support for embedding bitcode section during LTO
ClosedPublic

Authored by zapster on Sep 30 2019, 5:55 AM.

Details

Summary

This adds support for embedding bitcode in a binary during LTO. The libLTO gains supports the -lto-embed-bitcode flag. The option allows users of the LTO library to embed a bitcode section. For example, LLD can pass the option via ld.lld -mllvm=-lto-embed-bitcode.

This feature allows doing something comparable to clang -c -fembed-bitcode, but on the (LTO) linker level. Having bitcode alongside native code has many use-cases. To give an example, the MacOS linker can create a -bitcode_bundle section containing bitcode. Also, having this feature built into LLVM is an alternative to 3rd party tools such as wllvm or gllvm. As with these tools, this feature simplifies creating "whole-program" llvm bitcode files, but in contrast to wllvm/gllvm it does not rely on a specific llvm frontend/driver.

Diff Detail

Event Timeline

zapster created this revision.Sep 30 2019, 5:55 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
zapster updated this revision to Diff 222401.Sep 30 2019, 5:56 AM

re-ran clang format

zapster marked 5 inline comments as done.Sep 30 2019, 6:05 AM

Added inline remarks.

clang/lib/CodeGen/BackendUtil.cpp
1547

moved to llvm/lib/Bitcode/Writer/BitcodeWriter.cpp

clang/test/Frontend/x86-embed-bitcode.ll
2

This duplicates the embed-bitcode.ll test (which only runs on ARM) for x86.

llvm/include/llvm/Bitcode/BitcodeWriter.h
158

BitcodeWriter.h seems like a natural place for this functionality. However, suggestions for a better location are more than appreciated.

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
4670

moved from clang/lib/CodeGen/BackendUtil.cpp

llvm/lib/LTO/LTOBackend.cpp
327

This options parsing logic is duplicated from clang. We might want move this to a shared place, but I failed to find a good candidate. include/llvm/Support/CodeGen.h came to mind, but it currently only contains types, no code. Any suggestions?

(Ping)

Hi! I am still looking for reviews/reviewers for this change. Please let me know what I can do to make progress with this.
Many thanks in advance!

It seems reasonable to support this from LTO since it provides analogous support to what is available via clang without LTO. @pcc what do you think?

I only just skimmed it for now, one question below.

llvm/lib/LTO/LTOBackend.cpp
335

Does this value make any sense since CmdArgs is always empty below?

Thanks for your comment. I replied inline.

llvm/lib/LTO/LTOBackend.cpp
335

You are right. Currently, this option value is not utterly useful, but I kept it for compatibility with clangs -fembed-bitcode. It seems the marker section is needed (even if empty), otherwise certain tools will complain (notably the MacOS linker). Ideally, we would have something sensible to write into the section but I am not sure about reasonable values. I was not able to find any information about consumers of these commands. Pointers in that regard would be highly appreciated.

Anyhow, I lean toward keeping the options for compatibility with clang and for making it simple to properly implement the marker case. That said, I am perfectly fine with removing the option. In that case, I guess I should get rid of EmbedBitcodeKind altogether and always insert the bitcode and the marker, right?

On the other hand, we should maybe just consolidate the option (parsing) with the clang option. Some thoughts about that in my inline comment above.

tejohnson added inline comments.Dec 10 2019, 10:57 AM
llvm/lib/LTO/LTOBackend.cpp
335

It's unclear to me whether it is better to not provide the marker, and get a clear error, or put in an empty marker which may cause these tools to do the wrong thing. @steven_wu who might be able to provide some guidance about how this is used by the MacOS linker. Note that if someone is using MacOS to do their LTO linking, they won't even trigger this code, as that toolchain uses the old legacy LTO, which doesn't come here.

Presumably your tool doesn't use the marker section?

steven_wu added inline comments.Dec 10 2019, 4:15 PM
llvm/lib/LTO/LTOBackend.cpp
335

I don't think marker is useful for this use case. I will suggest not to have the marker LTO option unless we have the need. It is being tested by clang and that is probably enough coverage for testing.

I don't think any users of the legacy interface express interests in using a unified bitcode section so it is fine just to leave this out of the legacy C interface.

zapster marked an inline comment as done.Dec 11 2019, 2:02 AM
zapster added inline comments.
llvm/lib/LTO/LTOBackend.cpp
335

[...] if someone is using MacOS to do their LTO linking, they won't even trigger this code, as that toolchain uses the old legacy LTO

Right. I mixed up the case where the embedding logic was called from clang and from LTO. And of course, the MacOS linker does (unfortunately) not use this code.

Presumably your tool doesn't use the marker section?

Nope, I don't need the marker section.

I don't think marker is useful for this use case.

Right, makes sense. I'll update the review and make -lto-embed-bitcode a boolean option.

Thanks to both of you for the clarifications.

zapster updated this revision to Diff 233282.Dec 11 2019, 2:03 AM
zapster edited the summary of this revision. (Show Details)

Made -lto-embed-bitcode a boolean option.

zapster marked an inline comment as not done.Dec 11 2019, 2:04 AM
zapster marked an inline comment as done.
zapster edited the summary of this revision. (Show Details)Dec 11 2019, 2:15 AM
tejohnson accepted this revision.Dec 11 2019, 9:54 AM

LGTM with a few small suggestions before you commit.

llvm/lib/LTO/LTOBackend.cpp
317

Expand description a bit to indicate that it will embed LLVM bitcode in object files produced by LTO.

llvm/test/LTO/X86/embed-bitcode.ll
8

I don't think it is necessary to check anything other than "-lto-embed-bitcode=false" and "-lto-embed-bitcode", and the default. Otherwise you are just testing the standard behavior of cl::opt, and that isn't necessary for a new internal option.

This revision is now accepted and ready to land.Dec 11 2019, 9:54 AM
zapster updated this revision to Diff 233548.Dec 12 2019, 2:44 AM

Addressed suggestions from @tejohnson

zapster marked 2 inline comments as done.Dec 12 2019, 2:47 AM
zapster marked 4 inline comments as done.

Thanks again for you reviews! Since I do not have commit rights, I would be grateful if someone could push it for me.

Thanks again for you reviews! Since I do not have commit rights, I would be grateful if someone could push it for me.

The new LTO/X86/embed-bitcode.ll test is failing for me:

llvm-lto2: llvm/lib/LTO/LTO.cpp:514: void llvm::lto::LTO::addModuleToGlobalRes(ArrayRef<InputFile::Symbol>, ArrayRef<llvm::lto::SymbolResolution>, unsigned int, bool): Assertion `!GlobalRes.Prevailing && "Multiple prevailing defs are not allowed"' failed.

zapster updated this revision to Diff 233652.Dec 12 2019, 9:59 AM

The new LTO/X86/embed-bitcode.ll test is failing for me

Should be fixed now, thanks.

This revision was automatically updated to reflect the committed changes.
ychen added a subscriber: ychen.Dec 12 2019, 6:36 PM