This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols
ClosedPublic

Authored by steven_wu on Apr 3 2019, 12:26 PM.

Details

Summary

ThinLTOCodeGenerator currently does not preserve llvm.used symbols and
it can internalize them. In order to pass the necessary information to the
legacy ThinLTOCodeGenerator, the input to the code generator is
rewritten to be based on lto::InputFile.

This fixes: PR41236
rdar://problem/49293439

Diff Detail

Repository
rL LLVM

Event Timeline

steven_wu created this revision.Apr 3 2019, 12:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 3 2019, 12:26 PM

Also do note, I did a quick benchmark on thin-link time for libLTO.dylib. thin-link was 2.6s before the change vs. 4.0s after the change. Building IRSymtab during thin-link does introduce non-negligible overhead. It might be possible to build them async in threads to reduce the overhead.

Also do note, I did a quick benchmark on thin-link time for libLTO.dylib. thin-link was 2.6s before the change vs. 4.0s after the change. Building IRSymtab during thin-link does introduce non-negligible overhead. It might be possible to build them async in threads to reduce the overhead.

@pcc is this surprising? Typically we should just be loading it out of the bitcode file. There are cases where it has to be rebuilt, but theoretically those should be exceptions.

pcc added a comment.Apr 4 2019, 2:30 PM

Also do note, I did a quick benchmark on thin-link time for libLTO.dylib. thin-link was 2.6s before the change vs. 4.0s after the change. Building IRSymtab during thin-link does introduce non-negligible overhead. It might be possible to build them async in threads to reduce the overhead.

@pcc is this surprising? Typically we should just be loading it out of the bitcode file. There are cases where it has to be rebuilt, but theoretically those should be exceptions.

Yes, we normally just load it out of the bitcode file. The irsymtab only needs to be rebuilt when the version of the producer and the consumer do not match. You might be hitting the "upgrade" path here: http://llvm-cs.pcc.me.uk/lib/Object/IRSymtab.cpp#330

In D60226#1455485, @pcc wrote:

Also do note, I did a quick benchmark on thin-link time for libLTO.dylib. thin-link was 2.6s before the change vs. 4.0s after the change. Building IRSymtab during thin-link does introduce non-negligible overhead. It might be possible to build them async in threads to reduce the overhead.

@pcc is this surprising? Typically we should just be loading it out of the bitcode file. There are cases where it has to be rebuilt, but theoretically those should be exceptions.

Yes, we normally just load it out of the bitcode file. The irsymtab only needs to be rebuilt when the version of the producer and the consumer do not match. You might be hitting the "upgrade" path here: http://llvm-cs.pcc.me.uk/lib/Object/IRSymtab.cpp#330

This is likely. I didn't bootstrap the build. I will try again with a matching version.

In D60226#1455485, @pcc wrote:

Also do note, I did a quick benchmark on thin-link time for libLTO.dylib. thin-link was 2.6s before the change vs. 4.0s after the change. Building IRSymtab during thin-link does introduce non-negligible overhead. It might be possible to build them async in threads to reduce the overhead.

@pcc is this surprising? Typically we should just be loading it out of the bitcode file. There are cases where it has to be rebuilt, but theoretically those should be exceptions.

Yes, we normally just load it out of the bitcode file. The irsymtab only needs to be rebuilt when the version of the producer and the consumer do not match. You might be hitting the "upgrade" path here: http://llvm-cs.pcc.me.uk/lib/Object/IRSymtab.cpp#330

This is likely. I didn't bootstrap the build. I will try again with a matching version.

Without hitting the upgrade path, the time is almost negligible. Thanks for pointing that out.

I have mostly small suggestions below. Overall, it looks pretty good and straightforward.

llvm/lib/LTO/LTO.cpp
423 ↗(On Diff #193570)

Suggest renaming to something more generic like "getSingleBitcodeModule", and just change the assert text to "Expect only one module". In case we want to use this interface for something other than that client in the future.

llvm/tools/llvm-lto/llvm-lto.cpp
458 ↗(On Diff #193570)

Maybe combine this with loadFile, since they are always used together and the Buffer from loadFile doesn't seem to be used elsewhere?

462 ↗(On Diff #193570)

The new patch loses this option, which seems to be used by a couple tests under test/ThinLTO/X86.

steven_wu marked 2 inline comments as done.Apr 5 2019, 12:51 PM
steven_wu added inline comments.
llvm/tools/llvm-lto/llvm-lto.cpp
458 ↗(On Diff #193570)

This is just to extend the lifetime of buffer till module is created from InputFile.

462 ↗(On Diff #193570)

I will add this back.

steven_wu updated this revision to Diff 193958.Apr 5 2019, 1:28 PM

Address review feedback

Just noticed something else looking through it again, question below.

llvm/include/llvm/LTO/legacy/ThinLTOCodeGenerator.h
254 ↗(On Diff #193958)

This and the functions below take an InputFile pointer with a nullptr default - why the default (and null check in these routines), since it looks like it is always being passed (other than emitImports, note on that further down). Can you remove the default parameter value, and also make this take a reference and remove the null checks?

llvm/tools/llvm-lto/llvm-lto.cpp
626 ↗(On Diff #193958)

This should now pass Input.get()

steven_wu updated this revision to Diff 194162.Apr 8 2019, 9:30 AM

Review feedback.

steven_wu marked 2 inline comments as done.Apr 8 2019, 9:31 AM
This revision is now accepted and ready to land.Apr 8 2019, 10:25 AM
This revision was automatically updated to reflect the committed changes.

I reverted this because the old thinLTO test cases need to be updated to have data layout as a requirement to use IRSymtab. I will start a separate review to address all those issues.