This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO/gold] Handle bitcode archives
ClosedPublic

Authored by tejohnson on May 23 2016, 10:15 PM.

Details

Summary

Several changes were required for ThinLTO links involving bitcode
archive static libraries. With this patch clang/llvm bootstraps with
ThinLTO and gold.

The first is that the gold callbacks get_input_file and
release_input_file can normally be used to get file information for
each constituent bitcode file within an archive. However, these
interfaces lock the underlying file and can't be for each archive
constituent for ThinLTO backends where we get all the input files up
front and don't release any until after the backend threads complete.
However, it is sufficient to only get and release once per file, and
then each consituent bitcode file can be accessed via get_view. This
required saving some information to identify which file handle is the
"leader" for each claimed file sharing the same file descriptor, and
other information so that get_input_file isn't necessary later when
processing the backends.

Second, the module paths in the index need to distinguish between
different constituent bitcode files within the same archive file,
otherwise they will all end up with the same archive file path.
Do this by appending the offset within the archive for the start of the
bitcode file, returned by get_input_file when we claim each bitcode file,
and saving that along with the file handle.

Third, rather than have the function importer try to load a file based
on the module path identifier (which now contains a suffix to
distinguish different bitcode files within an archive), use a custom
module loader. This is the same approach taken in libLTO, and I am using
the support refactored into the new LTO.h header in r270509. The module
loader parses the bitcode files out of the memory buffers returned from
gold via the get_view callback and saved in a map. This also means that
we call the function importer directly, rather than add it to the pass
pipeline (which was in the plan to do already for other reasons).

Diff Detail

Repository
rL LLVM

Event Timeline

tejohnson updated this revision to Diff 58195.May 23 2016, 10:15 PM
tejohnson retitled this revision from to [ThinLTO/gold] Handle bitcode archives.
tejohnson updated this object.
tejohnson added reviewers: pcc, mehdi_amini.
tejohnson added a subscriber: llvm-commits.
tejohnson updated this revision to Diff 58196.May 23 2016, 10:17 PM
  • Remove inadvertent change
pcc edited edge metadata.May 24 2016, 11:06 AM

Thanks for discovering that issue with the gold plugin API, that's nasty. It reminds me that I'll probably have to be more careful with memory ownership in D20268.

Can you please also see if this works if you pass --no-map-whole-files to gold?

tools/gold/gold-plugin.cpp
522 ↗(On Diff #58196)

Nit: maybe "/" + std::to_string(file->offset) to avoid possibility of collisions with real files?

pcc added a comment.May 24 2016, 11:12 AM

And can you add a test case please?

In D20559#438033, @pcc wrote:

Thanks for discovering that issue with the gold plugin API, that's nasty. It reminds me that I'll probably have to be more careful with memory ownership in D20268.

Can you please also see if this works if you pass --no-map-whole-files to gold?

Will do. I am first tracking down a runtime error in the llvm-tblgen binary built using this patch with latest clang+llvm sources (it seems related to ThinLTO since it goes away when I disable some amount of importing, but I wasn't getting this with my fixes on a slightly older revision, so I don't think it is caused by this patch directly)

And can you add a test case please?

Will do.

tools/gold/gold-plugin.cpp
522 ↗(On Diff #58196)

The same name is used to generate save temps files (so they don't all just use the archive name), so I wanted it to be a legal file name. What if I put in something like ".llvm." in between? This is similar to the naming scheme for symbols used during promotion.

That being said, in working to track down the tblgen failure, I am finding the name generated just from the offset not very useful, so I am planning to grab the SourceFileName out of the Module and include its basename as well.

I am first tracking down a runtime error in the llvm-tblgen binary built using this patch with latest clang+llvm sources (it seems related to ThinLTO since it goes away when I disable some amount of importing, but I wasn't getting this with my fixes on a slightly older revision, so I don't think it is caused by this patch directly)

Confirmed not due to this patch - I can repro at head without this patch, working around it by linking llvm-tblgen with the archive library constituent .o files directly. While I track that down (possibly just exposed by ThinLTO's larger optimization scope) I will see if I can figure out a way to work around it here to do more testing of this patch.

I am first tracking down a runtime error in the llvm-tblgen binary built using this patch with latest clang+llvm sources (it seems related to ThinLTO since it goes away when I disable some amount of importing, but I wasn't getting this with my fixes on a slightly older revision, so I don't think it is caused by this patch directly)

Confirmed not due to this patch - I can repro at head without this patch, working around it by linking llvm-tblgen with the archive library constituent .o files directly. While I track that down (possibly just exposed by ThinLTO's larger optimization scope) I will see if I can figure out a way to work around it here to do more testing of this patch.

I was able to test this and get a successful bootstrap of clang with ThinLTO and gold by temporarily reducing the importing threshold enough to avoid the apparent bug.

Can you please also see if this works if you pass --no-map-whole-files to gold?

Confirmed this works, as well as --no-keep-files-mapped.

tejohnson updated this revision to Diff 58487.May 25 2016, 1:17 PM
tejohnson edited edge metadata.

Make name used for archive members more meaningful by including the
source file name. Also include the string ".llvm." to lessen likelihood
of clashing with real file.

Add test.

pcc accepted this revision.May 25 2016, 1:40 PM
pcc edited edge metadata.

LGTM, thanks for testing those things!

tools/gold/gold-plugin.cpp
522 ↗(On Diff #58487)

Sounds good. As I mentioned before I think at some point we will need to revisit our representation of paths in the index. For save-temps I think we should use some separate mechanism to produce more developer-friendly paths that can be correlated with the index path representation. That does seem low priority for the moment though.

This revision is now accepted and ready to land.May 25 2016, 1:40 PM
This revision was automatically updated to reflect the committed changes.