This is an archive of the discontinued LLVM Phabricator instance.

Support for distributed ThinLTO options
ClosedPublic

Authored by rdhindsa on Apr 24 2018, 4:16 PM.

Details

Summary

Added support for options thinlto-index-only and thinlto-prefix-replace.

When thinlto-index-only option is specified, the thin LTO backend is created to generate the index files.
thinlto-prefix-replace controls where the files of distributed backend are created.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

rdhindsa created this revision.Apr 24 2018, 4:16 PM
ruiu added a comment.Apr 24 2018, 4:34 PM

There are a few things to note:

  1. When you upload a patch, please include all context. Are you using git? If so, use git diff -u 10000 to include virtually the entire file in your diff.
  2. Please read the LLVM coding style and stick with it. The easiest way of doing it is to use clang-format-diff to automatically format your patch according to the LLVM coding style.
lld/ELF/Driver.cpp
798

Please append {} to all these if ~ else if ~ else.

802

Basically all members in Config class are named after their corresponding command line options. This is a special case, but I'm reluctant to give a totally different name than the corresponding option. How about ThinLTOIndexOnlyFile?

803

I'd just write 19 instead of strlen(...).

806

Ditto

807

Use StringRef::contains.

808

fatal() is not for reporting a wrong usage. Use error() instead.

1015

I do understand that you need this or otherwise ThinLTO doesn't work. But this comment doesn't explain why we really need this. Could you expand the comment so that why ThinLTO needs to see all bitcode files even within --start-lib/end-lib?

1016

Flip the condition and break early.

if (Config->ThinLTOIndexOnly)
  break;
lld/ELF/LTO.cpp
72

Please stick with the LLVM/lld function naming scheme.

73

Please use clang-format.

74

I don't think you need this assertion.

88–89

Ditto

90

Why do you need a temporary variable?

101

Format

142

You can replace this with

std::string Old;
std::string New;
std::tie(Old, New) = Config->ThinLTOPrefixReplace.split(';');
156

Please follow the LLVM coding style.

158

Please fix style.

270

Well, you need to use clang-format.

lld/ELF/LTO.h
56

What is this for?

rdhindsa updated this revision to Diff 143987.Apr 25 2018, 12:35 PM
rdhindsa marked 15 inline comments as done.Apr 25 2018, 12:41 PM
rdhindsa added inline comments.
lld/ELF/Driver.cpp
802

I have updated name to ThinLTOIndexOnlyObjectsFile. This file contains the list of object files. I think keeping objects in the name makes it easy to understand.

lld/ELF/LTO.cpp
74

Removed it.

90

Removed the getThinLTOOldAndNewPrefix function, since the old and new prefix can just be retreived by std::tie(OldPrefix, NewPrefix) = Config->ThinLTOPrefixReplace.split(';');

lld/ELF/LTO.h
56

I have removed ObjectToIndexFileState for now, I will be adding it in subsequent patch. It is supposed to track all the object files that are indexed.

pcc added inline comments.Apr 25 2018, 1:33 PM
lld/ELF/Driver.cpp
1015

The problem as I understand it is not related to files with no symbols but rather that if an object file between --start-lib and --end-lib is not selected using the usual rules for adding archive members to the link, the distributed build system will still want to see a .thinlto.bc file for it, even if it is unused.

1018

This part seems to be equivalent to writeEmptyDistributedBuildOutputs in the gold plugin. It's unfortunate that we need this in order to work around a deficiency in the distributed build system, but if we are going to do this, can we instead do something equivalent to that in order to avoid linking more than we need?

I think we should be able to do this by keeping track of which LazyObjFiles have been added to the link, and emitting a .thinlto.bc file for every file that has not been added to the link.

rdhindsa added inline comments.Apr 25 2018, 1:40 PM
lld/ELF/Driver.cpp
1018

I intended to add similar functionality as to writeEmptyDistributedBuildOutputs in the LTO.cpp while trying to add files, so that we emit .thinlto.bc for every file, instead of doing more linking. Will that be okay?

tejohnson added inline comments.Apr 25 2018, 1:52 PM
lld/ELF/Driver.cpp
1015

There are two cases where we may need to write out an empty file:

  1. object between --start-lib and --end-lib is not selected by the link
  2. The case handled by https://reviews.llvm.org/D42514, when the object is skipped by ThinLTO because it doesn't have a summary, which I think is due to side effects of module splitting for types?

It's also not so much that we need to generate object files, this step is not generating object files in a distributed build, but rather that we are expected to generate the necessary index files. It's this indexing step (the thin link) of the distributed build system that wants to validate that all the expected output files have been created.

Perhaps the comment can then be changed to something along the lines of "With the ThinLTOIndexOnly option, only the thin link is performed, and will generate index files for the ThinLTO backends in a distributed build system. The distributed build system may expect that index files are created for all input bitcode objects provided to the linker for the thin link. However, index files will not normally be created for input bitcode objects that either aren't selected by the linker (i.e. in a static library and not needed), or because they don't have a summary. Therefore we need to create empty dummy index file outputs in those cases".

pcc added inline comments.Apr 25 2018, 2:20 PM
lld/ELF/Driver.cpp
1018

It depends what you mean by "every file". We can't just do it for files added using BitcodeCompiler::add because that wouldn't include non-selected files and then you're back to the original problem. And we can't do it for everything in LinkerDriver::Files because that would include regular object files like crtbegin.o. I think that we can do it by filtering LinkerDriver::Files along the lines that I mentioned.

rdhindsa updated this revision to Diff 144140.Apr 26 2018, 9:39 AM

For the LazyObjFiles which are not added for link, this change ensures that empty index files are generated.

In the next patch, I will update it to include imports file as well, along with the EmitImports option.

pcc added inline comments.Apr 26 2018, 3:42 PM
lld/ELF/InputFiles.h
262

Instead of adding this function, can you give Seen a clearer name and make it public?

263

I don't think you need this function. You should be able to use isBitcode to test whether MB->File is bitcode directly in the caller.

lld/ELF/LTO.cpp
85

This function is always called with SkipModule = true. There's also a case in the gold plugin where SkipModule is false (which just creates an empty file); that's the case where the file was added to the link but we end up not emitting an index for it. There's no easy way to detect this case using the current API, so probably the easiest thing to do would be to create the empty files in BitcodeCompiler::add and let them be overwritten later.

175

I don't think you need the if.

195

This part doesn't seem right. We don't want to skip input files without symbols because they could contain section contributions that could be significant.

lld/ELF/SymbolTable.cpp
132

Consider moving this part to BitcodeCompiler::compile().

135

Remove unnecessary parens.

grimar added a subscriber: grimar.Apr 27 2018, 3:05 AM
grimar added inline comments.
lld/ELF/Driver.cpp
806

Please provide the testcase for this error.

lld/ELF/LTO.cpp
87

You do not need llvm:: prefix I think.

93

Error messages start from the lower case usually.
Also, the test would be nice to have if possible (i think you could create the file and so this code should error out when try to overwrite it, but I am not sure).

99

The same.

grimar added inline comments.Apr 27 2018, 3:07 AM
lld/ELF/LTO.cpp
99

I mean you can remove llvm:: I think.

rdhindsa updated this revision to Diff 144375.Apr 27 2018, 11:48 AM
rdhindsa marked 8 inline comments as done.Apr 27 2018, 11:57 AM
rdhindsa added inline comments.
lld/ELF/InputFiles.h
263

isBitcode is a static function in InputFiles.cpp. In order to be able to use that function in LTO.cpp :

  1. Either I move the function to InputFiles.h, in which case it will get imported to all other files including InputFiles.h ,which will increase code size.

In my opinion, we can simply call it for LazyObjFile. Please let me know if you still have concerns.

pcc added inline comments.Apr 27 2018, 12:07 PM
lld/ELF/InputFiles.h
263

You can just declare it as inline bool isBitcode(...) { ... }. That way it should be fine to put it in a header file.

rdhindsa updated this revision to Diff 144385.Apr 27 2018, 12:30 PM
rdhindsa marked 3 inline comments as done.Apr 27 2018, 12:33 PM
rdhindsa added inline comments.
lld/ELF/Driver.cpp
806

Added the testcase to thinlto_prefix_replace.ll

lld/ELF/LTO.cpp
93

Added the test case to thinlto.ll
By changing the permissions of the file, it is not available to write and test produces error.

pcc added inline comments.May 2 2018, 11:00 AM
lld/ELF/InputFiles.h
47

Remove this line and qualify with llvm:: below.

262

Rename to AddedToLink to match local style.

355

Do you need this line? Both identify_magic and file_magic are in the llvm namespace.

lld/ELF/LTO.cpp
259–260

Remove braces

262

Remove this->

lld/test/ELF/lto/thinlto_prefix_replace.ll
10

I would also test that the index file is still created if the file is wrapped in --start-lib..--end-lib, or if -module-summary is not passed to opt.

rdhindsa updated this revision to Diff 144911.May 2 2018, 12:08 PM
rdhindsa marked 2 inline comments as done.
rdhindsa marked 6 inline comments as done.May 2 2018, 12:12 PM
pcc accepted this revision.May 2 2018, 1:38 PM

LGTM

lld/ELF/LTO.cpp
259

Remove this one as well.

lld/ELF/Options.td
428–431

These lines aren't needed, we only support the new flags as plugin-opts.

This revision is now accepted and ready to land.May 2 2018, 1:38 PM
sbc100 added a subscriber: sbc100.May 3 2018, 1:55 AM

I think this broke the BUILD_SHARED_LIB build: https://reviews.llvm.org/D46381

Couple of post-commit suggestions for improvements below.

lld/ELF/LTO.cpp
199

Sorry for not noticing this earlier - I was looking at the code again in the context of D46400. This seems inefficient - we have large links of tens of thousands of files, and writing empty index (and with D46400, empty imports) for each will slow down the thin link. In gold-plugin, only the necessary empty files are written. The createLTO function takes an optional callback function, which can be used to track which files get index files written for them. Take a look at how gold-plugin uses this to optionally call it's version of writeEmptyDistributedBuildOutputs (specifically, see how ObjectToIndexFileState is used). Can you do a follow-up change to implement the same support here so it is only called when needed?

302

Can this be moved up above the cache pruning and the iteration over all the MaxTasks to create output files and save temps? Those should never be needed in the index-only case, and it would be more efficient to skip the MaxTasks Buff[I].empty() checks.

I will send out a patch to address both concerns. Thank you for reviewing.

MaskRay closed this revision.Sun, Jan 21, 4:31 PM
Herald added a project: Restricted Project. · View Herald Transcript