This is an archive of the discontinued LLVM Phabricator instance.

[lld] Support separate minimized bitcode file path in --thinlto-prefix-replace
Needs ReviewPublic

Authored by itf on Apr 4 2023, 5:00 PM.

Details

Reviewers
tejohnson
MaskRay
Group Reviewers
Restricted Project
Summary

Currently, the --thinlto-suffix-replace="oldsuffix;newsuffix" option
is used during distributed ThinLTO thin links to specify the mapping
of the input bitcode object file to the minimized bitcode file.
--thinlto-suffix-replace is the only option specifying the mapping
between those files.

This forces the full bitcode file and the minimized bitcode file to be in
the same directory.

This patch expands the --thinlto-prefix-replace option to allow a
separate directory tree mapping to be specified for the minimized
bitcode file paths.

This is important to support builds and build systems where a directory
that is the output of a single action must either be fully included or
not included as the input of another action (e.g. the thin link, it must take as its input
the minimized bitcode files but not take as input the full bitcode files).

The new format is: --thinlto-prefix-replace="origpath;outpath[;objpath][;minimizedbitcodepath]"

If minimizedbitcodepath is specified it replaces minimizedbitcodepath
of the minimized bitcode files with origpath when mapping the minimized
bitcode to the full bitcode, otherwise it falls back to the old behavior of
assuming both files are in the same directory. The rewritten path is written
into the index and imports files for the subsequent distributed ThinLTO backends.

With this change instead of mapping the minimized bitcode to the full
bitcode using a suffix replace, we can instead use a prefix replace.

Diff Detail

Event Timeline

itf created this revision.Apr 4 2023, 5:00 PM
Herald added a reviewer: Restricted Project. · View Herald TranscriptApr 4 2023, 5:00 PM
Herald added a project: Restricted Project. · View Herald Transcript
itf requested review of this revision.Apr 4 2023, 5:00 PM
itf added a comment.Apr 4 2023, 5:03 PM

This is a continuation of D144596 , based on D144596#4182628.

Thanks!

If minimizedbitcodepath is specified it replaces minimizedbitcodepath of the minimized bitcode files with origpath when mapping the minimized bitcode to the full bitcode, otherwise it falls back to the old behavior of assuming both files are in the same directory.

It would be clearer to specify what this rewritten path is used for (i.e. written into the index and imports files for the subsequent distributed ThinLTO backends).

Also, similar to what I suggest below for the variables, origpath is somewhat of a misnomer if we have a minimizedbitcodepath (since the latter is actually the original input path to the thin link). So maybe origpath should be fullbitcodepath?

With this change, --thinlto-suffix-replace can now be deprecated. Instead of mapping the minimized bitcode to the full bitcode using a suffix replace, we can instead use a prefix replace.

I don't think it should be deprecated until and unless all build systems using it are moved to the new handling. Which might just be bazel but I'm not totally sure? If there are other users they may prefer the existing mechanism and directory structure.

lld/COFF/Config.h
227

This should be thinLTOPrefixReplaceMinimizedBitcode not thinLTOPrefixReplaceMinimizedIndex to match the code (and MinimizedBitcode is a better name IMO).

I also think with this change that we should rename thinLTOPrefixReplaceOld to thinLTOPrefixReplaceFullBitcode, and thinLTOPrefixReplaceNew with thinLTOPrefixReplaceOutputFile or something similar, to clarify what each of these paths are.

Then I suggest making the comment something like "replace the prefix of the minimized thin link input bitcode files, thinLTOPrefixReplaceMinimizedBitcode, with thinLTOPrefixReplaceFullBitcode, when the bitcode path is written to output index and imports files, allowing..."

lld/ELF/Config.h
179–180

Similar name changes here and in other Config.h to what I suggested above.

@MaskRay, would it be good to add comments about what these variables are, as is done in the COFF config file? I notice that there are no comments on any of the variables, but I think it might be helpful for these in particular.

lld/ELF/InputFiles.cpp
1586

The way the code is written, both suffix and prefix replacing can be done. If we don't want people to rely on that, and want to eventually move off of suffix replacement, we probably should give an error somewhere, i.e. in the option parsing, if both are specified. Then maybe add a comment here that we have already checked that only one is in effect.

1587

The name getThinLTOOutputFile is no longer very accurate. Perhaps make this getThinLTOReplacePathPrefix or something?

llvm/tools/gold/gold-plugin.cpp
897

Remove the extra space added here.

itf updated this revision to Diff 511503.Apr 6 2023, 12:43 PM
itf updated this revision to Diff 511505.Apr 6 2023, 12:50 PM
itf marked an inline comment as done.
itf added inline comments.
lld/COFF/Config.h
227

Changing the name of the variables makes it more clear. I replaced it here and on the other files

lld/ELF/InputFiles.cpp
1586

I think it is fine to have both suffix and prefix replace, as they don't interfere with one another.

I can add a check so that this is not possible any longer and check the error in the tests, but if we intend to deprecate suffix replace this would just add more code to be removed in the future.

itf marked 2 inline comments as done.Apr 6 2023, 4:02 PM

Apologies, I'm on an intense trip and will have little time studying patches before April 12...

lld/ELF/Config.h
179–180

Yes, having a comment will be useful.

lld/test/COFF/thinlto-minimized-bitcode-prefix-replace.ll
5

... && mkdir -p %t/minimized %t/full %t/indexing

itf updated this revision to Diff 511712.Apr 7 2023, 10:03 AM
itf marked 3 inline comments as done.
itf edited the summary of this revision. (Show Details)Apr 11 2023, 8:46 AM
MaskRay added a comment.EditedApr 18 2023, 7:12 PM

Sorry for the delay. I finished a long trip and was spending some time to clean up my email queue:)

Currently, the --thinlto-suffix-replace="oldsuffix;newsuffix" option is used during distributed ThinLTO thin links to specify the mapping...

I feel that this paragraph can be made clearer. How about this:

For distributed ThinLTO, --thinlto-index-only= emits index files (*.thinlink.bc) and import files (*.imports) that reference IR module names.
When minimized bitcode files like a.indexing.o are used as input, we use --thinlto-suffix-replace='.indexing.o;.o' so that the IR module names record a.o instead of a.indexing.o.
Backend compiles need the full bitcode file paths, not the minimized bitcode file paths.

However, this forces the full bitcode file and its corresponding minimized bitcode file to be in the same directory.

This patch adds the fourth component to the value of --thinlto-prefix-replace=
to allow a separate directory for minimized bitcode files, whose hierarchy map the directory for full bitcode files.

This is important to support builds and build systems where a directory that is the output of a single action must either be fully included or not included as the input of another action (e.g. the thin link, it must take as its input the minimized bitcode files but not take as input the full bitcode files).

This paragraph can change the subject to be clearer.

In Bazel, when we have an action that may take an tree artifact as an input, the action takes as input all the files inside the directory of the tree artifact or none of them.
If a full bitcode file and its corresponding minimized bitcode file within the same directory, Bazel will have to ship both files to a backend compile action, causing waste.


The error message is

error: --thinlto-prefix-replace=old_dir;new_dir;obj_dir must be used with --thinlto-index-only=

It can be changed to:

error: --thinlto-prefix-replace=full;index;obj;minimized when obj is non-empty must be used with --thinlto-index-only=

lld/test/COFF/thinlto-minimized-bitcode-prefix-replace.ll
12

Bazel currently uses a.indexing.o for minimized bitcode files. Using %t/indexing will cause confusion for the emitted index files. Suggest: %t/index

Also fix tests for other ports.

23

Using %t/unused as both the output response file and the directory in -thinlto-prefix-replace: leads to confusion.

Suggest -thinlto-index-only:%t/a.rsp -thinlto-prefix-replace:'%t/full;%t/index;%t/obj;%t/minimized'

Prefer single quotes to double quotes.

Add a RUN line to inspect FileCheck --check-prefix=xxx < %t/a.rsp. The output is useful.

39

Fix: No newline at end of file

lld/test/ELF/lto/thinlto-minimized-bitcode-prefix-replace.ll
1

This file just duplicates thinlto-object-suffix-replace.ll ?

MaskRay added inline comments.Apr 18 2023, 7:15 PM
lld/COFF/Config.h
234

Perhaps it's clearer to name all variables thinLTOPrefixReplace*Dir, e.g. thinLTOPrefixReplace{Full,Index,Obj,Minimized}Dir?

lld/COFF/Driver.cpp
94

Perhaps full;index[;obj][;minimized]