This is an archive of the discontinued LLVM Phabricator instance.

[lld][ELF] Fix lazy ThinLTO index writing in thin archives
ClosedPublic

Authored by aidengrossman on Oct 1 2022, 1:13 AM.

Details

Summary

Currently when the --thinlto-emit-index-files is used with LLD and a
thin archive is passed containing references to object files to link
against where the object files are in a different folder than the thin
archive and some of the archives aren't linked against (ie stay lazy),
the empty index file writer ends up trying to write to a path that
doesn't exist. This patch changes the behavior of that function to use
the path of the obj member of the BitcodeFile object rather than just
the path of the BitcodeFile object itself, which matches the behavior of
the default (non-lazy) case.

Fixes #57963

Regression test added.

Diff Detail

Event Timeline

aidengrossman created this revision.Oct 1 2022, 1:13 AM
Herald added a project: Restricted Project. · View Herald Transcript
aidengrossman requested review of this revision.Oct 1 2022, 1:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 1 2022, 1:13 AM
aidengrossman added a subscriber: mtrofin.

This fixes the issue that we were seeing in #57963 and shouldn't change any of the behavior in the base case (outside of thin archives) and should just match the behavior that happens in the non-lazy case. I'm not too familiar with this area of the code base, so if I've taken the wrong approach in any specific area, let me know and I can adjust.

I read through Issue #57963 just now but still a little unsure of a couple things:

What is the path that is written for the test case without this patch?

This is just for MLGO, right? It isn't going to make this case work for distributed ThinLTO all the way through the ThinLTO backends I believe - is that correct?

It does seem fine to me and consistent with the path used when the bitcode file is actually added to the link, however.

The path that is written to without this patch is the path to the object file relative from the path of the thin archive, but it expands this path over the current working directory rather than the path of the thin archive. For example if the thin archive is in ./test-folder-2/lib.a, pointing to archive files in ./test-folder-1, the thinLTOCreateEmptyIndexFiles() function will try and write the *.thinlto.bc files to ../test-folder-1/* relative to the current working directory (set up in the examples as containing test-folder-1 and test-folder-2, ie one level up). Even just fixing the path issue, the current implementation also names the files differently (ie c.o.thinlto.bc instead of in the non-lazy case lib.a(c.o at xxx).thinlto.bc.

I have just tested this on a ThinLTO Chromium compile for preparing a corpus for MLGO. I don't believe this patch will make this case work for distributed ThinLTO all the way through the backends, but I'm not familiar enough with how distributed ThinLTO picks up all the necessary files in order to say anything conclusive.

The path that is written to without this patch is the path to the object file relative from the path of the thin archive, but it expands this path over the current working directory rather than the path of the thin archive. For example if the thin archive is in ./test-folder-2/lib.a, pointing to archive files in ./test-folder-1, the thinLTOCreateEmptyIndexFiles() function will try and write the *.thinlto.bc files to ../test-folder-1/* relative to the current working directory (set up in the examples as containing test-folder-1 and test-folder-2, ie one level up). Even just fixing the path issue, the current implementation also names the files differently (ie c.o.thinlto.bc instead of in the non-lazy case lib.a(c.o at xxx).thinlto.bc.

Got it, thanks.

I have just tested this on a ThinLTO Chromium compile for preparing a corpus for MLGO. I don't believe this patch will make this case work for distributed ThinLTO all the way through the backends, but I'm not familiar enough with how distributed ThinLTO picks up all the necessary files in order to say anything conclusive.

Ok, right. For a full distributed ThinLTO the archives currently need to be expanded and passed as individual object files on the command line surrounded by --start-lib/--end-lib pairs to identify the libraries.

Change lgtm, other than question about test but please wait for @MaskRay to review, especially to see if he wants any comments in the code or test about this being for MLGO but not really enough for supporting thin archives with distributed ThinLTO.

lld/test/ELF/lto/thinlto-index-files-thin-archive.ll
1

Why is this needed?

Remove extraneous x86 requirement on regression test.

aidengrossman marked an inline comment as done.Oct 3 2022, 11:12 AM
aidengrossman added inline comments.
lld/test/ELF/lto/thinlto-index-files-thin-archive.ll
1

It was just an artifact that I copied ever when replicating some of the setup from one of the other regression tests regarding thinlto index files that I mistakenly didn't remove. This has since been fixed in the latest revision of this patch. Thanks for pointing this out.

aidengrossman marked an inline comment as done.Oct 3 2022, 11:13 AM

thinlto-emit-index-thin-archive.ll may be better since we already have thinlto-emit-index.ll and generally it's useful to make similar tests share a common prefix.

thinlto-thin-archive-collision.ll may be merged into the test as well.

lld/test/ELF/lto/thinlto-index-files-thin-archive.ll
2

The comment may be too long. I'll suggest

For a thin archive referencing object files in a different directory, emit index files (lib.a($member at $offset).thinlto.bc) in the directory containing the archive.

11

%t is about tests. test- in test-folder-2 isn't useful. Just use folder-1 folder-2, but actually dir1 dir2` is more common.

14

Add another test using --whole-archive lib.a. The patch makes the behavior similar to --whole-archive.

MaskRay added inline comments.Oct 3 2022, 11:52 AM
lld/test/ELF/lto/thinlto-index-files-thin-archive.ll
2

For a thin archive referencing object files in a different directory, emit index files (lib.a($member at $offset).thinlto.bc) in the directory containing the archive. The information about the referenced member's directory is lost.

aidengrossman marked 4 inline comments as done.

Combine tests from thinlto-thin-archive-collision.ll into the same test file
and addressed inline comments.

Renamed test to better match other similar tests.

MaskRay added inline comments.Oct 4 2022, 11:18 AM
lld/test/ELF/lto/thinlto-emit-index-thin-archive.ll
3 ↗(On Diff #465088)

Delete ;; Setup. It's not useful

5 ↗(On Diff #465088)

mkdir dir1 dir2

40 ↗(On Diff #465088)

Rename CHECK-UNUSED and just reuse its check since the pattern is exactly the same.

Addressed inline comments.

MaskRay accepted this revision.Oct 4 2022, 12:34 PM
This revision is now accepted and ready to land.Oct 4 2022, 12:34 PM