This is an archive of the discontinued LLVM Phabricator instance.

Fix /WholeArchive bug.
ClosedPublic

Authored by ruiu on May 30 2018, 4:52 PM.

Details

Summary

lld-link foo.lib /wholearchive:foo.lib should work the same way as
lld-link /wholearchive:foo.lib foo.lib. Previously, /wholearchive in
the former case was ignored.

Event Timeline

ruiu created this revision.May 30 2018, 4:52 PM
rnk added inline comments.May 31 2018, 10:04 AM
lld/COFF/Driver.cpp
1262

I think we need to canonicalize the path a little to ensure that this works on case insensitive file systems:
$ lld-link foo.lib -wholearchive:Foo.lib

We shouldn't enqueue the same file twice, right?

Also, maybe we should just enqueue all whole archive inputs in this loop, up front, and then only process inputs later if the input wasn't already added as a whole archive input.

1262

I guess to make it easy to write a cross-platform test for path canonicalization, the test could use:
$ lld-link ./foo.lib -wholearchive:foo.lib

1278

I don't think Args.hasArg(OPT_wholearchive_flag) is the right check here, that asks if a whole archive flag appears anywhere on the command line, not if this particular flag is whole archive.

ruiu updated this revision to Diff 149364.May 31 2018, 3:09 PM
  • address review comments
ruiu added inline comments.May 31 2018, 3:10 PM
lld/COFF/Driver.cpp
1262

We shouldn't generally add the same file more than once. Added code to do that check.

1278

This expression correctly checks for that condition, no?

rnk added inline comments.May 31 2018, 4:37 PM
lld/COFF/Driver.cpp
134

I think O(n**2) stats on input object files is going to be too much. Once we have the FD, you can do fs::getUniqueId(FD) to get the inode number, and then build a set of those. I guess if it's not a file MemoryBuffer, don't do the inode check. It probably comes from a whole archive.

1267–1268

sys::fs::equivalent internally opens and closes a file handle for the purpose of doing a stat and comparing inodes. This will end up doing O(inputs * wholearchiveinputs) equivalency tests, and linkers have many input object files. Is there a way to avoid that?

1278

I see, I didn't understand how OPT_wholearchive_flag vs. file worked.

ruiu added inline comments.May 31 2018, 7:43 PM
lld/COFF/Driver.cpp
1267–1268

That's right, but I don't think there's a way to avoid that unless we implement some OS-dependent logic on our side. How operating system normalize pathname components depends on the operating system and the file system, and I believe the only reliable way to do that is to ask about it to the system.

Maybe we could cache stat's results along with filenames to reduce the number of system calls, but I'm not sure if we need it. If it turns out to be too slow, we could optimize, but probably we should do that later when it becomes a real problem.

In reality, what is the largest number of files you can think of you want to pass to the linker? For hundreds of files, this is probably fine. If you pass thousands of files, this is probably slow.

smeenai added inline comments.
lld/COFF/Driver.cpp
1267–1268

We have a library that's built from ~5,000 object files, so we care about the performance here :) I can try to get some numbers later today.

ruiu added a comment.Jun 1 2018, 10:23 AM

I don't need the exact number as long as it is more than 5,000. :) I'll try to optimize this patch for a large number of input files.

rnk added inline comments.Jun 1 2018, 10:46 AM
lld/COFF/Driver.cpp
1267–1268

Right, O(n) stats should be fine, just use llvm::sys::fs::getUniqueId or whatever it's called, and it will be fine. That gets the inode (or its local equivalent).

ruiu updated this revision to Diff 149509.Jun 1 2018, 10:50 AM
  • cache stat() results
ruiu added inline comments.Jun 1 2018, 11:15 AM
lld/COFF/Driver.cpp
134

Since this is a MemoryBuffer and not a MemoryBufferRef, I think anything that comes to this function is a file and not a slice of a file.

rnk accepted this revision.Jun 12 2018, 11:43 AM

lgtm

Sorry this got lost.

This revision is now accepted and ready to land.Jun 12 2018, 11:43 AM
smeenai added inline comments.Jun 12 2018, 11:48 AM
lld/COFF/Driver.cpp
791

Typo: wrapper

This diff results in a 30% link time regression for us (5.3 seconds to 6.8 seconds) when linking ~5,000 input files together. This is when cross-compiling on Linux; the results would probably be even worse on Windows, where I believe stat/the filesystem in general is slower.

Is there any way to make this less expensive? I notice we still have the O(n^2) loop comparing every input file to every other input file (using a cached stat result, but still). @rnk had previously suggested building up a set out of inode numbers, which should be O(n) instead; are there problems with that approach?

ruiu added a comment.Jun 12 2018, 2:01 PM

Let me try. I'll make a change and upload shortly.

ruiu updated this revision to Diff 151034.Jun 12 2018, 2:30 PM
  • Remove O(n^2)-ness.

This new version is much better; there doesn't actually appear to be any performance difference for my test case at all now (barring noise). Thank you!

rnk accepted this revision.Jun 12 2018, 2:42 PM

lgtm

Closed by commit rL334552: Fix /WholeArchive bug. (authored by ruiu). · Explain Why
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.
rnk added inline comments.Jun 14 2018, 12:48 PM
lld/trunk/COFF/Driver.cpp
1249–1250 ↗(On Diff #151041)

We forgot to call findFile here. I'll go ahead and fix that. This caused https://crbug.com/852679