Details

Reviewers

Commits

rG4eed6cc43369: Fix /WholeArchive bug.
rLLD334552: Fix /WholeArchive bug.
rL334552: Fix /WholeArchive bug.

Summary

lld-link foo.lib /wholearchive:foo.lib should work the same way as
lld-link /wholearchive:foo.lib foo.lib. Previously, /wholearchive in
the former case was ignored.
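
The intended behavior can be sketched without any LLVM types: collect every `/wholearchive:` argument in a first pass, then decide per input, so the result does not depend on argument order. This is only an illustration. `Arg` and `planInputs` are hypothetical names, and paths are compared as plain strings here, whereas the patch compares file identities.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified command-line argument: either a plain input
// ("foo.lib") or a /wholearchive:<file> option.
struct Arg {
  std::string path;
  bool isWholeArchiveOption;
};

// Returns (path, wholeArchive) pairs in command-line order, each file once.
// Pass 1 collects all /wholearchive: targets, so the outcome is the same
// whether the option appears before or after the plain input.
std::vector<std::pair<std::string, bool>>
planInputs(const std::vector<Arg> &args) {
  std::set<std::string> wholeArchives;
  for (const Arg &a : args)
    if (a.isWholeArchiveOption)
      wholeArchives.insert(a.path);

  std::set<std::string> visited;
  std::vector<std::pair<std::string, bool>> plan;
  for (const Arg &a : args) {
    if (!visited.insert(a.path).second)
      continue; // already enqueued under the same spelling
    plan.emplace_back(a.path, wholeArchives.count(a.path) != 0);
  }
  return plan;
}
```

With this shape, `foo.lib /wholearchive:foo.lib` and `/wholearchive:foo.lib foo.lib` both yield a single whole-archive entry for `foo.lib`.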

Diff Detail

Build Status

Buildable 19247
Build 19247: arc lint + arc unit

Event Timeline

ruiu created this revision. May 30 2018, 4:52 PM

Harbormaster completed remote builds in B18764: Diff 149221. May 30 2018, 4:52 PM

rnk added inline comments. May 31 2018, 10:04 AM

lld/COFF/Driver.cpp
1250	I think we need to canonicalize the path a little to ensure that this works on case-insensitive file systems:
$ lld-link foo.lib -wholearchive:Foo.lib
We shouldn't enqueue the same file twice, right? Also, maybe we should just enqueue all whole archive inputs in this loop, up front, and then only process inputs later if the input wasn't already added as a whole archive input.
1250	I guess to make it easy to write a cross-platform test for path canonicalization, the test could use:
$ lld-link ./foo.lib -wholearchive:foo.lib
1270	I don't think `Args.hasArg(OPT_wholearchive_flag)` is the right check here; that asks if a whole archive flag appears anywhere on the command line, not if this particular flag is whole archive.

address review comments

Harbormaster completed remote builds in B18794: Diff 149364. May 31 2018, 3:09 PM

ruiu added inline comments. May 31 2018, 3:10 PM

lld/COFF/Driver.cpp
1250	We shouldn't generally add the same file more than once. Added code to do that check.
1270	This expression correctly checks for that condition, no?

rnk added inline comments. May 31 2018, 4:37 PM

lld/COFF/Driver.cpp
134	I think O(n**2) stats on input object files is going to be too much. Once we have the FD, you can do `fs::getUniqueId(FD)` to get the inode number, and then build a set of those. I guess if it's not a file MemoryBuffer, don't do the inode check. It probably comes from a whole archive.
1255–1256	sys::fs::equivalent internally opens and closes a file handle for the purpose of doing a stat and comparing inodes. This will end up doing `O(inputs * wholearchiveinputs)` equivalency tests, and linkers have many input object files. Is there a way to avoid that?
1270	I see, I didn't understand how OPT_wholearchive_flag vs. file worked.

ruiu added inline comments. May 31 2018, 7:43 PM

lld/COFF/Driver.cpp
1255–1256	That's right, but I don't think there's a way to avoid that unless we implement some OS-dependent logic on our side. How pathname components are normalized depends on the operating system and the file system, and I believe the only reliable way to find out is to ask the system. Maybe we could cache stat's results along with filenames to reduce the number of system calls, but I'm not sure we need it. If it turns out to be too slow, we could optimize, but we should probably do that later, when it becomes a real problem. In reality, what is the largest number of files you can think of passing to the linker? For hundreds of files, this is probably fine. For thousands of files, this is probably slow.

smeenai added a subscriber: smeenai. Jun 1 2018, 10:06 AM

smeenai added inline comments.

lld/COFF/Driver.cpp
1255–1256	We have a library that's built from ~5,000 object files, so we care about the performance here :) I can try to get some numbers later today.

I don't need the exact number as long as it is more than 5,000. :) I'll try to optimize this patch for a large number of input files.

rnk added inline comments. Jun 1 2018, 10:46 AM

lld/COFF/Driver.cpp
1255–1256	Right, O(n) stats should be fine, just use llvm::sys::fs::getUniqueId or whatever it's called, and it will be fine. That gets the inode (or its local equivalent).

cache stat() results

Harbormaster completed remote builds in B18840: Diff 149509. Jun 1 2018, 10:50 AM

ruiu added inline comments. Jun 1 2018, 11:15 AM

lld/COFF/Driver.cpp
134	Since this is a MemoryBuffer and not a MemoryBufferRef, I think anything that comes to this function is a file and not a slice of a file.

lgtm

Sorry this got lost.

This revision is now accepted and ready to land. Jun 12 2018, 11:43 AM

smeenai added inline comments. Jun 12 2018, 11:48 AM

lld/COFF/Driver.cpp
796	Typo: wrapper

This diff results in a 30% link time regression for us (5.3 seconds to 6.8 seconds) when linking ~5,000 input files together. This is when cross-compiling on Linux; the results would probably be even worse on Windows, where I believe stat/the filesystem in general is slower.

Is there any way to make this less expensive? I notice we still have the O(n^2) loop comparing every input file to every other input file (using a cached stat result, but still). @rnk had previously suggested building up a set out of inode numbers, which should be O(n) instead; are there problems with that approach?
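
To put rough numbers on the difference for ~5,000 inputs, here is a small worked count. It assumes the worst case for the pairwise loop, where every new input is compared against all inputs seen before it; the function names are hypothetical.

```cpp
#include <cstdint>

// Worst-case number of file-identity comparisons when every new input is
// compared against all inputs seen before it: 0 + 1 + ... + (n - 1).
constexpr std::uint64_t pairwiseChecks(std::uint64_t n) {
  return n * (n - 1) / 2;
}

// Number of set insertions when each input's ID is inserted exactly once.
constexpr std::uint64_t setInsertions(std::uint64_t n) { return n; }

static_assert(pairwiseChecks(5000) == 12497500, "about 12.5 million checks");
static_assert(setInsertions(5000) == 5000, "one insertion per input");
```

An ordered set still performs a logarithmic number of ID comparisons per insertion, but that is far below the roughly 12.5 million comparisons of the pairwise loop.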

Let me try. I'll make a change and upload shortly.

Remove O(n^2)-ness.

Harbormaster completed remote builds in B19247: Diff 151034. Jun 12 2018, 2:30 PM

This new version is much better; there doesn't actually appear to be any performance difference for my test case at all now (barring noise). Thank you!

lgtm

Closed by commit rLLD334552: Fix /WholeArchive bug. (authored by ruiu). Jun 12 2018, 2:52 PM

Closed by commit rL334552: Fix /WholeArchive bug. (authored by ruiu).

This revision was automatically updated to reflect the committed changes.

rnk added inline comments. Jun 14 2018, 12:48 PM

lld/trunk/COFF/Driver.cpp
1249–1250 ↗	(On Diff #151041)	We forgot to call `findFile` here. I'll go ahead and fix that. This caused https://crbug.com/852679

Diff 151034

lld/COFF/Driver.h

[...]
#include "lld/Common/Reproduce.h"		#include "lld/Common/Reproduce.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSet.h"		#include "llvm/ADT/StringSet.h"
#include "llvm/Object/Archive.h"		#include "llvm/Object/Archive.h"
#include "llvm/Object/COFF.h"		#include "llvm/Object/COFF.h"
#include "llvm/Option/Arg.h"		#include "llvm/Option/Arg.h"
#include "llvm/Option/ArgList.h"		#include "llvm/Option/ArgList.h"
		#include "llvm/Support/FileSystem.h"
#include "llvm/Support/TarWriter.h"		#include "llvm/Support/TarWriter.h"
#include <memory>		#include <memory>
#include <set>		#include <set>
#include <vector>		#include <vector>

namespace lld {		namespace lld {
namespace coff {		namespace coff {

[...]	private:
StringRef doFindFile(StringRef Filename);		StringRef doFindFile(StringRef Filename);
StringRef doFindLib(StringRef Filename);		StringRef doFindLib(StringRef Filename);

// Parses LIB environment which contains a list of search paths.		// Parses LIB environment which contains a list of search paths.
void addLibSearchPaths();		void addLibSearchPaths();

// Library search path. The first element is always "" (current directory).		// Library search path. The first element is always "" (current directory).
std::vector<StringRef> SearchPaths;		std::vector<StringRef> SearchPaths;
std::set<std::string> VisitedFiles;
		// We don't want to add the same file more than once.
		// Files are uniquified by their filesystem and file number.
		std::set<llvm::sys::fs::UniqueID> VisitedFiles;

std::set<std::string> VisitedLibs;		std::set<std::string> VisitedLibs;

Symbol *addUndefined(StringRef Sym);		Symbol *addUndefined(StringRef Sym);
StringRef mangle(StringRef Sym);		StringRef mangle(StringRef Sym);

// Windows specific -- "main" is not the only main function in Windows.		// Windows specific -- "main" is not the only main function in Windows.
// You can choose one from these four -- {w,}{WinMain,main}.		// You can choose one from these four -- {w,}{WinMain,main}.
// There are four different entry point functions for them,		// There are four different entry point functions for them,
[...]

lld/COFF/Driver.cpp

[...]

void LinkerDriver::addBuffer(std::unique_ptr<MemoryBuffer> MB,		void LinkerDriver::addBuffer(std::unique_ptr<MemoryBuffer> MB,
bool WholeArchive) {		bool WholeArchive) {
StringRef Filename = MB->getBufferIdentifier();		StringRef Filename = MB->getBufferIdentifier();

MemoryBufferRef MBRef = takeBuffer(std::move(MB));		MemoryBufferRef MBRef = takeBuffer(std::move(MB));
FilePaths.push_back(Filename);		FilePaths.push_back(Filename);

// File type is detected by contents, not by file extension.		// File type is detected by contents, not by file extension.
switch (identify_magic(MBRef.getBuffer())) {		switch (identify_magic(MBRef.getBuffer())) {
case file_magic::windows_resource:		case file_magic::windows_resource:
Resources.push_back(MBRef);		Resources.push_back(MBRef);
break;		break;
case file_magic::archive:		case file_magic::archive:
if (WholeArchive) {		if (WholeArchive) {
std::unique_ptr<Archive> File =		std::unique_ptr<Archive> File =
CHECK(Archive::create(MBRef), Filename + ": failed to parse archive");		CHECK(Archive::create(MBRef), Filename + ": failed to parse archive");
[...]	if (!HasExt) {
Path.append(".obj");		Path.append(".obj");
if (sys::fs::exists(Path.str()))		if (sys::fs::exists(Path.str()))
return Saver.save(Path.str());		return Saver.save(Path.str());
}		}
}		}
return Filename;		return Filename;
}		}

		static Optional<sys::fs::UniqueID> getUniqueID(StringRef Path) {
		sys::fs::UniqueID Ret;
		if (sys::fs::getUniqueID(Path, Ret))
		return None;
		return Ret;
		}

// Resolves a file path. This never returns the same path		// Resolves a file path. This never returns the same path
// (in that case, it returns None).		// (in that case, it returns None).
Optional<StringRef> LinkerDriver::findFile(StringRef Filename) {		Optional<StringRef> LinkerDriver::findFile(StringRef Filename) {
StringRef Path = doFindFile(Filename);		StringRef Path = doFindFile(Filename);
bool Seen = !VisitedFiles.insert(Path.lower()).second;
		if (Optional<sys::fs::UniqueID> ID = getUniqueID(Path)) {
		bool Seen = !VisitedFiles.insert(*ID).second;
if (Seen)		if (Seen)
return None;		return None;
		}

if (Path.endswith_lower(".lib"))		if (Path.endswith_lower(".lib"))
VisitedLibs.insert(sys::path::filename(Path));		VisitedLibs.insert(sys::path::filename(Path));
return Path;		return Path;
}		}

// Find library file from search path.		// Find library file from search path.
StringRef LinkerDriver::doFindLib(StringRef Filename) {		StringRef LinkerDriver::doFindLib(StringRef Filename) {
// Add ".lib" to Filename if that has no file extension.		// Add ".lib" to Filename if that has no file extension.
bool HasExt = Filename.contains('.');		bool HasExt = Filename.contains('.');
if (!HasExt)		if (!HasExt)
Filename = Saver.save(Filename + ".lib");		Filename = Saver.save(Filename + ".lib");
return doFindFile(Filename);		return doFindFile(Filename);
}		}

// Resolves a library path. /nodefaultlib options are taken into		// Resolves a library path. /nodefaultlib options are taken into
// consideration. This never returns the same path (in that case,		// consideration. This never returns the same path (in that case,
// it returns None).		// it returns None).
Optional<StringRef> LinkerDriver::findLib(StringRef Filename) {		Optional<StringRef> LinkerDriver::findLib(StringRef Filename) {
if (Config->NoDefaultLibAll)		if (Config->NoDefaultLibAll)
return None;		return None;
if (!VisitedLibs.insert(Filename.lower()).second)		if (!VisitedLibs.insert(Filename.lower()).second)
return None;		return None;

StringRef Path = doFindLib(Filename);		StringRef Path = doFindLib(Filename);
if (Config->NoDefaultLibs.count(Path))		if (Config->NoDefaultLibs.count(Path))
return None;		return None;
if (!VisitedFiles.insert(Path.lower()).second)
		if (Optional<sys::fs::UniqueID> ID = getUniqueID(Path))
		if (!VisitedFiles.insert(*ID).second)
return None;		return None;
return Path;		return Path;
}		}

// Parses LIB environment which contains a list of search paths.		// Parses LIB environment which contains a list of search paths.
void LinkerDriver::addLibSearchPaths() {		void LinkerDriver::addLibSearchPaths() {
Optional<std::string> EnvOpt = Process::GetEnv("LIB");		Optional<std::string> EnvOpt = Process::GetEnv("LIB");
if (!EnvOpt.hasValue())		if (!EnvOpt.hasValue())
return;		return;
[...]
void LinkerDriver::enqueueTask(std::function<void()> Task) {		void LinkerDriver::enqueueTask(std::function<void()> Task) {
TaskQueue.push_back(std::move(Task));		TaskQueue.push_back(std::move(Task));
}		}

bool LinkerDriver::run() {		bool LinkerDriver::run() {
ScopedTimer T(InputFileTimer);		ScopedTimer T(InputFileTimer);

bool DidWork = !TaskQueue.empty();		bool DidWork = !TaskQueue.empty();
while (!TaskQueue.empty()) {		while (!TaskQueue.empty()) {
TaskQueue.front()();		TaskQueue.front()();
TaskQueue.pop_front();		TaskQueue.pop_front();
}		}
return DidWork;		return DidWork;
}		}

// Parse an /order file. If an option is given, the linker places		// Parse an /order file. If an option is given, the linker places
// COMDAT sections in the same order as their names appear in the		// COMDAT sections in the same order as their names appear in the
[...]	if (Config->Incremental && Config->DoICF) {
warn("ignoring '/incremental' because ICF is enabled; use '/opt:noicf' to "		warn("ignoring '/incremental' because ICF is enabled; use '/opt:noicf' to "
"disable");		"disable");
Config->Incremental = false;		Config->Incremental = false;
}		}

if (errorCount())		if (errorCount())
return;		return;

bool WholeArchiveFlag = Args.hasArg(OPT_wholearchive_flag);		// A predicate returning true if a given path is an argument for
		// /wholearchive:, or /wholearchive is enabled globally.
		// This function is a bit tricky because "foo.obj /wholearchive:././foo.obj"
		// needs to be handled as "/wholearchive:foo.obj foo.obj".
		std::set<sys::fs::UniqueID> WholeArchives;
		for (auto *Arg : Args.filtered(OPT_wholearchive_file))
		if (Optional<sys::fs::UniqueID> ID = getUniqueID(Arg->getValue()))
		WholeArchives.insert(*ID);

		auto IsWholeArchive = [&](StringRef Path) -> bool {
		if (Args.hasArg(OPT_wholearchive_flag))
		return true;
		if (Optional<sys::fs::UniqueID> ID = getUniqueID(Path))
		return WholeArchives.count(*ID);
		return false;
		};

// Create a list of input files. Files can be given as arguments		// Create a list of input files. Files can be given as arguments
// for /defaultlib option.		// for /defaultlib option.
std::vector<MemoryBufferRef> MBs;		for (auto *Arg : Args.filtered(OPT_INPUT, OPT_wholearchive_file))
for (auto *Arg : Args.filtered(OPT_INPUT, OPT_wholearchive_file)) {
switch (Arg->getOption().getID()) {
case OPT_INPUT:
if (Optional<StringRef> Path = findFile(Arg->getValue()))
enqueuePath(*Path, WholeArchiveFlag);
break;
case OPT_wholearchive_file:
if (Optional<StringRef> Path = findFile(Arg->getValue()))		if (Optional<StringRef> Path = findFile(Arg->getValue()))
enqueuePath(*Path, true);		enqueuePath(*Path, IsWholeArchive(Arg->getValue()));
break;
}
}
for (auto *Arg : Args.filtered(OPT_defaultlib))		for (auto *Arg : Args.filtered(OPT_defaultlib))
if (Optional<StringRef> Path = findLib(Arg->getValue()))		if (Optional<StringRef> Path = findLib(Arg->getValue()))
enqueuePath(*Path, false);		enqueuePath(*Path, false);

// Windows specific -- Create a resource file containing a manifest file.		// Windows specific -- Create a resource file containing a manifest file.
if (Config->Manifest == Configuration::Embed)		if (Config->Manifest == Configuration::Embed)
addBuffer(createManifestRes(), false);		addBuffer(createManifestRes(), false);

[...]

lld/test/COFF/wholearchive.s

	# REQUIRES: x86			# REQUIRES: x86

	# RUN: yaml2obj < %p/Inputs/export.yaml > %t.archive.obj			# RUN: yaml2obj < %p/Inputs/export.yaml > %t.archive.obj
	# RUN: llvm-ar rcs %t.archive.lib %t.archive.obj			# RUN: llvm-ar rcs %t.archive.lib %t.archive.obj
	# RUN: llvm-mc -triple=x86_64-windows-msvc %s -filetype=obj -o %t.main.obj			# RUN: llvm-mc -triple=x86_64-windows-msvc %s -filetype=obj -o %t.main.obj

	# RUN: lld-link -dll -out:%t.dll -entry:main %t.main.obj -wholearchive:%t.archive.lib -implib:%t.lib			# RUN: lld-link -dll -out:%t.dll -entry:main %t.main.obj -wholearchive:%t.archive.lib -implib:%t.lib
	# RUN: llvm-readobj %t.lib \| FileCheck %s -check-prefix CHECK-IMPLIB			# RUN: llvm-readobj %t.lib \| FileCheck %s -check-prefix CHECK-IMPLIB

	# RUN: lld-link -dll -out:%t.dll -entry:main %t.main.obj -wholearchive %t.archive.lib -implib:%t.lib			# RUN: lld-link -dll -out:%t.dll -entry:main %t.main.obj -wholearchive %t.archive.lib -implib:%t.lib
	# RUN: llvm-readobj %t.lib \| FileCheck %s -check-prefix CHECK-IMPLIB			# RUN: llvm-readobj %t.lib \| FileCheck %s -check-prefix CHECK-IMPLIB

				# RUN: lld-link -dll -out:%t.dll -entry:main %t.main.obj %t.archive.lib -wholearchive:%t.archive.lib -implib:%t.lib
				# RUN: llvm-readobj %t.lib \| FileCheck %s -check-prefix CHECK-IMPLIB

				# RUN: mkdir -p %t.dir
				# RUN: cp %t.archive.lib %t.dir/foo.lib
				# RUN: lld-link -dll -out:%t.dll -entry:main %t.main.obj %t.dir/./foo.lib -wholearchive:%t.dir/foo.lib -implib:%t.lib
				# RUN: llvm-readobj %t.lib \| FileCheck %s -check-prefix CHECK-IMPLIB

	# CHECK-IMPLIB: Symbol: __imp_exportfn3			# CHECK-IMPLIB: Symbol: __imp_exportfn3
	# CHECK-IMPLIB: Symbol: exportfn3			# CHECK-IMPLIB: Symbol: exportfn3

	.global main			.global main
	.text			.text
	main:			main:
	ret			ret

This is an archive of the discontinued LLVM Phabricator instance.

Fix /WholeArchive bug.
Closed, Public
