This is an archive of the discontinued LLVM Phabricator instance.

[Driver][Windows] Add dependent lib argument for profile instr generate
ClosedPublic

Authored by russell.gallop on May 9 2019, 9:51 AM.

Details

Summary

This is needed so lld-link can find clang_rt.profile when self hosting on Windows with PGO.

Trying to self host on Windows with PGO runs into undefined symbols as lld-link doesn't know to link clang_rt.profile as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2019-January/129500.html

Using clang-cl as a linker knows to add the library but self hosting, using -DCMAKE_LINKER=<...>/lld-link.exe doesn't. I think that the cleanest way to make this work is to add a dependent library when using profile-instr-generate on Windows.

Note some cmake changes are also required to get PGO self hosting on Windows working. I will post them separately.

Diff Detail

Event Timeline

russell.gallop created this revision.May 9 2019, 9:51 AM
rnk added a comment.May 9 2019, 1:56 PM

Thanks, I would like to do this for the sanitizers as well, since this is a constant pain point for users, who have to come up with an architecture-dependent filename (clang_rt.asan-dynamic-$arch.lib or something).

clang/lib/Driver/ToolChains/Clang.cpp
787

This won't work with ld.bfd, so I would restrict this to a Windows MSVC environment. For mingw, I'd expect the user to call the compiler to link, and the right library path to be supplied to the linker by the compiler driver.

Prevent adding -dependent-lib on mingw32 and add tests.

In D61742#1497259, @rnk wrote:

Thanks, I would like to do this for the sanitizers as well, since this is a constant pain point for users

That would be good. I think that this might already work for UBSan.

russell.gallop marked an inline comment as done.May 10 2019, 7:31 AM
dmajor added a subscriber: dmajor.May 13 2019, 6:57 AM
rnk accepted this revision.May 13 2019, 1:37 PM

lgtm

This revision is now accepted and ready to land.May 13 2019, 1:37 PM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.May 27 2019, 5:03 PM

Hello, this embeds an absolute path into the generated .obj file, which means the output now is no longer deterministic (since it depends on the absolute path to clang_rt.profile-x86_64.lib). This means the output will be different on different machines, which breaks things like caching in distributed builds and whatnot.

I can't see an obvious way to save this patch either, since you won't know at compile time what the CWD will be at link time.

As-is, this breaks Chromium's deterministic builds. ("Only" the coverage builds, and they don't yet check for determinism, which is why it took us a while to notice.)

Suggestions?

Hello, this embeds an absolute path into the generated .obj file, which means the output now is no longer deterministic (since it depends on the absolute path to clang_rt.profile-x86_64.lib).

Yes, it embeds the absolute path. Note that -fsanitize=undefined has done that since r241225 (in 2015).

This means the output will be different on different machines, which breaks things like caching in distributed builds and whatnot.

I assume that your distributed build sandboxes tool paths so this is deterministic from one users point of view, just not deterministic across users with tools installed in different locations so it defeats the cache. Yes, that's not ideal.

I can't see an obvious way to save this patch either, since you won't know at compile time what the CWD will be at link time.

I assume you're considering relative paths here. Yes, I don't think that would work nicely.

As-is, this breaks Chromium's deterministic builds. ("Only" the coverage builds, and they don't yet check for determinism, which is why it took us a while to notice.)
Suggestions?

If this was made into a library name instead of a path (e.g. --dependent-lib=clang_rt.profile-x86_64.lib) then it would solve the problem of different paths on different nodes, but then you would have to tell the linker where to find this library.

That is *almost* as bad as having to specify the full path to the library to the linker in the first place! It's not quite as bad as you don't have to specify the architecture and you only need to specify it once regardless of whether you're using profiling or sanitizers. If lld-link added the clang library path itself, then the library name would be all that was needed from the compiler side.

Rui, would it be possible for lld-link to search the path to the clang libraries (e.g. C:\Program Files\LLVM\lib\clang\8.0.0\lib\windows) by default? Would this have any undesirable side effects?

After some more reading, it looks like passing -no-canonical-prefixes makes this path relative, and as long as compiles and links run from the same CWD things will work and be path-independent. So I think this is all good. Sorry about the false alarm! (Maybe we should have a test that fprofile-instr-generate + no-canonical-prefixes doesn't generate an absolute path?)

it looks like passing -no-canonical-prefixes makes this path relative

Do you see this working? I tried writing a test and it doesn't appear to work for me.

It seems to work for me from what I can tell:

$ out/gn/bin/clang-cl /c -fprofile-instr-generate /TC /dev/null -### -no-canonical-prefixes 2>&1 | grep -o 'dependent-lib=[^"]*'
dependent-lib=libcmt
dependent-lib=oldnames
dependent-lib=out/gn/lib/clang/9.0.0/lib/windows/clang_rt.profile-x86_64.lib
rnk added a comment.May 28 2019, 9:34 AM

I was going to suggest that maybe what we should do is just embed the basename, i.e. /nodefaultlib:clang_rt.profile-x86_64.lib, and then we just ask users to add one /libpath: flag to their linker invocation. That saves users from having to come up with a mapping from their build system CPU name to whatever compiler-rt is using (i686, i386, x86-64, x86_64h, etc). Users should probably put clang's library directory on libpath anyway if they want to be able to find the builtins library, which might be necessary even in a standard (no-pgo, no-coverage) build configuration. This is also consistent with MSVC in some ways, which autolinks the CRT this way.

I was going to suggest that maybe what we should do is just embed the basename, i.e. /nodefaultlib:clang_rt.profile-x86_64.lib ...

Do you mean /defaultlib:clang_rt.profile-x86_64.lib?

... and then we just ask users to add one /libpath: flag to their linker invocation

I'll try that. If it works, should we revisit the ubsan dependent lib support, to be consistent?

rnk added a comment.May 28 2019, 10:55 AM

I was going to suggest that maybe what we should do is just embed the basename, i.e. /nodefaultlib:clang_rt.profile-x86_64.lib ...

Do you mean /defaultlib:clang_rt.profile-x86_64.lib?

... and then we just ask users to add one /libpath: flag to their linker invocation

I'll try that. If it works, should we revisit the ubsan dependent lib support, to be consistent?

Yes, definitely. I'd like to do the same for asan, but things there are complicated by the various CRT linking modes.

Since this seems to work with no-canonical-prefixes, I think this is good as-is since people who want build path independent builds need that flag anyway. (At least if Russel can reproduce it working with no-canonical-prefixes; not sure why it wouldn't.)