This is an archive of the discontinued LLVM Phabricator instance.

msan: Pass --disable-new-dtags for tests
AbandonedPublic

Authored by Hahnfeld on Jan 22 2018, 11:51 AM.

Details

Summary

We really want to have DT_RPATH instead of DT_RUNPATH: When
searching for dynamic libraries, DT_RPATH is searched before
LD_LIBRARY_PATH while DT_RUNPATH is looked afterwards. So if
there is a libc++ in LD_LIBRARY_PATH, the tests will fail
because the found library is not instrumented.

This currently happens with GNU gold and lld while GNU ld
still defaults to "old dtags". However, GNU ld can also be
configured differently since binutils 2.29 as of last year.

Diff Detail

Event Timeline

Hahnfeld created this revision.Jan 22 2018, 11:51 AM
Herald added subscribers: Restricted Project, llvm-commits, mgorny. · View Herald TranscriptJan 22 2018, 11:51 AM

I'm not sure about this. Reverting to an old, deprecated feature is rarely the answer.
For example, Android does not support DT_RPATH at all.

One option is to unset LD_LIBRARY_PATH in tests.
Another is to use -Wl,-rpath when linking the code with non-instrumented libc++. The same as tests do for instrumented one.

One option is to unset LD_LIBRARY_PATH in tests.
Another is to use -Wl,-rpath when linking the code with non-instrumented libc++. The same as tests do for instrumented one.

I use libc++ with the host compiler and it needs to be in LD_LIBRARY_PATH for all other tests to work.

One option is to unset LD_LIBRARY_PATH in tests.
Another is to use -Wl,-rpath when linking the code with non-instrumented libc++. The same as tests do for instrumented one.

I use libc++ with the host compiler and it needs to be in LD_LIBRARY_PATH for all other tests to work.

You should be able to handle this in lit.

Take a look at how compiler-rt/test/asan/lit.cfg munges LD_LIBRARY_PATH, and do something similar in compiler-rt/test/msan/Unit/.

One option is to unset LD_LIBRARY_PATH in tests.
Another is to use -Wl,-rpath when linking the code with non-instrumented libc++. The same as tests do for instrumented one.

I use libc++ with the host compiler and it needs to be in LD_LIBRARY_PATH for all other tests to work.

You should be able to handle this in lit.

Take a look at how compiler-rt/test/asan/lit.cfg munges LD_LIBRARY_PATH, and do something similar in compiler-rt/test/msan/Unit/.

I was half-way through implementing this and wanted to align tsan where we might have the same problem (for some reasons unknown to me the tests don't fail there?). However, setting environment variables like LD_LIBRARY_PATH in lit means that also the compiler will run in this environment:

symbol lookup error: <...>/projects/compiler-rt/lib/tsan/libcxx_tsan_x86_64/lib/libc++.so.1: undefined symbol: __tsan_init

This obviously works for tsan-binaries but not if some toolchain binaries are linked against libc++.

One option is to unset LD_LIBRARY_PATH in tests.
Another is to use -Wl,-rpath when linking the code with non-instrumented libc++. The same as tests do for instrumented one.

I use libc++ with the host compiler and it needs to be in LD_LIBRARY_PATH for all other tests to work.

You should be able to handle this in lit.

Take a look at how compiler-rt/test/asan/lit.cfg munges LD_LIBRARY_PATH, and do something similar in compiler-rt/test/msan/Unit/.

I was half-way through implementing this and wanted to align tsan where we might have the same problem (for some reasons unknown to me the tests don't fail there?). However, setting environment variables like LD_LIBRARY_PATH in lit means that also the compiler will run in this environment:

symbol lookup error: <...>/projects/compiler-rt/lib/tsan/libcxx_tsan_x86_64/lib/libc++.so.1: undefined symbol: __tsan_init

This obviously works for tsan-binaries but not if some toolchain binaries are linked against libc++.

I normally build on my mac, and haven't actually seen these errors. Can you tell me how to reproduce? Should I build/test on Linux?

Right. That's why LD_LIBRARY_PATH is evil, as are environment variables in general.
IMHO the right way is to set DT_RUNPATH for each executable that wants non-standard libraries. That includes your host compiler.

Right. That's why LD_LIBRARY_PATH is evil, as are environment variables in general.
IMHO the right way is to set DT_RUNPATH for each executable that wants non-standard libraries. That includes your host compiler.

Sound good, but won't you still need to unset LD_LIBRARY_PATH in lit.

Unless I'm missing something, it should not be necessary if we add -Wl,-rpath flags to the right %clang substitutions.

IIUC, the search order is:

  • LD_PRELOAD, if set
  • DT_RPATH, if DT_RUNPATH is not set
  • LD_LIBRARY_PATH unless priviledged, i.e., setuid/setgid bits set
  • DT_RUNPATH (when linked with --enable-new-dtags)

So, DT_RUNPATH will not trump LD_LIBRARY_PATH unless setuid/setgid bits are set, which is not the case for these tests.

Oh, sorry, I did not notice that you said "unset".
Maybe. On the other hand, LD_LIBRARY_PATH could be set for a reason other than libc++. It sounds like whether we do it or not, we may break some users.

No worries. Like Jonas said, it's an ordering problem, so if you could prepend the correct paths to LD_LIBRARY_PATH, it should work -- perhaps easier said than done.

Again, I think this could be done on a test by test basis which lit should be able to handle.

I was half-way through implementing this and wanted to align tsan where we might have the same problem (for some reasons unknown to me the tests don't fail there?). However, setting environment variables like LD_LIBRARY_PATH in lit means that also the compiler will run in this environment:

symbol lookup error: <...>/projects/compiler-rt/lib/tsan/libcxx_tsan_x86_64/lib/libc++.so.1: undefined symbol: __tsan_init

This obviously works for tsan-binaries but not if some toolchain binaries are linked against libc++.

I normally build on my mac, and haven't actually seen these errors. Can you tell me how to reproduce? Should I build/test on Linux?

This deserves some longer explanation and it was getting a bit late yesterday: The error only comes up with some local changes implementing the approach with LD_LIBRARY_PATH.

The "problem" is that I'm building with LLVM_ENABLE_LIBCXX, so bin/clang is linked against libc++. Additionally the "host" clang (that I'm using to build LLVM) was built the same way, so there already is a libc++.so in LD_LIBRARY_PATH. If some tests are linked with -Wl,-rpath and use DT_RUNPATH, the library in LD_LIBRARY_PATH takes precedence and the instrumented one is not used. If we however add the instrumented libc++ to LD_LIBRARAY_PATH, bin/clang will also use it which results in the symbol error mentioned above when building tsan tests.

No worries. Like Jonas said, it's an ordering problem, so if you could prepend the correct paths to LD_LIBRARY_PATH, it should work -- perhaps easier said than done.

Again, I think this could be done on a test by test basis which lit should be able to handle.

Yep, but as explained above we have the ordering issue both when compiling (where we don't want to have the instrumented libc++) and when executing the test (that's where we want the instrumented version!). That's why I wanted to use the "old" DT_RPATH which is evaluated before looking at LD_LIBRARY_PATH and is only set for the compiled test, not for bin/clang. If you are strongly against that, the only other possibility I see is setting LD_LIBRARY_PATH when executing the test, ie add a substitution like %load_instrumented_libcxx which evaluates to env LD_LIBRARY_PATH=.... This would avoid the problems when bin/clang also uses libc++.

Thanks for the explanation.

Since it's a problem with your particular environment, I think you should fix that instead. You can eliminate the need to export LD_LIBRARY_PATH to run your host compiler by creating a simple script and symlink it to clang and clang++, e.g., here's a simple example...

$ cat run_clang.sh
#!/bin/sh
name=`basename $0`
prefix=<host clang prefix>
LD_LIBRARY_PATH=$prefix/lib $prefix/bin/$name $@
$ ln -s run_clang.sh clang
$ ln -s run_clang.sh clang++

Then you can get rid of the global LD_LIBRARY_PATH, and run cmake like this CC=<path to run_clang.sh>/clang cmake ../llvm. No need to pass CXX since cmake will set it automatically based on CC.

Since it's a problem with your particular environment, I think you should fix that instead.

I strongly disagree: Environment variables are used like that in the wild. And the same thing will happen when libc++ is installed to /usr/lib one day in the future.

Since it's a problem with your particular environment, I think you should fix that instead.

I strongly disagree: Environment variables are used like that in the wild. And the same thing will happen when libc++ is installed to /usr/lib one day in the future.

Pretty sure /usr/lib gets searched last, i.e., DT_RUNPATH always wins unless LD_LIBRARY_PATH is set. As a general rule, only set LD_LIBRARY_PATH as a last resort, and if you have a special case, use a wrapper script.

Pretty sure /usr/lib gets searched last, i.e., DT_RUNPATH always wins unless LD_LIBRARY_PATH is set.

That's right, unless it's also in LD_LIBRARY_PATH

As a general rule, only set LD_LIBRARY_PATH as a last resort, and if you have a special case, use a wrapper script.

You are the first person to say that and in general the tests shouldn't fail just because someone needed to use the "last resort".

Pretty sure /usr/lib gets searched last, i.e., DT_RUNPATH always wins unless LD_LIBRARY_PATH is set.

That's right, unless it's also in LD_LIBRARY_PATH

But that's my point. You should avoid setting LD_LIBRARY_PATH unless you have no other choice.

As a general rule, only set LD_LIBRARY_PATH as a last resort, and if you have a special case, use a wrapper script.

You are the first person to say that and in general the tests shouldn't fail just because someone needed to use the "last resort".

Well, perhaps the second, i.e., s/evil/last resort/:

In D42390#987043, @eugenis wrote:
Right. That's why LD_LIBRARY_PATH is evil, as are environment variables in general.
IMHO the right way is to set DT_RUNPATH for each executable that wants non-standard libraries. That includes your host compiler.

Well, perhaps the second, i.e., s/evil/last resort/:

In D42390#987043, @eugenis wrote:
Right. That's why LD_LIBRARY_PATH is evil, as are environment variables in general.
IMHO the right way is to set DT_RUNPATH for each executable that wants non-standard libraries. That includes your host compiler.

I understood this to be related to setting environment variables in the test. As the name suggest they are meant for the environment.

hintonda requested changes to this revision.EditedJan 25 2018, 2:16 AM

Well, perhaps the second, i.e., s/evil/last resort/:

In D42390#987043, @eugenis wrote:
Right. That's why LD_LIBRARY_PATH is evil, as are environment variables in general.
IMHO the right way is to set DT_RUNPATH for each executable that wants non-standard libraries. That includes your host compiler.

I understood this to be related to setting environment variables in the test. As the name suggest they are meant for the environment.

I'll let @eugenis speak for himself, but in general LD_LIBRARY_PATH is not a good idea. A quick google search will provide a bunch of links to blogs and articles that make a much stronger case than I can.

This revision now requires changes to proceed.Jan 25 2018, 2:16 AM

I'll let @eugenis speak for himself, but in general LD_LIBRARY_PATH is not a good idea. A quick google search will provide a bunch of links to blogs and articles that make a much stronger case than I can.

Okay, rethinking about this, we probably can't get rid of it in our environment because we use the modules system (http://modules.sourceforge.net/) which heavily relies on modifying environment variables.

In general, would patches be acceptable (maybe in a different form) that solves this kind "problem"?

I'll let @eugenis speak for himself, but in general LD_LIBRARY_PATH is not a good idea. A quick google search will provide a bunch of links to blogs and articles that make a much stronger case than I can.

Okay, rethinking about this, we probably can't get rid of it in our environment because we use the modules system (http://modules.sourceforge.net/) which heavily relies on modifying environment variables.

In general, would patches be acceptable (maybe in a different form) that solves this kind "problem"?

I think unsetting LD_LIBRARY_PATH in the tests sounds reasonable -- many tests already do that.
However, I'm still unclear why doing this in compiler-rt/test/msan/Unit/ didn't work. Can't you just unset LD_LIBRARY_PATH for those tests (not all tests and not globally, just add a lit.cfg file that does in that directory)?

I think unsetting LD_LIBRARY_PATH in the tests sounds reasonable -- many tests already do that.
However, I'm still unclear why doing this in compiler-rt/test/msan/Unit/ didn't work. Can't you just unset LD_LIBRARY_PATH for those tests (not all tests and not globally, just add a lit.cfg file that does in that directory)?

Which tests do that? AFAICS the asan tests you have been referring to add something to LD_LIBRARY_PATH...

I think unsetting LD_LIBRARY_PATH in the tests sounds reasonable -- many tests already do that.
However, I'm still unclear why doing this in compiler-rt/test/msan/Unit/ didn't work. Can't you just unset LD_LIBRARY_PATH for those tests (not all tests and not globally, just add a lit.cfg file that does in that directory)?

Which tests do that? AFAICS the asan tests you have been referring to add something to LD_LIBRARY_PATH...

Perhaps I should have said manipulate/set, etc. The point is that is not unreasonable to change it.

However, I would again suggest you get away from relying on magic environment variables. As far as I can tell, you are the only one having this issue.

Hahnfeld abandoned this revision.Jan 25 2018, 4:46 AM

However, I would again suggest you get away from relying on magic environment variables. As far as I can tell, you are the only one having this issue.

As I said we won't and most of HPC clusters use this way to install multiple compilers etc. It's probably just that I'm using an uncommon configuration to build LLVM (LLVM_ENABLE_LIBCXX + LLVM_ENABLE_LLD + clang host compiler linked against libc++ which is in LD_LIBRARY_PATH). Because I can't see a correct way to fix this in all cases, I'm abandoning this revision and will keep the patch locally.