This is an archive of the discontinued LLVM Phabricator instance.

[openmp] Drop requirement on library path environment variables
Needs ReviewPublic

Authored by JonChesterfield on May 5 2021, 4:37 PM.

Details

Summary

Involves multiple independent changes, intent is to land one piece at a time.
This diff provides a big picture of one way to avoid needing to specify
LD_LIBRARY_PATH and/or LIBRARY_PATH in order to run any openmp offloading code.
Specific details of the implementation are not necessarily interesting - e.g.
dlinfo appears to be impossible to use safely, so will drop that.

This diff (and associated openmp-dev email to be written shortly) is to start
a discussion on whether requiring users to set LD_LIBRARY_PATH in order to run
any openmp application is what we want.

Reviewers are a guess at interested parties, feel free to add anyone else.

Diff Detail

Event Timeline

JonChesterfield created this revision.May 5 2021, 4:37 PM
JonChesterfield requested review of this revision.May 5 2021, 4:37 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 5 2021, 4:37 PM
JonChesterfield added inline comments.May 5 2021, 5:03 PM
clang/lib/Driver/ToolChains/CommonArgs.cpp
648

Similar to other functions in this file, derived from aomp (by deleting some stuff for finding a debug version of the library)

I can make my peace with runpath instead if people are keen to override this with LD_LIBRARY_PATH. The rest of clang's toolchains wire things up with rpath but that doesn't mean we have to make the same choice here.

1703

Otherwise we need LIBRARY_PATH to run from the build tree, which is awkward for tests and for ad hoc running from the build tree. This path is checked last, so only changes behaviour from error message to success when the file exists here. Split out in D101935

clang/tools/amdgpu-arch/CMakeLists.txt
17 ↗(On Diff #343235)

Lets amdgpu-arch work from the build tree, split out in D101926

openmp/libomptarget/src/rtl.cpp
99

This logic is cut from D73657 without addressing any of the review comments. Idea is to look for the offloading plugins next to libomptarget, instead of in dlopen's default search path. Will address the previous comments when splitting out to a separate patch.

openmp/libomptarget/test/lit.cfg
72–73

For copy & paste of a RUN line from a failing test

182

Because otherwise we write --cuda-path=CUDA_TOOLKIT_ROOT_DIR-NOTFOUND at the top of non-nvptx tests, which is harmless but ugly

JonChesterfield added inline comments.May 5 2021, 5:10 PM
openmp/libomptarget/src/rtl.cpp
99

There's a D87413 that I didn't know about before seeing a comment on D73657 just now. That uses RTLD_DI_LINKMAP with dlinfo instead. I'm leaning towards dladdr because it seems to exist on more platforms but haven't looked into the options closely yet.

Thank you for looking into this!
This also fixes the problem of not being able to find libomp, right?

clang/lib/Driver/ToolChains/CommonArgs.cpp
648

I think it would be a shame if this would be the only thing
that *doesn't* support being overriden with LD_LIBRARY_PATH.
I'm not sure about libomptarget, but it i think it would be good to keep such possibility for libomp.

Yes. Libomp and libomptarget are in the same directory, so the rpath/runpath change catches both of them. Libomptarget then needs to find the plugins to load, either through library path or something equivalent to the above.

clang/lib/Driver/ToolChains/CommonArgs.cpp
648

The search order is 'rpath' then 'LD_LIBRARY_PATH' then 'runpath'. Presently we set no default at all and require the user to set LD_LIBRARY_PATH or manually link the right library. So using runpath here is backwards compatible, in that all the scripts out there that use LD_LIBRARY_PATH will continue to work. That may force our hand here.

  • rework logic for finding libomptarget.so
JonChesterfield added inline comments.May 6 2021, 7:40 PM
openmp/libomptarget/src/rtl.cpp
99

This part is factored out as D102043

protze.joachim added inline comments.May 7 2021, 10:22 AM
clang/lib/Driver/ToolChains/CommonArgs.cpp
648

Especially for debugging, I like the ability to exchange the OpenMP runtimes by adding the debugging build of the runtimes to LD_LIBRARY_PATH without needing to relink the application, so I'd also prefer runpath.

In the manpage of ld I found:

For a native ELF linker, the directories in "DT_RUNPATH" or "DT_RPATH" of a shared library are searched for shared libraries needed by it. The "DT_RPATH" entries are ignored if "DT_RUNPATH" entries exist.

Does this mean, that adding a runpath here will lead to ignoring the other rpath entries? Or does this only affect linking shared libraries and not linking an application?

openmp/libomptarget/test/lit.cfg
182

This would completely avoid the --cuda-path flag for non-nvptx tests

clang/lib/Driver/ToolChains/CommonArgs.cpp
648

The internet also claims rpath is now deprecated, which I suppose is consistent with ignoring rpath when both are present. So it seems likely that setting runpath here will clobber any (possibly deprecated) rpath that is added elsewhere.

I'm not sure what the right path is from there. People might set rpath on openmp executables, and we shouldn't clobber that if they do. There may be a linker flag along the lines of 'set runpath unless there is already an rpath requested'.

I think it's a bug in the linker to ignore one when the other is present but backwards compatibility will render that a feature.

There is a halfway step where we set rpath (or runpath) on libomp.so so that it can find libomptarget, as we own the libomp.so that is being built at the time, but that doesn't really solve the problem.

I'd note that compiler-rt is always statically linked, which seems a good idea for a compiler runtime, but would totally thwart the desire to replace it with some other runtime without relinking.

We could embed rpath, which gets definitely working out of the box behaviour, with a compiler flag to leave that off for development builds that want dynamic control over the pieces. Where I see that rpath is 'deprecated', but also that the runpath behaviour of clobbering an unrelated rpath makes it unusable in this context.

openmp/libomptarget/test/lit.cfg
182

Yes. That seems like a good thing. Is there a reason we want to pass --cuda-path=CUDA_TOOLKIT_ROOT_DIR-NOTFOUND to all the non-nvptx tests?

JonChesterfield edited the summary of this revision. (Show Details)May 25 2021, 3:18 PM

Just ran into this again. It's really annoying that a test fails, and prints a run line, which can be copied into a terminal where it abjectly fails to run because the environment variables aren't set.

jdenny added a subscriber: jdenny.Jul 29 2021, 8:53 AM

To make debugging of failed tests easier, we could just explicitly include env LD_LIBRARY_PATH=... into each run line - by adding it to the %libomp-run substitution (and the libomptarget equivalent).

To summarize the conversation, can we do LD_LIBRARY_PATH overwrites after this patch or not? If so, I feel everyone is in favor, if not, we need a different solution.

To summarize the conversation, can we do LD_LIBRARY_PATH overwrites after this patch or not? If so, I feel everyone is in favor, if not, we need a different solution.

+1

JonChesterfield added a comment.EditedAug 23 2021, 11:34 AM

Pasting env LD_LIBRARY_PATH= and env LIBRARY_PATH into the printed commandline, as opposed to through magic, would solve most of my day to day frustration here. libomp.so and libomptarget.so are in two different directories, LD_LIBRARY_PATH turns out to be colon delimited. LIBRARY_PATH is presently used to find the deviceRTL though I'd like to change that.

That doesn't solve any of the experience for our users but does help with debugging the tests. I'm currently trying to work out why the set of tests that fail under lit are different to the set that fail outside of lit and decreasing magic there might help me.

edit: oh, that's a slightly different question.

I don't think we can set rpath or runpath on user executables. We don't know what else they're doing to the executable. But we could definitely teach our libomp.so how to find libomptarget.so (can use runpath to allow environment override as that seems to be preferred), and if that is combined with looking for plugins next to libomptarget.so then all the host side libraries would hang together OK.

clang/lib/Driver/ToolChains/CommonArgs.cpp
648

Mutating the binary to let it find libomp.so (like this, or with runpath) is convenient for users who want that and very inconvenient for those who are using rpath for some other purpose. What we could do is change the build script for libomp.so to add a relative runpath to let it find libomptarget.so. That makes the test infra slightly simpler.

It also means a user only has to arrange for their binary to find libomp.so on their own without worrying about the other pieces, so doesn't necessarily have to use LD_LIBRARY_PATH to do it.

openmp/libomptarget/test/lit.cfg
182

This should disappear under a rebase

Do I understand correctly that adding runpath to libomp.so will help it find libomptarget.so *and* still allows users to use LD_LIBRARY_PATH to make sure a different libomptarget.so is found?

If the above is the case, can't we do the same for clang?

Asked differently, how does clang find things like compiler-rt?

JonChesterfield added a comment.EditedAug 23 2021, 11:57 AM

Do I understand correctly that adding runpath to libomp.so will help it find libomptarget.so *and* still allows users to use LD_LIBRARY_PATH to make sure a different libomptarget.so is found?

If the above is the case, can't we do the same for clang?

Asked differently, how does clang find things like compiler-rt?

If we set runpath on libomp.so it'll be able to find libomptarget.so without environment variables. If the user sets LD_LIBRARY_PATH, a libomptarget.so on that path will be used instead.

Clang can definitely set rpath or runpath on the user binary, that's what addOpenMPRuntimeSpecificRPath does. That way no environment variables are needed. The set of options are then:

  • User has otherwise set nothing on the binary, either rpath or runpath works great
  • User has set runpath and we set runpath, linker should put both in. Not sure which will go first. Probably fine
  • User has set rpath and we set rpath, linker should put both in. Not sure which will go first. Probably fine
  • User has set rpath and we set runpath. Linker will ignore the user rpath, so we've probably broken the binary
  • User has set runpath and we set rpath. Harmless, but we'll be ignored

compiler-rt is different because it's statically linked into every binary and every shared library, and is carefully written so that it doesn't matter if the final running system has loads of copies of it as a result. If we statically linked our libraries none of the above would be an issue but developers (and I suppose users) would no longer be able to pick and choose at runtime.

There might be a linker flag (which might even exist on all the linkers users might use) to control the behaviour where any runpath means ignore all rpath. What we'd really like is to set r(un)path on the user binary unless something else is already doing so, but I don't think we can have that information.

edit: did a little reading trying to guess the current state of play. It's not what I hoped for.

Interesting bug reports out there like https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/1253638 - in addition to being lower priority than rpath, runpath also (at least on some systems) fails to influence the search for transitive dependencies. That probably means runpath on libomp.so will find libomptarget.so, but won't let libomptarget.so find the plugins. There seems to be some discussion about whether that's how it is supposed to work or not.

If we go with my preference to use whatever plugin is in the same directory as the libomptarget.so the missing transitive search will be OK as the dependency chain is (just) short enough.

It's not obvious to me what makes 'rpath' deprecated relative to 'runpath', the latter seems strictly less useful. Also, there are linker flags that control which one of the two you get when asking for rpath (enable-new-dtags).

Suggestion therefore:

  • Add a cmake flag that disables all the convenient lookup logic for people using systems where it just doesn't work
  • Look for plugins next to libomptarget as transitive search may not work, failing that we need to set rpath on libomptarget so that it can find the plugins that are next to it
  • Set 'rpath' on libomp.so pointing at libomptarget.so. What other linker flags the user specifies may interact with that. Some systems will treat that as rpath and others as runpath. We could pass linker flags to try to force runpath and hope the linker silently ignores the new-dtags argument if it doesn't know it
  • Tag @dim as freebsd's linker probably does different stuff to bfd/gold/lld

This is all pretty bad. The current state of play where users build an application with clang that then refuses to run until they've set three environment paths is however also bad, so any of this that we can automate away I think we should.

edit2: a comment in one of the bug reports is

RUNPATH is optionally considered for both shared objects and executables, while RPATH is ignored for shared objects

I expect rpath to work on shared objects but that may be an implementation quirk. If we set whatever the linker does with 'rpath' on libomp.so (and possibly libomptarget.so) and the linker or loader ignores it, at least we've tried to do the convenient thing.

Pasting env LD_LIBRARY_PATH= and env LIBRARY_PATH into the printed commandline, as opposed to through magic, would solve most of my day to day frustration here. libomp.so and libomptarget.so are in two different directories, LD_LIBRARY_PATH turns out to be colon delimited. LIBRARY_PATH is presently used to find the deviceRTL though I'd like to change that.

The lit config has platform-specific rules to build the environmental variables (including the use of the proper separators). I suggest to used these values to create the printed env expressions.

The lit config has platform-specific rules to build the environmental variables (including the use of the proper separators). I suggest to used these values to create the printed env expressions.

Yep. That seems an unconditional win, however the rpath/runpath stuff works out.

There are 3 problems here (ignoring our test setup which should be discussed separately):

  1. make sure clang finds libomp.so
  2. make sure libomp.so (or clang?) finds libomptarget.so
  3. make sure libomptartget.so finds the plugins

All of which have to work nicely with LD_LIBRARY_PATH.

I think for 3 you can look at the current dir. That was discussed and, as long as it does work with LD_LIBRARY_PATH, that seems a win.

For 2, why don't we install them in the same place? If so, we reduce it to problem 1) [after applying solution to 3)] no?

For 1, doing something always backwards compatible that may or may not work on some platforms and configurations seems best. You had a proposal here already. If that works, let's do it.

Test setup can be fixed independently (and possibly should be).

D102043 is newly simplified. It looks for plugins next to libomptarget.so, which means it can find them even when the application uses a non-transitive method to find libomptarget.so.

Turns out libomptarget.so is linked directly to the application, not via libomp.so as I believed, and they're installed next to each other so whatever finds the first will find the second.

ye-luo added a comment.EditedAug 30 2021, 8:45 PM

There are 3 problems here (ignoring our test setup which should be discussed separately):

  1. make sure clang finds libomp.so
  2. make sure libomp.so (or clang?) finds libomptarget.so
  3. make sure libomptartget.so finds the plugins

All of which have to work nicely with LD_LIBRARY_PATH.

I think for 3 you can look at the current dir. That was discussed and, as long as it does work with LD_LIBRARY_PATH, that seems a win.

For 2, why don't we install them in the same place? If so, we reduce it to problem 1) [after applying solution to 3)] no?

For 1, doing something always backwards compatible that may or may not work on some platforms and configurations seems best. You had a proposal here already. If that works, let's do it.

For 1. If users prefer linking an alternative libomp.so, they should not use -fopenmp at linking and link libomp.so explicitly as a regular library. If users add -fopenmp to clang at linking, clang needs to link the libomp.so shipped with clang and set rpath to ensure libomp.so can be found at run even libomp.so doesn't exist on LD_LIBRARY_PATH. In this way, if a user would like to switch to anther libomp.so, they can still specify LD_LIBRARY_PATH.
For 2. Is libomp.so aware of libomptarget.so? I doubt. Anyway I'd like to see a similar logic as case 1 and the control option is -fopenmp-targets.

So addOpenMPRuntimeSpecificRPath seems being aligned with what I described.