This is an archive of the discontinued LLVM Phabricator instance.

[openmp] No longer use LIBRARY_PATH to find devicertl
ClosedPublic

Authored by JonChesterfield on Sep 1 2021, 6:29 AM.

Details

Summary

Given D109057, change test runner to use the libomptarget-x-bc-path
argument instead of the LIBRARY_PATH environment variable to find the device
library.

Also drop the use of LIBRARY_PATH environment variable as it is far
too easy to pull in the device library from an unrelated toolchain by accident
with the current setup. No loss in flexibility to developers as the clang
commandline used here is still available.

Diff Detail

Event Timeline

JonChesterfield created this revision.Sep 1 2021, 6:29 AM
JonChesterfield requested review of this revision.Sep 1 2021, 6:29 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptSep 1 2021, 6:29 AM
JonChesterfield added inline comments.Sep 1 2021, 6:30 AM
clang/lib/Driver/ToolChains/CommonArgs.cpp
1713–1715

This part is contained in D109057 which is intended to land first

This revision is now accepted and ready to land.Sep 1 2021, 9:07 AM

I suppose it require users to add a command line argument every time they want to build a program?

This revision was landed with ongoing or failed builds.Sep 1 2021, 12:24 PM
This revision was automatically updated to reflect the committed changes.

Nope, works the same as it used to - looks relative to clang. However I think it's broken some CI machines

JonChesterfield reopened this revision.Sep 1 2021, 12:44 PM
This revision is now accepted and ready to land.Sep 1 2021, 12:44 PM

Nope, works the same as it used to - looks relative to clang. However I think it's broken some CI machines

Well, it also breaks when OpenMP is built standalone.

From my perspective, using LIBRARY_PATH to find bitcode library conforms with convention because the library is linked in a linkage during compilation time, where LIBRARY_PATH is used by the linker. If we don't want to mess up test results, the patch to force rpath should do the right thing.

Using LIBRARY_PATH to find the bitcode is a reasonable interpretation of LIBRARY_PATH but it's a disaster in terms of using clang on a system that has LIBRARY_PATH pointing at some other version of clang. Rpath will be ignored by the bitcode lib handling.

How does this break under standalone? It uses the same config.variable that is used for LIBRARY_PATH at the moment, so it looks in exactly the same place as before this patch

Using LIBRARY_PATH to find the bitcode is a reasonable interpretation of LIBRARY_PATH but it's a disaster in terms of using clang on a system that has LIBRARY_PATH pointing at some other version of clang. Rpath will be ignored by the bitcode lib handling.

How does this break under standalone? It uses the same config.variable that is used for LIBRARY_PATH at the moment, so it looks in exactly the same place as before this patch

OpenMP libraries don't always sit next to clang's default library path. For standalone build, the installation path of OpenMP can be different from that of LLVM. Before this patch, clang can find the library as long as openmp/lib is in LIBRARY_PATH. Now it can never find it except we pass the path explicitly to the compilation command line. I understand your argument that clang might find a wrong one, but it's user's responsibility to make sure the expected version should be on the top of LIBRARY_PATH.

That helps disambiguate. You aren't concerned that this change will break the in tree tests, rather that people who currently rely on LIBRARY_PATH to choose a bitcode library to use with a clang located somewhere else in the filesystem will now need to pass --libomptarget-nvptx-bc-path instead, or put a symlink relative to clang.

That is a huge win for anyone who has LIBRARY_PATH pointing to one toolchain and who wants to be able to use clang from another as it replaces (non-diagnosed) runtime misbehaviour with a compile time error about being unable to find the devicertl. E.g. if they've had to set LIBRARY_PATH to find something their application uses that is unrelated to that clang, perhaps a vendor toolchain.

It's also a moderate win in that it hopefully dissuades people from assuming that any clang can be used with any openmp offloading library, which is very much not true.

It breaks the use case of people who have an exactly matching clang and openmp, that they installed in unrelated places, and use LIBRARY_PATH to stitch back together because they don't have to use it for anything else. They get to use the commandline arguments instead.

I think that's great all around. It makes it much less likely that people will mismatch clang and devicertl, which is good for UX and for our bug handling. If we need to preserve the environment plumbing then let's have a new variable named for the purpose, so that it doesn't collide with LIBRARY_PATH, and still wire tests together with the commandline argument as it lets people copy & paste the failing line into a new terminal to debug it.

  • Update tests
JonChesterfield requested review of this revision.Sep 1 2021, 3:28 PM

Sending back to review now that the test updates are included and LIBRARY_PATH removed from the test setup entirely

clang/test/Driver/openmp-offload-gpu.c
151–157

Tests didn't change much - replace env LIBRARY_PATH with --libomptarget-nvptx-bc-path. Dropped the one that only used LIBRARY_PATH and added one for the new runtime library using directory. Slightly reordered and updated comments.

In order to tackle the issues, I think we can do it this way: put the clang default lib ahead of LIBRARY_PATH. This can make sure clang always uses the bitcode library that comes with it, and if it is not there, it also has the ability to find a potential one from LIBRARY_PATH.

@tianshilei1992 can we agree that the tests must point to a specific deviceRTL, and not dig one out of LIBRARY_PATH? That can be factored out of this patch to make the tests always use exactly the library that was just built, and also makes them easier to run outside of the test harness.

Discussed at weekly call. Going to go with the priority order:
1/ commandline argument
2/ look next to clang
3/ LIBRARY_PATH
4/ error

That means people who have installed clang+openmp together have something that works out of the box, and they can override the deviceRTL if they wish. It also means people who have installed matching clang/openmp in different locations and wired them together with LIBRARY_PATH will continue to work. I'll amend the above to do that.

@tianshilei1992 can we agree that the tests must point to a specific deviceRTL, and not dig one out of LIBRARY_PATH? That can be factored out of this patch to make the tests always use exactly the library that was just built, and also makes them easier to run outside of the test harness.

For test, IMO we could do whatever we want. That’s why I accepted the patch regarding rpath.

  • Keep LIBRARY_PATH search, as fallback if local file is missing
  • Use regex for windows path compat
tianshilei1992 added inline comments.Sep 8 2021, 5:00 PM
clang/test/Driver/openmp-offload-gpu.c
153

Seems like we don't have a test for the fallback version now?

JonChesterfield added inline comments.Sep 9 2021, 1:52 AM
clang/test/Driver/openmp-offload-gpu.c
153

Yes, indeed. That seemed reasonable to me, but can add some more stub files and check they can be found. I'll do that today - would really like this to go in soon enough that it can make it to the llvm-13 branch.

tianshilei1992 added inline comments.Sep 9 2021, 6:07 AM
clang/test/Driver/openmp-offload-gpu.c
153

Yeah, please do. If we provide one feature, it’s better to be tested.

  • Add test checking LIBRARY_PATH is used
tianshilei1992 accepted this revision.Sep 9 2021, 8:45 AM

LGTM. It's good to go if the test is green.

This revision is now accepted and ready to land.Sep 9 2021, 8:45 AM
This revision was landed with ongoing or failed builds.Sep 9 2021, 9:16 AM
This revision was automatically updated to reflect the committed changes.