This is an archive of the discontinued LLVM Phabricator instance.

[clang-offload-wrapper] Lower priority of __tgt_register_lib in favor of __tgt_register_requires
ClosedPublic

Authored by grokos on Feb 26 2020, 5:33 PM.

Details

Summary

Currently, the offload-wrapper tool inserts __tgt_register_lib to the list of global ctors of a target module with Priority=0. This means that it's got the same priority as __tgt_register_requires and the order in which these two functions are called in not guaranteed. Ideally, we'd like to call __tgt_register_requires BEFORE loading a libomptarget plugin (which is one of the actions happening inside __tgt_register_lib). The reason is that we want to know which requirements the user has asked for so that upon loading the plugin libomptarget can report how many devices there are that can satisfy the requirements.

E.g. with the current implementation we can run into the following problem:

  1. The user requests unified_shared_memory but the available devices on the system do not support this feature.
  2. Initially, the offload policy is set to tgt_default.
  3. __tgt_register_lib is called and the plugin for the specific target device reports there are N>0 available devices.
  4. Consequently, the offload policy is set to tgt_mandatory.
  5. __tgt_register_requires is called and we find out that the unified_shared_memory requirement cannot be satisfied.
  6. Offload fails and because the offload policy had been set to mandatory libomptarget terminates the application.

With the proposed change things will proceed as follows:

  1. The user requests unified_shared_memory but the available devices on the system do not support this feature.
  2. Initially, the offload policy is set to tgt_default.
  3. __tgt_register_requires is called and registers the unified_shared_memory requirement with libomptarget.
  4. __tgt_register_lib is called and the plugin for the specific target device reports that the unified_shared_memory requirement cannot be satisfied, so there are N=0 available devices.
  5. Consequently, the offload policy is set to tgt_disabled.
  6. Execution falls back on the host instead of terminating the application.

Diff Detail

Event Timeline

grokos created this revision.Feb 26 2020, 5:33 PM
vzakhari added inline comments.Feb 26 2020, 5:43 PM
clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp
291–292

I think __tgt_unregister_lib should have a matching priority.

grokos updated this revision to Diff 246870.Feb 26 2020, 6:52 PM
grokos marked an inline comment as done.

@ABataev, @jdoerfert, does the patch look good to you? Can someone accept it if it's ready to go?

This revision is now accepted and ready to land.Mar 3 2020, 11:59 AM
This revision was automatically updated to reflect the committed changes.