Currently, the offload-wrapper tool inserts __tgt_register_lib into the list of global ctors of a target module with Priority=0. This gives it the same priority as __tgt_register_requires, so the order in which these two functions are called is not guaranteed. Ideally, we'd like to call __tgt_register_requires BEFORE loading a libomptarget plugin (which is one of the actions happening inside __tgt_register_lib). The reason is that we want to know which requirements the user has asked for, so that upon loading the plugin libomptarget can report how many devices can satisfy those requirements.
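To illustrate the ordering guarantee this relies on, here is a minimal sketch using the GCC/Clang `constructor(priority)` attribute, where a lower priority number runs earlier. The functions `fakeRegisterRequires` and `fakeRegisterLib` are hypothetical stand-ins, not the real libomptarget entry points, and the priorities 101/102 are chosen only because values up to 100 are reserved for the implementation in user code.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; the real entry points live in
// libomptarget and are registered by the offload-wrapper tool.
static bool RequiresRegistered = false; // set by the "requires" constructor
static bool LibSawRequires = false;     // what the "lib" constructor observed

// In user code, priorities 0-100 are reserved for the implementation, so this
// sketch uses 101 and 102. The lower number runs first.
__attribute__((constructor(101))) static void fakeRegisterRequires() {
  RequiresRegistered = true;
}

__attribute__((constructor(102))) static void fakeRegisterLib() {
  // With distinct priorities, the requirement is guaranteed to be visible here.
  LibSawRequires = RequiresRegistered;
}

// Returns true if the "requires" constructor ran before the "lib" one.
bool libSawRequires() { return LibSawRequires; }

// Convenience check, callable from main() once all constructors have run.
void checkConstructorOrder() { assert(libSawRequires()); }
```

If both functions were given the same priority, the value observed in `fakeRegisterLib` would depend on an order that is not guaranteed, which is the situation described above.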
For example, with the current implementation we can run into the following problem:
- The user requests unified_shared_memory but the available devices on the system do not support this feature.
- Initially, the offload policy is set to tgt_default.
- __tgt_register_lib is called and the plugin for the specific target device reports there are N>0 available devices.
- Consequently, the offload policy is set to tgt_mandatory.
- __tgt_register_requires is called and we find out that the unified_shared_memory requirement cannot be satisfied.
- Offload fails and, because the offload policy had been set to tgt_mandatory, libomptarget terminates the application.
With the proposed change things will proceed as follows:
- The user requests unified_shared_memory but the available devices on the system do not support this feature.
- Initially, the offload policy is set to tgt_default.
- __tgt_register_requires is called and registers the unified_shared_memory requirement with libomptarget.
- __tgt_register_lib is called and the plugin for the specific target device reports that the unified_shared_memory requirement cannot be satisfied, so there are N=0 available devices.
- Consequently, the offload policy is set to tgt_disabled.
- Execution falls back on the host instead of terminating the application.
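
The two sequences above can be sketched as a small model. This is a simplified illustration under stated assumptions, not libomptarget code: `simulate`, `Order`, `Outcome` and the lambdas are hypothetical names, and the policy logic is reduced to what the steps in this description say (N>0 usable devices gives mandatory, N=0 gives disabled).

```cpp
#include <cassert>

// Simplified model of the policy values named in the description.
enum class OffloadPolicy { Default, Disabled, Mandatory };
enum class Outcome { RunsOnDevice, FallsBackToHost, Terminates };
enum class Order { LibFirst, RequiresFirst };

// Hypothetical model: WantsUSM is what the user requested, NumDevices and
// DevicesSupportUSM describe what the plugin finds on the system.
Outcome simulate(Order O, bool WantsUSM, int NumDevices,
                 bool DevicesSupportUSM) {
  OffloadPolicy Policy = OffloadPolicy::Default;
  bool KnownRequiresUSM = false; // what the runtime knows at plugin load time

  // Stand-in for __tgt_register_requires: records the user's requirement.
  auto RegisterRequires = [&] { KnownRequiresUSM = WantsUSM; };

  // Stand-in for __tgt_register_lib: counts the devices that satisfy the
  // requirements known so far and sets the policy from that count.
  auto RegisterLib = [&] {
    bool Satisfiable = !KnownRequiresUSM || DevicesSupportUSM;
    int Usable = Satisfiable ? NumDevices : 0;
    Policy = Usable > 0 ? OffloadPolicy::Mandatory : OffloadPolicy::Disabled;
  };

  if (O == Order::LibFirst) {
    RegisterLib();
    RegisterRequires();
  } else {
    RegisterRequires();
    RegisterLib();
  }

  if (Policy == OffloadPolicy::Disabled)
    return Outcome::FallsBackToHost;
  // Policy is mandatory: an unsatisfiable requirement is now fatal.
  bool OffloadWorks = !WantsUSM || DevicesSupportUSM;
  return OffloadWorks ? Outcome::RunsOnDevice : Outcome::Terminates;
}

// Usage: unified_shared_memory requested, two devices that lack support.
void checkScenario() {
  assert(simulate(Order::LibFirst, true, 2, false) == Outcome::Terminates);
  assert(simulate(Order::RequiresFirst, true, 2, false) ==
         Outcome::FallsBackToHost);
}
```

In this model the only difference between the two runs is the call order, which is enough to turn termination into a host fallback.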
I think __tgt_unregister_lib should have a matching priority.