The OpenMP linker script is known to cause problems for the gold and lld linkers on Linux, and it will also cause problems when Windows support is enabled in the future. This is sufficient justification for eliminating the OpenMP linker script and replacing it with a portable solution. This patch contains prototype changes that implement such a solution.
First, a brief explanation of how the OpenMP linker script is used in the existing implementation. The linker script is dynamically generated by the clang driver and added to the host link command to fulfil the following tasks:
- Insert device binaries into the host binary at link time as data (this makes the host binary fat)
- Create a pair of symbols for the start/end addresses of each device binary
- Create a pair of symbols for the start/end addresses of the compiler-generated offload entry table
All symbols created by the linker script are used by the offload registration code that the compiler adds to each host object as a comdat group. This compiler-generated code consists of a pair of data objects (the device binary descriptor) that use those symbols as initializers, and two functions. One function registers the device binary descriptor with the OpenMP runtime at program startup and the other unregisters it. BTW, having offload registration code in each host object is not good, because it makes the host object dependent on a particular list of targets (the device binary descriptor depends on the offload targets).
This patch implements an alternative solution for the above tasks. Device binaries are inserted into the host binary with the help of a wrapper bit-code file, which contains the device binaries as data together with the offload registration code that registers them with the offload runtime (the first two tasks in the list above). The wrapper bit-code file is dynamically created by the clang driver with the help of a new tool, clang-offload-wrapper, which takes device binaries as input and produces a bit-code file with the required contents. The wrapper bit-code is then compiled to an object, and the clang driver appends the resulting object to the host link.
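To make the shape of the wrapper object more concrete, here is a rough C sketch of what it conceptually carries: the device image as data, a descriptor pointing at it, and a pair of startup/shutdown functions. All names here (demo_device_image, demo_bin_desc, demo_register_lib, demo_unregister_lib) are hypothetical and the runtime is a local stub; the real descriptor layout and runtime entry points are whatever the patch and the offload runtime define. It assumes GCC or Clang for the constructor/destructor attributes.

```c
#include <stddef.h>

/* Hypothetical descriptor types, for illustration only. */
struct demo_device_image {
    const void *image_start; /* first byte of the device binary */
    const void *image_end;   /* one past the last byte */
};

struct demo_bin_desc {
    size_t num_images;
    const struct demo_device_image *images;
};

/* Stand-in for a device binary embedded as data in the wrapper object. */
static const unsigned char demo_image_bytes[] = { 0x7f, 'E', 'L', 'F' };

static const struct demo_device_image demo_images[] = {
    { demo_image_bytes, demo_image_bytes + sizeof(demo_image_bytes) },
};

static const struct demo_bin_desc demo_desc = { 1, demo_images };

/* Stub "runtime": it only remembers the last registered descriptor. */
static const struct demo_bin_desc *demo_registered = NULL;

static void demo_register_lib(const struct demo_bin_desc *desc) {
    demo_registered = desc;
}

static void demo_unregister_lib(const struct demo_bin_desc *desc) {
    if (demo_registered == desc)
        demo_registered = NULL;
}

/* Runs before main: registers the descriptor with the stub runtime. */
__attribute__((constructor)) static void demo_offload_ctor(void) {
    demo_register_lib(&demo_desc);
}

/* Runs at exit: unregisters the descriptor again. */
__attribute__((destructor)) static void demo_offload_dtor(void) {
    demo_unregister_lib(&demo_desc);
}

/* Size in bytes of the first registered image, or 0 if none is registered. */
size_t demo_registered_image_size(void) {
    if (demo_registered == NULL || demo_registered->num_images == 0)
        return 0;
    const struct demo_device_image *img = &demo_registered->images[0];
    return (size_t)((const unsigned char *)img->image_end -
                    (const unsigned char *)img->image_start);
}
```

Because the descriptor and the registration functions live in one separately generated object, the host objects in this sketch need no knowledge of which offload targets were requested.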
Start/end symbols around the offload entry table (the third task in the list above) are added by the linker, which defines __start_name/__stop_name symbols to satisfy unresolved references for ELF sections whose name is representable as a C identifier (see https://sourceware.org/binutils/docs/ld/Input-Section-Example.html for details). On Windows, start/end symbols can be defined in the wrapper bit-code file with the help of section grouping (see https://docs.microsoft.com/en-us/windows/win32/Debug/pe-format#grouped-sections-object-only); Windows support still needs to be added in the future.
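The ELF mechanism can be illustrated with a small C sketch. The section name demo_offload_entries and the entry layout are made up for this example and are not the names used by the patch; it assumes GCC or Clang and an ELF linker (GNU ld, gold, or lld) that defines the __start_/__stop_ symbols for sections named as C identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry layout, for illustration only. */
struct demo_offload_entry {
    const char *name;
    size_t size;
};

/* Place an entry in a section whose name is a valid C identifier. */
#define DEMO_OFFLOAD_ENTRY(id, entry_name, entry_size)              \
    static const struct demo_offload_entry id                       \
        __attribute__((used, section("demo_offload_entries"))) =    \
            { entry_name, entry_size }

DEMO_OFFLOAD_ENTRY(demo_entry_foo, "foo", 16);
DEMO_OFFLOAD_ENTRY(demo_entry_bar, "bar", 32);

/* Not defined in any source file: the linker provides these symbols
   because the section name is representable as a C identifier. */
extern const struct demo_offload_entry __start_demo_offload_entries[];
extern const struct demo_offload_entry __stop_demo_offload_entries[];

/* Number of entries the linker collected into the section. */
size_t demo_count_offload_entries(void) {
    assert(__stop_demo_offload_entries >= __start_demo_offload_entries);
    return (size_t)(__stop_demo_offload_entries -
                    __start_demo_offload_entries);
}
```

No linker script is involved here: the table bounds come entirely from the section name, which is what makes the approach portable across ELF linkers.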
Do we really need this new kind of job here? Can we use the bundler instead?