This patch adds support for linking in the OpenMP math wrappers library.
The math library first replaces every math call with a call to an
OpenMP wrapper. The wrapper is linked early against a library that maps
it to the original math function. This is necessary to get access to
the math function symbols without including the math header, which
contains code that is incompatible with the GPU. A second library maps
the wrapper functions to their device library versions; it is linked
late, when doing LTO. libdevice must be linked after this mapping
library, and also late when doing LTO. Unfortunately, doing LTO with
libdevice is very slow right now.
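
As a rough illustration of the wrapper indirection described above, here is a minimal host-only sketch in C. The names omp_wrapper_fabs and twice_abs are hypothetical and are not the symbols used by the actual library; the late, LTO-time mapping to the device library is not modeled. In the real scheme the pieces would presumably live in separate libraries rather than in one file.

```c
#include <math.h>

/* Hypothetical wrapper symbol (assumption; the real names are not given here). */
double omp_wrapper_fabs(double x);

/* Stand-in for the library linked early: it associates the wrapper
   with the original math function. */
double omp_wrapper_fabs(double x) { return fabs(x); }

/* Stand-in for user code after the replacement step: it refers only
   to the wrapper, never to the math function directly. */
double twice_abs(double x) { return 2.0 * omp_wrapper_fabs(x); }
```

Because the user code only ever names the wrapper, the definition behind it can be swapped for a device-specific one at a later link step without touching the caller.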
Depends on D121467
FWIW, I don't believe this amdgpu/nvptx separation is actually improving things; on the contrary.