Math libraries are linked only when -lm is specified. This is because
host system could be missing rocm-device-libs.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp | ||
---|---|---|
260 | I recognise this comment. Is this a bunch of logic that can be moved into the base class and then called from here and hip? |
clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp | ||
---|---|---|
260 | This is copied (after removing stuff related to opencl) from https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/AMDGPU.cpp#L841 |
clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp | ||
---|---|---|
275 | Gone searching. This stuff has already been copied & pasted between AMDGPU.cpp and HIP.cpp. And diverged, looks like HIP has gained a bunch of inverse flags that AMDGPU has not, and some flags are duplicated (OPT_cl_fp32_correctly_rounded_divide_sqrt, OPT_fhip_fp32_correctly_rounded_divide_sqrt). @b-sumner / @t-tye / @yaxunl / @scchan I'd like to suggest that we use the same handling of these flags on opencl / hip / c++ openmp. Since the names have diverged and we don't want to break backwards compatibility, I'd like to check for all three (hip, no_hip, cl) flags for each one, with last one wins semantics, and do that from a single method called by all the amdgpu language drivers. That way hip and opencl will continue working as they do today, except hip will additionally accept some opencl flags and do the right thing, and likewise opencl. If that is unacceptable, the near future involves OPT_fopenmp_fp32_correctly_rounded_device_sqrt and similar, and I'd much rather use the same path for all the languages than copy&paste again. @pdhaliwal getCommonBitcodeLibs will splice in ockl as well as ocml and dependencies. OpenMP doesn't call into ockl so I'd like to avoid that, preferably by moving the ockl append out of getCommonBitcodeLibs @amdgpu in general can we get rid of this oclc_isa_version_xxx.bc file, which only defines a single constant in each case |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
910–920 | I agree that we'd better refactor this part as a common function used by HIP/OpenMP/OpenCL. However I disagree to let HIP use OpenCL options. HIP uses clang generic options or HIP/CUDA shared options to control these flags. I think OpenMP may consider using similar options as HIP instead of OpenCL options. |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
910–920 | It's a mixture. Some flags look generic, some have hip in the name. My preference would be to have a generic named flag for each one, and alias hip/opencp/cuda onto the generic one. Aliasing the flags would mean what works today continues working while simplifying the control flow in clang. Keeping flags with different names that do the same thing is certainly possible but doesn't seem a feature. It makes clang more complicated in order to make user build scripts more complicated. |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
910–920 | OpenCL options are defined by OpenCL spec. There may be difficulties to alias them to other options. I am OK to alias HIP options to more generic options. |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
910–920 | Works for me. We could fold HIP and OpenMP together, both using generic options, and then leave OpenCL to adopt the same if they wish. |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
923–924 | I wanted to rename these to something generic like -fgpu-fp32.... but due to some weird reason aliasing wasn't working. Anyhow, my suggestion is to make that change in separate patch. |
I don't like that this pulls in ockl automatically but don't think that's a blocker. OK on my side, @yaxunl?
Due to the current state of math headers, I was unable to test this patch without ockl. But last time when headers were working, I was actually required to link ockl for a symbol (I forgot the name). I will update once I am able to get the math headers work again.
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
831–860 | I think we'd better absorb this part into the newly added function getCommonDeviceLibOptions so that we have a centralized location for determining device libs. We could use offload kind of the toolchain to differentiate between OpenCL/HIP/OpenMP. |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
831–860 | getCommonBitcodeLibs is called by opencl with some other set of constraints around argument names. Persuading opencl to use the same arguments, getting rid of some of the files, doing things with aliasing, or however else we want to dice this problem is separable from linking the bitcode into openmp and can be left for a later patch. Using a common path for HIP and OpenMP seems a step in the right direction. It might take quite a long time to reach consensus on how to deduplicate the two remaining copies, which I'd guess is why they were copy/pasted to begin with. |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
831–860 | We do not need to use shared options for HIP/OpenMP/OpenCL. We could use if/else to use different options for HIP/OpenCL. I am suggesting this because this part of code is largely the same as getCommonDeviceLibOptions, except the option names. Also the name getCommonDeviceLibOptions indicates that is a common function for all languages, otherwise it should be renamed as getCommonDeviceLibOptionsForHIPAndOpenMP | |
clang/lib/Driver/ToolChains/AMDGPU.h | ||
141 | Maybe rename it as getCommonDeviceLibNames since it returns the bc file names instead of options. Pls add a comment like 'returns a list of device library names shared by different languages". |
clang/lib/Driver/ToolChains/AMDGPU.cpp | ||
---|---|---|
831–860 | If it is out of scope for this patch I am OK to leave this part out. I can create a patch to refactor this part later. |
I think this is good enough for now, assuming the majority of OvO is running locally. More cleanups to do after landing.
It's been pointed out to me that -lm is a linker flag so it's weird to require it at compile time. I haven't thought of a good fix for that yet.
We don't need to splice in ocml for each compilation unit, so can move the test+splice into the link phase, except that amdgcn doesn't have a very well defined link phase as it keeps everything in IR. We could look for -c or equivalent, i.e. only check for ocml when building an executable, and not when compiling a translation unit to IR.
Maybe rename it as getCommonDeviceLibNames since it returns the bc file names instead of options.
Pls add a comment like 'returns a list of device library names shared by different languages".