This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Pass AMDGPU math libraries into the linker wrapper
ClosedPublic

Authored by jhuber6 on Feb 15 2022, 6:59 AM.

Details

Summary

This patch passes in the AMDPGU math libraries to the linker wrapper.
The wrapper already handles linking OpenMP bitcode libraries via the
--target-library option. This should be sufficient to link in math
libraries for the accompanying architecture.

Fixes #53526.

Diff Detail

Event Timeline

jhuber6 created this revision.Feb 15 2022, 6:59 AM
jhuber6 requested review of this revision.Feb 15 2022, 6:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 15 2022, 6:59 AM

I can't test this yet because the AMDGPU cluster is down for maintenance.

jdoerfert added inline comments.Feb 15 2022, 8:14 AM
clang/lib/Driver/ToolChains/Clang.cpp
8203

Can you elaborate on this comment, what's bad, how would the better version look

jhuber6 added inline comments.Feb 15 2022, 8:15 AM
clang/lib/Driver/ToolChains/Clang.cpp
8203

It's explained in more detail where this is done for the AMDGPU ToolChain, e.g.

// This is not certain to work. The device libs added here, and passed to    
// llvm-link, are missing attributes that they expect to be inserted when    
// passed to mlink-builtin-bitcode. The amdgpu backend does not generate    
// conservatively correct code when attributes are missing, so this may    
// be the root cause of miscompilations. Passing via mlink-builtin-bitcode    
// ultimately hits CodeGenModule::addDefaultFunctionDefinitionAttributes    
// on each function, see D28538 for context.    
// Potential workarounds:    
//  - unconditionally link all of the device libs to every translation    
//    unit in clang via mlink-builtin-bitcode    
//  - build a libm bitcode file as part of the DeviceRTL and explictly    
//    mlink-builtin-bitcode the rocm device libs components at build time    
//  - drop this llvm-link fork in favour or some calls into LLVM, chosen    
//    to do basically the same work as llvm-link but with that call first    
//  - write an opt pass that sets that on every function it sees and pipe    
//    the device-libs bitcode through that on the way to this llvm-link

Should I copy the gist here?

jdoerfert added inline comments.
clang/lib/Driver/ToolChains/Clang.cpp
8203

Is it still relevant? We don't use llvm-link here, do we?

@arsenm, the backend is (almost) OK with the lack of attributes, is it not?

8213

I'd switch the conditions.

More importantly, does this require that the user passes -lm to the linker invocation? I'm not convinced we should not always link these in.

jhuber6 added inline comments.Feb 15 2022, 11:58 AM
clang/lib/Driver/ToolChains/Clang.cpp
8203

Linking is done using LTO now, I don't know exactly how they merge bitcode compared to llvm-link but I'm assuming it's similar.

8213

Yes, would save some time assuming most codes are C++

So I figured I'd copy the same semantics of how -lm works where you need to specify it for C but not C++. We could just pass this in all the time, but since linking it in currently required -lm I copied that.

I'm not confident it's only attributes that we rely on mlink-builtin-bitcode patching up, that could be poor phrasing on my part.

Making the pipeline robust to underspecified IR input may be easier (and arguably more useful) than changing the rocm library to be compiled for a given target.

clang/lib/Driver/ToolChains/Clang.cpp
8213

re: always linking these libraries in, regardless of -lm, it's probably better to link them by default and effectively ignore -lm.

I'd like to keep it guarded by -nogpulib or similar so that people can still opt out of the rocm device library stack entirely.

jdoerfert added inline comments.Feb 15 2022, 1:34 PM
clang/lib/Driver/ToolChains/Clang.cpp
8213

Opting out of "optional" gpu libraries is fine. I want to avoid that one needs to add linking flags etc. for OpenMP. We link libdevice by default too, so this is no different (as both contain math function impl etc.)

jhuber6 updated this revision to Diff 409043.Feb 15 2022, 1:36 PM

Removing -lm semantics so it is linked by default. Also adding the flag to the
tested output.

jhuber6 edited the summary of this revision. (Show Details)Feb 16 2022, 6:09 AM
clang/lib/Driver/ToolChains/Clang.cpp
8213

I guess that hangs on whether libm is optional or not.

I am reluctant to tie every openmp execution to the rocm libdevice as it frequently fails to build with llvm trunk and I'm nervy about linking IR built by the rocm clang with IR generated from trunk. It mostly seems to work for rocm sufficiently similar in age to trunk but periodically the abi or intrinsic set diverges.

Maybe our working assumption should be that everyone using openmp wants libm so rocm breaking trunk openmp has to be resolved each time it happens anyway. If so, there's a design decision to be made about how to manage that dependency.

jhuber6 added inline comments.Feb 16 2022, 6:36 AM
clang/lib/Driver/ToolChains/Clang.cpp
8213

I tested this on Spock and it resolves linking math libraries. I'm not sure about overall stability however. I'm assuming this is sufficient until we get a real libdevice for AMDGPU.

arsenm added inline comments.Feb 16 2022, 6:43 AM
clang/lib/Driver/ToolChains/Clang.cpp
8203

It can be subtley wrong, and also pushes reliance on setting the global subtarget. It's best to ensure the default attributes are propagated to the device libraries

jdoerfert accepted this revision.Feb 16 2022, 8:14 AM

LG, with or without the -lm lookup

clang/lib/Driver/ToolChains/Clang.cpp
8203

We will build a better solution for this soon, for now this is "good enough".

This revision is now accepted and ready to land.Feb 16 2022, 8:14 AM

Hi Joseph,
seems to have broken amdgpu buildbit,
please revert and fix ?
https://lab.llvm.org/buildbot/#/builders/193/builds/6939

This is asserting somewhere in clang-linker-wrapper, e.g. https://lab.llvm.org/buildbot/#/builders/193/builds/6939

This is asserting somewhere in clang-linker-wrapper, e.g. https://lab.llvm.org/buildbot/#/builders/193/builds/6939

Yes, must be when we try to sort the device files. I'll revert for now since I don't have time to fix it until later.

This revision is now accepted and ready to land.Feb 16 2022, 8:56 AM