We support standalone compilation for the NVPTX architecture using
'nvlink' as our linker. Because of the special handling required to
transform input files to cubins, as nvlink expects for some reason, we
didn't use the standard AddLinkerInput method. However, this also
meant that we weren't forwarding options passed with -Wl to the
linker. Add this support in for the standalone toolchain path.
Details
- Reviewers
JonChesterfield tra yaxunl MaskRay
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/lib/Driver/ToolChains/Cuda.cpp | ||
---|---|---|
594 | No, thanks that was from when I originally tried to use AddLinkerInput but it didn't work because of the cubin thing. |
Somewhat annoying, I've discovered that LLVM adds -Wl,-fcolor-diagnostics which obviously isn't supported by nvlink so it fails while including this in libc's CMake. Any clue if there's a way to work around that?
The main reason I made this patch was to allow passing --suppress-stack-size-warning to nvlink. But it turns out it's a little more difficult there.
I guess the options are to either filter out the automatically added option or to avoid adding that particular argument if we know that the target is NVPTX. The latter would probably be preferable as there would be only one place where the decision is made.
The latter is a little difficult, the logic adds it based off of the host linker, but we explicitly override the host triple when we build via --target=. So there's be no way to turn it off in LLVM unless it's a blanket check on building libc. And since it's a global flag I can't just disable it only for the target. So I think the options are, either filter it out manually here or make a new flag called -Xcuda-nvlink, which I wouldn't like to do.
Putting up the hack that works around my problem with libc. Definitely not a good solution though.
The latter is a little difficult,
The more we dig, the more we want GPU-capable lld. :-)
clang/lib/Driver/ToolChains/Cuda.cpp | ||
---|---|---|
641 | Can there ever be more than one value returned by II.getInputArg().getValues()? If so, we probably don't want to skip all of them if one of them is --color-diagnostics. We may want to ignore only singleton --color-diagnostics and let all other combinations error out. |
My thoughts exactly. I had a small chat with @MaskRay about how difficult it would be to spin up support for NVPTX. But it would probably be a reasonably large project, and considering who I work for would be difficult for me to do it as more than a hobby.
clang/lib/Driver/ToolChains/Cuda.cpp | ||
---|---|---|
641 | Yeah, you can do -Wl,arg1,arg2,arg3. This was just because I couldn't think of an easy way to separate them out, considering that we rely on renderAsInput we'd need to create an entirely new arg. Which is doable, but I wasn't sure if it was worth the effort. |
I've discovered that LLVM adds -Wl,-fcolor-diagnostics
Can you tell me where it's done?
I do not think that we should work around this particular source of options in clang driver.
This sounds like something that may need to be dealt with in cmake.
The root cause is that cmake assumes that if the linker accepts the flag for target X, it will accept that flag for target Y. Or that the flags will be used only for compiling for the default target. Considering that clang is a cross-compiler, neither of the assumptions is universally true.
We may need to add a concept of per-offload-arch options that would be checked with specific --target=.... It's a bigger can of worms than just filtering the argument out, but I think we'll need to deal with it sooner or later anyways.
As a short-term stop-gap solution, I would suggest adding a cmake knob to disable linker color output altogether. This should unblock you and would not affect anybody else until we have a better fix.
This might have to do something with
# Handle common options used by all runtimes. include(AddLLVM)
in runtimes/CMakeLists.txt
which is totally wrong in my opinion.
In a LLVM_ENABLE_PROJECTS build (not sure about LLVM_ENABLE_RUNTIMES build) AddLLVM determines flags for building LLVM itself, not for building runtime libraries.
The results of the tests of the host toolchain therefore affect the invocation of the target toolchain.
I came across this once in https://reviews.llvm.org/D146920#4240370
Is removal of this line intentional?