This is an archive of the discontinued LLVM Phabricator instance.

Add HIP toolchain
ClosedPublic

Authored by yaxunl on Apr 3 2018, 9:27 AM.

Details

Summary

This patch adds HIP toolchain to support HIP language mode. It includes:

Create specific compiler jobs for HIP.

Choose specific libraries for HIP.

With contribution from Greg Rodgers.

Diff Detail

Event Timeline

yaxunl created this revision.Apr 3 2018, 9:27 AM
yaxunl edited the summary of this revision. (Show Details)Apr 3 2018, 9:31 AM
yaxunl updated this revision to Diff 140827.Apr 3 2018, 10:51 AM
yaxunl retitled this revision from [WIP][HIP] Let CUDA toolchain support HIP language mode to [HIP] Let CUDA toolchain support HIP language mode.
yaxunl edited the summary of this revision. (Show Details)

Fixed typo which causes lit test failures. Now all lit tests pass.

The other changes I see here seem reasonable, but please do split the patch.

include/clang/Basic/Cuda.h
61 ↗(On Diff #140827)

Does this actually have anything to do with HIP? You have a lot of changes in this patch which seem to just be about supporting more GPU revisions.

include/clang/Driver/Options.td
562

Do we absolutely need the non-CUDA-related aliases here? We generally try to be good about namespacing extension-specific language options.

I understand that you're probably trying to maintain command-line compatibility with some existing toolchain, but if it's possible to avoid this, I would be much happier.

include/clang/Driver/ToolChain.h
124 ↗(On Diff #140827)

"Backend" is a really generic name for this thing that is probably hyper-specific to the CUDA translation model.

yaxunl retitled this revision from [HIP] Let CUDA toolchain support HIP language mode to [HIP] Let CUDA toolchain support HIP language mode and amdgpu.Apr 4 2018, 7:29 AM
yaxunl marked an inline comment as done.Apr 4 2018, 7:50 AM
yaxunl added inline comments.
include/clang/Basic/Cuda.h
61 ↗(On Diff #140827)

This patch not only adds support of HIP language mode but also adds support of amdgpu to CUDA toolchain.

Currently HIP extension is only supported by amdgpu although in the future it may be supported by other targets.

include/clang/Driver/Options.td
562

There were discussions about a uniform clang option for offloading sub-arcs

http://lists.llvm.org/pipermail/llvm-dev/2017-February/109896.html

the consensus seem to be -offload-arch or -offload-archs.

This patch attempts to make that transition to use this new option.

We can separate this change to a different patch though.

include/clang/Driver/ToolChain.h
124 ↗(On Diff #140827)

Agree. This tool actually links bunch of bitcode libraries (so called device libraries). For non-GPU targets, this is usually unnecessary since they support ISA-level linking. However most GPU targets do not support that, therefore they need this stage.

How about renaming it as BitcodeLink?

rjmccall added inline comments.Apr 4 2018, 9:10 AM
include/clang/Basic/Cuda.h
61 ↗(On Diff #140827)

I understand that, but I think you can separate those two patches without too much difficulty.

include/clang/Driver/Options.td
562

I don't mind the -offload-* options; I'm more concerned about --host-only and so on.

include/clang/Driver/ToolChain.h
124 ↗(On Diff #140827)

DeviceLibraryLink, maybe? I wouldn't want someone to think this was related to ordinary LTO.

Hahnfeld added inline comments.
include/clang/Basic/Cuda.h
61 ↗(On Diff #140827)

There already is D42800 with lots of unanswered comments, so those need to be addressed first.

yaxunl marked 4 inline comments as done.Apr 4 2018, 10:26 AM
yaxunl added inline comments.
include/clang/Basic/Cuda.h
61 ↗(On Diff #140827)

Will split the changes for supporting amdgcn to another review.

61 ↗(On Diff #140827)

Will address those comments in the split review.

include/clang/Driver/Options.td
562

Will drop those options for now. Since HIP is just a language mode of CUDA, it is OK to use CUDA options e.g. 'cuda-device-only' for HIP.

include/clang/Driver/ToolChain.h
124 ↗(On Diff #140827)

That sounds good. Will change to that.

yaxunl updated this revision to Diff 141221.Apr 5 2018, 2:18 PM
yaxunl marked an inline comment as done.
yaxunl edited the summary of this revision. (Show Details)

Revised by reviewers' comments, including comments from previous review.

This looks okay to me, but I think it would better if someone with more expertise in the design of the driver and frontend code could review this.

lib/Driver/Driver.cpp
2204

The extra parens are weird here.

Can this revision be split further? The summary mentions many things that might make up multiple independent changes...

lib/Driver/ToolChains/Cuda.cpp
263 ↗(On Diff #141221)

Will this override the user's value, e.g. -std=c++14?

yaxunl marked 2 inline comments as done.Apr 9 2018, 9:42 AM
yaxunl added inline comments.
lib/Driver/Driver.cpp
2204

will remove it.

lib/Driver/ToolChains/Cuda.cpp
263 ↗(On Diff #141221)

No. We will add diagnotics in a separate patch.

yaxunl updated this revision to Diff 141863.Apr 10 2018, 8:48 AM
yaxunl marked 2 inline comments as done.
yaxunl edited the summary of this revision. (Show Details)

Separate file type changes to another patch.

Can this revision be split further? The summary mentions many things that might make up multiple independent changes...

I separated file type changes to https://reviews.llvm.org/D45489 and deferred some other changes for future.

It is kind of difficult to split this patch further since they depend on each other.

Can this revision be split further? The summary mentions many things that might make up multiple independent changes...

I separated file type changes to https://reviews.llvm.org/D45489 and deferred some other changes for future.

It is kind of difficult to split this patch further since they depend on each other.

You can add Related Revisions to make this easy to see.

rjmccall accepted this revision.Apr 13 2018, 12:16 AM

I'd still prefer if someone with more driver-design expertise weighed in, but we might not have any specialists there.

LGTM, although you might consider changing your tests a bit: FileCheck recently added support for a -D argument where you can predefine variables in the command line. But that's just a suggestion, not something I'm asking you to do as part of review.

This revision is now accepted and ready to land.Apr 13 2018, 12:16 AM

I'd still prefer if someone with more driver-design expertise weighed in, but we might not have any specialists there.

I think you should at least give @tra the possibility to take a look. Last time we essentially ended up reverting a patch.

tra requested changes to this revision.Apr 17 2018, 4:30 PM
tra added a subscriber: jlebar.

Sorry about the delay. I've been out for few days.

include/clang/Driver/Options.td
561–562

The discussion you've mentioned appears to be unfinished. @jlebar has raised a valid point regarding -no-offload-arch[s] and his email went unanswered AFAICT.

If you propose to generalize a way to specify offload targets, it should probably be done in a separate patch.

I'd just stick with --cuda-gpu-arch for the time being until we figure out how we're going to handle target specification in general. It works well enough for the moment and the new options are not required for the functionality implemented by this patch.

lib/Driver/Driver.cpp
2289–2297

You really want to either have your own error message or change existing one to something more generic. unsupported use of NVPTX for host compilation. will sound very confusing tof someone who tries to compile for AMD GPU.

3462

Would it make sense for HIP to have its own offloading kind? You seem to be adding similar checks in few other places.

3607–3609

llvm::sys::path::extension ?

4035–4042

I'm not sure why we do this here. Nor does it seem relevant to HIP.

4107–4108

Same here -- I'm not sure why we need this nor how it's related to HIP.

lib/Driver/ToolChains/Cuda.cpp
352 ↗(On Diff #141863)

L*L*C_OUTPUT ?

356 ↗(On Diff #141863)

Please use routines in`llvm::sys::path` here and in other places where you manipulate paths.

360–363 ↗(On Diff #141863)

You could use .append({...})

CmdArgs2.append({"foo","bar","buz"});
436–439 ↗(On Diff #141863)

This is rather awkward -- you're looking for /libdevice under all paths specified by -L, but there's no way to explicitly point to the directory with the bitcode library. If device library may be in a non-canonical location, I'd rather there was an explicit option to specify it. Furthermore, my understanding is that you will need to find these bitcode libraries during -c compilation. Using -L to derive bitcode search path during compilation looks wrong to me.

445 ↗(On Diff #141863)

This (and other places where you're calculating libdevice path relative to driver dir) should probably be derived from the resource dir. Driver's path may not be the 'root' the compiler has been told to use.

447 ↗(On Diff #141863)

LIBRARY_PATH sounds rather generic. Perhaps it should have HIP somewhere in its name.

462 ↗(On Diff #141863)

Why do we need to silence the warnings?

498–501 ↗(On Diff #141863)

This does not look like the right place to disable particular passes. Shouldn't it be done somewhere in LLVM?

This revision now requires changes to proceed.Apr 17 2018, 4:30 PM
yaxunl marked 21 inline comments as done.Apr 19 2018, 7:25 PM
yaxunl added inline comments.
include/clang/Driver/Options.td
561–562

Will use --cuda-gpu-arch for this patch.

lib/Driver/Driver.cpp
2289–2297

Will generalise this error message.

3462

Yes it makes sense to let HIP have its own offloading kind. Will do.

3607–3609

will do

4035–4042

will remove it

4107–4108

will remove

lib/Driver/ToolChains/Cuda.cpp
352 ↗(On Diff #141863)

will change to LLC_OUTPUT

356 ↗(On Diff #141863)

will do

360–363 ↗(On Diff #141863)

will do

436–439 ↗(On Diff #141863)

Will use --hip-device-lib-path= and drop /libdevice.

445 ↗(On Diff #141863)

will remove this

447 ↗(On Diff #141863)

will use HIP_DEVICE_LIB_PATH

462 ↗(On Diff #141863)

will remove this

yaxunl added inline comments.Apr 19 2018, 7:25 PM
lib/Driver/ToolChains/Cuda.cpp
498–501 ↗(On Diff #141863)

These are not disabling the passes but adding these passes. They are some optimizations which are usually improving performance for amdgcn.

yaxunl updated this revision to Diff 143218.Apr 19 2018, 7:54 PM
yaxunl marked 14 inline comments as done.

Revised by Artem's comments.

yaxunl updated this revision to Diff 143298.Apr 20 2018, 5:54 AM
yaxunl edited the summary of this revision. (Show Details)

sync to ToT.

yaxunl updated this revision to Diff 143307.Apr 20 2018, 6:27 AM

minor bug fix: do not add CUDA specific link options for HIP.

yaxunl marked an inline comment as done.Apr 20 2018, 8:26 AM
yaxunl added inline comments.
lib/Driver/ToolChains/Cuda.cpp
498–501 ↗(On Diff #141863)

Since opt is now able to adjust passes based on -mtriple and -mcpu, will remove these manually added passes and add -mtriple and -mcpu instead.

yaxunl updated this revision to Diff 143320.Apr 20 2018, 8:29 AM
yaxunl marked an inline comment as done.

Remove manually added passes from opt and add -mtriple.

yaxunl marked 14 inline comments as done.Apr 26 2018, 4:08 AM

ping

yaxunl updated this revision to Diff 145297.May 4 2018, 2:19 PM

clean up code and separate action builder to another review.

Hi Artem,

I've addressed your comments. Any further changes are needed? Thanks.

tra added a comment.May 18 2018, 3:17 PM

Hi,

Sorry about the long silence. I'm back to continue the reviews. I'll handle what I can today and will continue with the rest on Tuesday.

It looks like patch description needs to be updated:

Use clang-offload-bindler to create binary for device ISA.

I don't see anything related to offload-bundler in this patch any more.

include/clang/Driver/Options.td
586

I'm not sure about i_Group? This will cause this option to be passed to all preprocessor jobs. It will also be passed to host and device side compilations, while you probably only want/need it on device side only.

lib/Driver/ToolChains/Cuda.cpp
323 ↗(On Diff #145297)

FullName is already result of Args.MakeArgString. You only need to do it once.

329 ↗(On Diff #145297)

This function is too large to easily see that we're actually constructing sequence of commands.
I'd probably split construction of individual tool's command line into its own function.

336 ↗(On Diff #145297)

No need for the leading space in the message.

344–345 ↗(On Diff #145297)

for (const InputInfo &it : Inputs) ?

350 ↗(On Diff #145297)

All-caps name looks like a macro. Rename to GfxName ?

354–359 ↗(On Diff #145297)
for (path : Args.getAllArgValues(...)) {
   LibraryPaths.push_back(Args.MakeArgString(path));
}
375–378 ↗(On Diff #145297)

This is somewhat unreadable. Perhaps you could construct the name in a temp variable.

384 ↗(On Diff #145297)

You don't need to use c_str() for MakeArgString. It will happily accept std::string.

394 ↗(On Diff #145297)

BitcodeOutputFile?

417 ↗(On Diff #145297)

I think you can get rid of the temp var here without hurting readability.

420 ↗(On Diff #145297)

I wonder if we could derive temp file name from the input's name. This may make it easier to find relevant temp files if/when we need to debug the compilation process.

422 ↗(On Diff #145297)

No need for c_str() here.

439 ↗(On Diff #145297)

c_str(), again.

764 ↗(On Diff #145297)

I'd just add something like this and leave existing if unchanged:

// There's no need for CUDA-specific bitcode linking with HIP.
if( DeviceOffloadingKind == Action::OFK_HIP)
  return;
926–938 ↗(On Diff #145297)

All these should be in the derived toolchain.

lib/Driver/ToolChains/Cuda.h
137–138 ↗(On Diff #145297)

Where does amdgcn-link come from? Does it accept --options-file ?

153–154 ↗(On Diff #145297)

You may want to derive your own HIPToolChain as you're unlikely to reuse NVPTX-specific tools.

172 ↗(On Diff #145297)

In HIPToolchain this one would become an inline function returning true.

tra added a comment.May 18 2018, 3:51 PM

One more thing -- it would be really good to add some tests to make sure your commands are constructed the way you want.

yaxunl marked 19 inline comments as done.May 23 2018, 7:59 AM
In D45212#1105177, @tra wrote:

Hi,

Sorry about the long silence. I'm back to continue the reviews. I'll handle what I can today and will continue with the rest on Tuesday.

It looks like patch description needs to be updated:

Use clang-offload-bindler to create binary for device ISA.

I don't see anything related to offload-bundler in this patch any more.

You are right. Using clang-offload-bundler to create binary for device ISA has been moved to another patch. Will update the description of this patch.

In D45212#1105282, @tra wrote:

One more thing -- it would be really good to add some tests to make sure your commands are constructed the way you want.

will do

include/clang/Driver/Options.td
586

will change to Link_Group

lib/Driver/ToolChains/Cuda.cpp
323 ↗(On Diff #145297)

will fix

329 ↗(On Diff #145297)

will do

336 ↗(On Diff #145297)

will fix.

344–345 ↗(On Diff #145297)

will fix

350 ↗(On Diff #145297)

will fix

354–359 ↗(On Diff #145297)

will fix

375–378 ↗(On Diff #145297)

will do

384 ↗(On Diff #145297)

will fix

394 ↗(On Diff #145297)

will change

417 ↗(On Diff #145297)

will do

420 ↗(On Diff #145297)

will do

422 ↗(On Diff #145297)

will do

439 ↗(On Diff #145297)

will fix

764 ↗(On Diff #145297)

will change

926–938 ↗(On Diff #145297)

will do

lib/Driver/ToolChains/Cuda.h
137–138 ↗(On Diff #145297)

amdgcn-link is the short name of the amdgcn bitcode linker. It is not a real program and does not support response file. Will remove the arguments about response file.

153–154 ↗(On Diff #145297)

Will do

172 ↗(On Diff #145297)

will do

yaxunl updated this revision to Diff 148216.May 23 2018, 8:05 AM
yaxunl marked 19 inline comments as done.
yaxunl retitled this revision from [HIP] Let CUDA toolchain support HIP language mode and amdgpu to Add HIP toolchain.
yaxunl edited the summary of this revision. (Show Details)

Revised by Artem's comments.

tra added inline comments.May 23 2018, 12:13 PM
lib/Driver/ToolChains/HIP.cpp
29–47

FullName may remain uninitialized if LibraryPaths are empty which will probably crash compiler when you attempt to pass it to MakeArgString.
If empty LibraryPaths is not expected there should be an assert.

If the library is not found, we issue an error, but we still proceed to append the FullName to the CmdArgs. I don't think we should do that. FullName will be either NULL or pointing to the last directory in the LibraryPaths.

You seem to be relying on diagnostics to deal with errors and are not using return value of the function. You may as well make it void.

I'd move CmdArgs.push_back(...) under if(::exists(FullName)) and change break to return;
Then you can get rid of FoundLibDevice and just issue the error if we ever reach the end of the function.

79–81

No need for intermediate values here -- just '+' all parts together.

133

Nit: I think explicit llvm::Twine is unnecessary here.

155–160

Nit: THis could be collapsed into ArgStringList LlcArgs({...});

179–181

Same here: ArgStringList LldArgs({"-flavor", "gnu", "--no-undefined", "-shared", "-o"});

212–215

Right now the code is structured as if you're appending to the same TempFile string which is not the case here. I'd give intermediate variables their own names -- OptCommand,LlcCommand.
This would make it easier to see that you are chaining separate commands, each producing its own temp output file.

yaxunl marked 6 inline comments as done.May 23 2018, 1:31 PM
yaxunl added inline comments.
lib/Driver/ToolChains/HIP.cpp
29–47

Will CmdArgs.push_back(...) under if(::exists(FullName)) and change break to return; and change return type to void.

79–81

will do

133

will remove

155–160

will do

179–181

will do

212–215

will do

yaxunl updated this revision to Diff 148277.May 23 2018, 1:34 PM
yaxunl marked 6 inline comments as done.

Revised by Artem's comments.

tra accepted this revision.May 23 2018, 1:42 PM

One small nit. LGTM otherwise.

lib/Driver/ToolChains/HIP.cpp
45

You don't need FoundLibDevice any more as you will always return from inside the loop if it is ever true.

This revision is now accepted and ready to land.May 23 2018, 1:42 PM
yaxunl marked an inline comment as done.May 23 2018, 1:47 PM
yaxunl added inline comments.
lib/Driver/ToolChains/HIP.cpp
45

will remove when committing. Thanks!

This revision was automatically updated to reflect the committed changes.
yaxunl marked an inline comment as done.