This is an archive of the discontinued LLVM Phabricator instance.

[HIP] Do not call opt/llc for -fno-gpu-rdc
ClosedPublic

Authored by yaxunl on Jun 10 2020, 6:54 PM.

Details

Summary

Currently HIP toolchain calls clang to emit bitcode then calls opt/llc for device compilation for the default -fno-gpu-rdc
case, which is unnecessary since clang is able to compile a single source file to ISA.

This patch fixes the HIP action builder and toolchain so that the default -fno-gpu-rdc can be done like a canonical
toolchain, i.e. one clang -cc1 invocation to compile source code to ISA.

This can avoid unnecessary processes to speed up the compilation, and avoid redundant LLVM passes which are
performed in clang -cc1 and opt.

This patch does not remove opt/llc in -fgpu-rdc case since device linking is still needed whereas
amdgpu backend does not support ISA linking for now.

Diff Detail

Event Timeline

yaxunl created this revision.Jun 10 2020, 6:54 PM
tra added a comment.Jun 11 2020, 12:00 PM

Looks OK in general. I'm happy to see reduced opt/llc use.

You may want to get someone more familiar with the AMD GPU compilation process to double-check that the compilation pipeline still does the right thing.

clang/lib/Driver/Driver.cpp
2730–2731

The comment about "create link action" should probably be moved down below to where the link action is constructed now.

2732–2737

Looks like we're chaining backend/assembly actions here. but it's not easy to spot that we use CudaDeviceActions[I] as an intermediate value. At the first glance it looked like a copy/paste error writing to CudaDeviceActions[I] multiple times.

It would be easier to see what's going on if the code was structured like this:

BackendAction = Construct(... CudaDeviceActions[I]);
AssembleAction  = Construct(... BackendAction);
AL.push_back(AssembleAction)
CudaDeviceActions[I] = C.MakeAction<LinkJobAction>(AL);
arsenm added a subscriber: arsenm.Jun 11 2020, 2:15 PM

It doesn't matter if we don't support isa linking. We should just use clang and default to -flto. LTO "just works" as is

It doesn't matter if we don't support isa linking. We should just use clang and default to -flto. LTO "just works" as is

This is a step forward, but the lack of ISA linking shouldn't block eliminating the use of llc/opt

It doesn't matter if we don't support isa linking. We should just use clang and default to -flto. LTO "just works" as is

This is a step forward, but the lack of ISA linking shouldn't block eliminating the use of llc/opt

Agree. Will fix -fgpu-rdc in a different patch.

yaxunl marked 4 inline comments as done.Jun 11 2020, 9:39 PM
yaxunl added inline comments.
clang/lib/Driver/Driver.cpp
2730–2731

will do

2732–2737

will do

yaxunl updated this revision to Diff 270302.Jun 11 2020, 9:40 PM
yaxunl marked 2 inline comments as done.
yaxunl added a reviewer: ashi1.

revised by Artem's comments.

tra accepted this revision.Jun 12 2020, 11:58 AM

LGTM. Good to go if @arsenm is OK with fixing -fgpu-rdc in a separate patch.

This revision is now accepted and ready to land.Jun 12 2020, 11:58 AM
In D81627#2090499, @tra wrote:

LGTM. Good to go if @arsenm is OK with fixing -fgpu-rdc in a separate patch.

@arsenm Are you OK with fixing -fgpu-rdc in a separate patch? The fix for that is orthogonal to the current patch. Mixing them together will clutter things up. Also the need for -fgpu-rdc is not so urgent since only rccl needs it. Requesting this patch to fix -fgpu-rdc will delay fixing issues for all other math libs and frameworks.

ashi1 accepted this revision.Jun 15 2020, 11:42 AM

LGTM, thank you for splitting the cuda-phases from hip-phases, and opening up a follow-up patch for RDC case.

arsenm accepted this revision.Jun 15 2020, 11:56 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2020, 4:03 PM