This is an archive of the discontinued LLVM Phabricator instance.

[mlir][gpu] Add GPU target support to `gpu-to-llvm`.
ClosedPublic

Authored by fmorac on Jun 29 2023, 2:04 PM.

Details

Summary

For an explanation of these patches see D154153.

This patch modifies the lowering of gpu.module & gpu.launch_func in the gpu-to-llvm pass,
allowing the usage of the new GPU compilation mechanism in the patch series ending in D154153.

Instead of removing Modules, this patch preserves the module if it has target attributes so that the
gpu-module-to-binary pass can later serialize them.

Instead of lowering the kernel calls to the LLVM dialect, this patch primarily updates the operation's
arguments, leaving the job of converting the operation into LLVM instructions to the translation stage.
The reason for not lowering the operation to LLVM at this stage is that kernel launches do not have a
single one-to-one representation in LLVM. For example, a kernel launch can be represented by a call
to a kernel stub, like in CUDA or HIP.
Kernel launches are also intrinsically linked to the binary associated with the call, and the binaries are
converted during translation.

Depends on D154149

Diff Detail

Event Timeline

fmorac created this revision.Jun 29 2023, 2:04 PM
Herald added a reviewer: dcaballe. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
fmorac updated this revision to Diff 536081.Jun 29 2023, 6:13 PM

Rebasing.

fmorac published this revision for review.Jun 30 2023, 6:30 AM

Can we get a test for the bare pointer calling convention please?

fmorac updated this revision to Diff 543214.Jul 22 2023, 12:05 PM

Added bare pointer test.

fmorac updated this revision to Diff 543215.Jul 22 2023, 12:06 PM

Added new line.

mehdi_amini added a comment.EditedJul 23 2023, 9:24 PM

Can you add context / motivation in the description? The why is often more important than the what there.

(This would actually be valuable for most of the patches in this series)

mehdi_amini added inline comments.Jul 23 2023, 11:51 PM
mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
1090–1091

You created a symbolTable, we should try to take advantage of it everywhere I think?

1135

If you're just change the operands of the op: can't it be done in-place?

mlir/test/Conversion/GPUCommon/lower-launch-func-bare-ptr.mlir
2

Please remove -allow-unregistered-dialect, you may use the test dialect if needed

I think I understand this: this is for backward compatibility where the new mechanism isn't used, and when it is used there is a target attribute and the LLVM translation will be done through the new mechanism.

As mentioned in another patch: this is the kind of context that is helpful to explain in the description.

Can you add context / motivation in the description? The why is often more important than the what there.

(This would actually be valuable for most of the patches in this series)

Ok, I'll do it for the patch series.

I think I understand this: this is for backward compatibility where the new mechanism isn't used, and when it is used there is a target attribute and the LLVM translation will be done through the new mechanism.
As mentioned in another patch: this is the kind of context that is helpful to explain in the description.

Yes, the changes in this patch preserve the old mechanism while introducing the new one. The idea is to deprecate the old one once this gets approval, and then patch the LaunchFuncOP pattern to only use the new one.

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
1090–1091

Yes, I'll change it.

1135

I think so? I don't remember if the in-place update gave me issues. In any case, I'll change it or say what was the issue.

mlir/test/Conversion/GPUCommon/lower-launch-func-bare-ptr.mlir
2

Ok, yeah this was a test I copied & modified for testing the bare-ptr convention, so I presume it was an old test. I'll change it.

fmorac updated this revision to Diff 547749.Aug 7 2023, 5:49 AM
fmorac marked 2 inline comments as done.
fmorac edited the summary of this revision. (Show Details)

Removed the flag allow-unregistered-dialect & added the option for reusing the symbol table.

mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
1135

I'm not sure the launch_func op can be updated safely in place, as in some instances I change the number of results of the op. (I have to get rid of the async token as it's illegal),

mehdi_amini accepted this revision.Aug 7 2023, 7:51 PM

Please update the commit description to provide better context as discussed before

This revision is now accepted and ready to land.Aug 7 2023, 7:51 PM
fmorac edited the summary of this revision. (Show Details)Aug 9 2023, 7:00 PM
fmorac updated this revision to Diff 549335.Aug 11 2023, 4:10 AM

Rebasing.

This revision was automatically updated to reflect the committed changes.