This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Add NVVM to CUBIN conversion to mlir-opt
ClosedPublic

Authored by csigg on Mar 8 2021, 11:26 AM.

Details

Summary

If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt.

The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner.

Depends On D98279

Diff Detail

Event Timeline

csigg created this revision.Mar 8 2021, 11:26 AM
csigg requested review of this revision.Mar 8 2021, 11:26 AM
rriddle requested changes to this revision.Mar 8 2021, 11:30 AM
rriddle added inline comments.
mlir/include/mlir/Conversion/GPUToCUBIN/GPUToCUBINPass.h
1 ↗(On Diff #329078)

The Conversion/ directory is intended for dialect->dialect conversions, that doesn't seem to be the case here.

This revision now requires changes to proceed.Mar 8 2021, 11:30 AM
csigg added inline comments.Mar 8 2021, 11:55 AM
mlir/include/mlir/Conversion/GPUToCUBIN/GPUToCUBINPass.h
1 ↗(On Diff #329078)

This change registers the pre-existing GpuKernelToBlobPass pass in mlir/lib/Conversion/GPUCommon/ConvertKernelFuncToBlob.cpp. Where do you think this registration should live, and should the pass move as well?

mehdi_amini added inline comments.
mlir/lib/Conversion/GPUToCUBIN/LowerGPUToCUBIN.cpp
96 ↗(On Diff #329078)

I don't think this safe to do in a multi-threaded environment.

I suspect we're missing an assert for this in the context @ftynse?

mlir/test/Integration/GPU/CUDA/shuffle.mlir
1–3

I think this is a leftover debugging option

rriddle added inline comments.Mar 8 2021, 12:01 PM
mlir/include/mlir/Conversion/GPUToCUBIN/GPUToCUBINPass.h
1 ↗(On Diff #329078)

The pass looks like a much more natural fit for the Target/ directory, given that it is translating an MLIR module to an external format.

mehdi_amini added inline comments.Mar 8 2021, 8:56 PM
mlir/include/mlir/Conversion/GPUToCUBIN/GPUToCUBINPass.h
1 ↗(On Diff #329078)

Not exactly: it takes MLIR and generates MLIR, it only takes a subset of the IR and turn it into a CUBIN serialized into the IR :)

I suspect it may better be a GPU/Transforms pass?

switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner

If mlir-cpu-runner can also run things on the GPUs, it's name should be changed (to mlir-runner?). Or there could be an mlir-gpu-runner in the interest of reducing the link time for those who may just want to run on the CPU and not link in many other passes and dependences?

csigg planned changes to this revision.Mar 9 2021, 11:40 AM
csigg added inline comments.
mlir/include/mlir/Conversion/GPUToCUBIN/GPUToCUBINPass.h
1 ↗(On Diff #329078)

Thanks River and Mehdi. I will move it to GPU/Transforms. For that, I would like to first land a prep revision.

mlir/lib/Conversion/GPUToCUBIN/LowerGPUToCUBIN.cpp
96 ↗(On Diff #329078)

Good point. I will change GpuKernelToBlobPass to a base class (and move it to GPU/Transforms), so that we can add those as dependent dialects.

mlir/test/Integration/GPU/CUDA/shuffle.mlir
1–3

Indeed. Thanks for spotting it.

mehdi_amini added inline comments.Mar 9 2021, 12:03 PM
mlir/lib/Conversion/GPUToCUBIN/LowerGPUToCUBIN.cpp
96 ↗(On Diff #329078)

We can't rely on "dependent dialects": these registration are separate from the dialect registration.

I would suggest using the Pass initialization that is invoked at the beginning of the pipeline before the pass manager runs for this purpose.

csigg updated this revision to Diff 329565.Mar 10 2021, 1:01 AM

Use base class added in D98279.

csigg edited the summary of this revision. (Show Details)Mar 10 2021, 1:03 AM
csigg updated this revision to Diff 329650.Mar 10 2021, 6:37 AM

Rebase, cmake fixes.

csigg added a comment.Mar 10 2021, 6:43 AM

This change should be ready for review again.

mlir/lib/Conversion/GPUToCUBIN/LowerGPUToCUBIN.cpp
96 ↗(On Diff #329078)

Thanks Mehdi. Does the reworked implementation look correct?

mehdi_amini accepted this revision.Mar 10 2021, 9:43 AM

LGTM, but please give a chance to @herhut to have a general look since he reviewed the previous revisions in this series!

mlir/lib/Conversion/GPUToCUBIN/LowerGPUToCUBIN.cpp
96 ↗(On Diff #329078)

I had another hook in mind ( https://mlir.llvm.org/docs/PassManagement/#initialization ) but what you wrote is just equally fine in this case I think!

rriddle accepted this revision.Mar 10 2021, 10:27 AM
rriddle added inline comments.
mlir/include/mlir/Dialect/GPU/Passes.h
89

nit: ///

This revision is now accepted and ready to land.Mar 10 2021, 10:27 AM
csigg updated this revision to Diff 329864.Mar 11 2021, 12:56 AM

Fix nit.

herhut accepted this revision.Mar 11 2021, 1:03 AM

Thanks!

csigg edited the summary of this revision. (Show Details)Mar 11 2021, 1:06 AM
This revision was landed with ongoing or failed builds.Mar 11 2021, 1:07 AM
This revision was automatically updated to reflect the committed changes.