This is an archive of the discontinued LLVM Phabricator instance.

[mlir][gpu] Add the `gpu-module-to-binary` pass.
ClosedPublic

Authored by fmorac on Jun 29 2023, 1:53 PM.

Details

Summary

For an explanation of these patches see D154153.

Commit message:
This pass converts GPU modules into GPU binaries, serializing all targets present
in a GPU module by invoking the serializeToObject target attribute method.

Depends on D154147

Diff Detail

Event Timeline

fmorac created this revision.Jun 29 2023, 1:53 PM
fmorac updated this revision to Diff 536080.Jun 29 2023, 6:12 PM

Rebasing.

fmorac edited the summary of this revision. (Show Details)Jun 29 2023, 6:20 PM
fmorac added a reviewer: mehdi_amini.
fmorac published this revision for review.Jun 30 2023, 6:30 AM
krzysz00 added inline comments.
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66

Are these new #define 's I missed in the patch series?

fmorac updated this revision to Diff 543194.Jul 22 2023, 8:11 AM

Rebasing.

mehdi_amini added inline comments.Jul 24 2023, 12:56 AM
mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74

Same as other patches: can this specify the interface instead of "Attribute"?

mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41

The restriction on ModuleOp does not seem necessary to me.

I would let it be an OperationPass and specify that this finds all the GPUModule op immediately present in the attached regions and convert them to GPUBinaryOp holding a serialization of the result of the codegen of the GPUModule based on its target information.

43

It's not useful to repeat the summary, please flesh this out.

44

If the pass runs on a GPUModuleOp present in its input, then this shouldn't be needed.

48

Document please. Do you have tests for it as well?

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
96

Nit: dyn_cast without check for nullptr

120

Pattern aren't obviously appropriate to me.

I would do something a bit more lightweight:

void GpuModuleToBinaryPass::runOnOperation() {
  for (Region &region : getOperation()->getRegion()) {
    for (Block &block : region.getBlocks()) {
      for (auto gpuModuleOP : make_early_inc(block.getOperations<GPUModuleOp>()))
          process(gpuModuleOP);
    }
  }

If we need a public API, it can be one that takes a GPUModuleOp and an OpBuilder and builds a GPUBinaryOp.

mlir/test/Dialect/GPU/module-to-binary.mlir
1

there is no diagnostics to verify here?

fmorac added inline comments.Jul 24 2023, 5:53 AM
mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74

I'll change it.

mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41

I'll change it.

43

I'll update it.

44

I'll remove it.

48

I'll add better documentation.

Tests: Currently no, I think I need to add a method for parsing attributes as options for this to work as a cmd option, it was added here for having the option on C++. I can create a patch for that (cmd parsing of attributes) and the modify this.

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66

These are introduced in the CMakeList.txt, I added them to not reuse the ones from the existing pipeline.

96

I'll change it.

120

Yeah, the reason for doing patterns was just to expose it to be used in other passes. But I can change it,

mlir/test/Dialect/GPU/module-to-binary.mlir
1

You're right, I'll remove it.

fmorac updated this revision to Diff 547745.Aug 7 2023, 5:44 AM
fmorac marked 8 inline comments as done.

Updated tests, changed the Pass to an OperationPass and switched from patterns to looking for nested Modules.

mehdi_amini accepted this revision.Aug 8 2023, 10:21 PM
mehdi_amini added inline comments.
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
96

Still present

This revision is now accepted and ready to land.Aug 8 2023, 10:21 PM
fmorac updated this revision to Diff 549334.Aug 11 2023, 4:10 AM

Added an assert check for the attribute.
Note: The verifier will also verify that the dyn_cast is valid, as it's a precondition on the elements of the array.

fmorac updated this revision to Diff 549542.Aug 11 2023, 4:07 PM

Separated gpu-module-to-binary test into 2 tests: nvvm & rocdl.
Also updated lit.cfg.py with:

if config.run_cuda_tests:
    config.available_features.add("host-supports-nvptx")

if config.run_rocm_tests:
    config.available_features.add("host-supports-amdgpu")

Allowing the usage of // REQUIRES: host-supports-nvptx for disabling tests.

fmorac updated this revision to Diff 549543.Aug 11 2023, 4:16 PM

Fix patch application.

This revision was automatically updated to reflect the committed changes.