Page MenuHomePhabricator

[MLIR] Change GpuLaunchFuncToGpuRuntimeCallsPass to wrap a RewritePattern with the same functionality.
ClosedPublic

Authored by csigg on Thu, Jul 30, 7:02 AM.

Details

Summary

The RewritePattern will become one of several, and will be part of the LLVM conversion pass (instead of a separate pass following LLVM conversion).

Diff Detail

Event Timeline

csigg created this revision.Thu, Jul 30, 7:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptThu, Jul 30, 7:02 AM
csigg requested review of this revision.Thu, Jul 30, 7:02 AM
csigg retitled this revision from Change GpuLaunchFuncToGpuRuntimeCallsPass to wrap a RewritePattern with the same functionality. to [MLIR] Change GpuLaunchFuncToGpuRuntimeCallsPass to wrap a RewritePattern with the same functionality..Thu, Jul 30, 7:03 AM
csigg added inline comments.Thu, Jul 30, 7:34 AM
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
323

Oops. I'm looking into this now.

csigg updated this revision to Diff 281930.Thu, Jul 30, 7:43 AM

Remove unnecessary ArrayRefs<>{}.

herhut added inline comments.Fri, Jul 31, 3:29 AM
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
89

What is the advantage of having these as field vs. a local helper function?

159–160

This is dangerous, as the module could be deleted before the global has been created. Also, it is not strictly required as part of the lowering as long as we mark the nested gpu module as legal. It only needs to cleaned away shortly before exporting the module to llvm and that can be done by just traversing the outer module.

213

This alignment seems odd. Not you doing but 0 would be canonical.

324

Please use the rewriter instead,

357

Would launchOp.getNumKernelOperands()work here, as well? It would avoid the search for the kernel.

csigg added a comment.Fri, Jul 31, 4:32 AM

Thank you very much Stephan for the review.

mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
89

It doesn't make a big difference. Having a FunctionCallBuilder class maybe separates the details a little from the rest of the ConversionPattern. And probably overall it's a little shorter. But I can certainly change it, if you prefer functions.

159–160

I don't think it's dangerous as long as the ConvertLaunchFuncOpToGpuRuntimeCallPattern does not run after EraseGpuModuleOpPattern. In the same pass is OK, and the populateGpuToLLVMConversionPatterns function makes sure they are always part of the same PatternsList.

Why would you like to not clean up the gpu modules when they are no longer needed, but have the user to do this manually? Doesn't populateGpuToLLVMConversionPatterns imply that it removes all gpu ops, instead of leaving one particular op type without uses?

213

Changed to 0.

324

Done, but I did not change the builder type in the other functions. Should I?

357

I don't think so because that would give you the non-flattened number of arguments, right?

herhut added inline comments.Fri, Jul 31, 7:34 AM
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
159–160

But how do you guarantee this? If the kernel module is before the functions that contain the launch, then it might get removed already, no?

324

I cannot see that change.

No need to change type as long as it is backed by the rewriter for the pattern. Otherwise the rollback would not work.

357

Good point, thanks!

csigg added inline comments.Sat, Aug 1, 12:19 AM
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
159–160

I verified that it does not matter whether the kernel module is before the function or after. rewriter.erase() only marks it for removal, the actual removal happens during the pass' finalize(). So the kernel module will always be available for the gpu.launch_func rewrite in the same pass.

324

Yes, it's backed by the rewriter. You don't see the change because I haven't pushed it. It's always a bit of a dance and I wanted to wait for the other things to get resolved.

csigg updated this revision to Diff 282393.Sat, Aug 1, 6:08 AM

Apply reviewer comments.

Slip in small nfc in mlir-{cuda,rocm}-runner.

herhut accepted this revision.Tue, Aug 4, 6:47 AM
herhut added inline comments.
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp
159–160

Ah, ok, thanks for checking. In that case this is fine of course.

This revision is now accepted and ready to land.Tue, Aug 4, 6:47 AM
csigg updated this revision to Diff 283531.Thu, Aug 6, 2:09 AM

Rebase.

csigg updated this revision to Diff 283537.Thu, Aug 6, 2:23 AM

Rebase.

csigg updated this revision to Diff 283542.Thu, Aug 6, 2:45 AM

Rebase.

This revision was landed with ongoing or failed builds.Thu, Aug 6, 2:56 AM
This revision was automatically updated to reflect the committed changes.