Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Remove checking operands to be LLVMType because it's guaranteed by the type conversion.
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp | ||
---|---|---|
327 | defOp could be null, e.g. a block argument, in which case you have to insert these late (maybe at block start?) | |
330 | This is just asyncDependency, right? | |
330 | Does this actually work with a gpu.call? That would create a stream and then enqueue the kernel on the stream, which is a use of the stream. So the record event would be inserted before that use, right? |
Apply thoughtful comments from herhut.
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp | ||
---|---|---|
327 | Oh, thanks for catching this. I've changed it to insert the mgpuEventRecord at block start if defOp null. | |
330 | Yes, much cleaner. | |
330 | Good point. I changed the insertion point to after the definition of the original operand. If I understand correctly, %t0 = gpu.launch_func async ... %t1 = gpu.wait async [%t0] will now convert to %stream0 = llvm.call @mgpuStreamCreate() llvm.call @mgpuLaunchKernel(%stream0, ...) // assumption: inserted before gpu.launch_func %t0 = gpu.launch_func async ... // marked for deletion %event = llvm.call @mgpuEventCreate() // after definition of %t0, not %stream0 llvm.call @mgpuEventRecord(%event, %stream0) %stream1 = llvm.call @mgpuStreamCreate() llvm.call @mgpuStreamWaitEvent(%stream1, %event) llvm.call @mgpuEventDestroy(%event) %t1 = gpu.wait async [%t0] // marked for deletion Side note: my intention is that streams will only be created with explicit gpu.wait async ops, but your point still holds. |
Thanks.
mlir/lib/Conversion/GPUCommon/ConvertLaunchFuncToRuntimeCalls.cpp | ||
---|---|---|
326 | Can you add a comment here why getOperands is used. This is somewhat special and not obvious. | |
330 | Can you just copy this to the comment in the code. I think it would serve well as documentation, | |
mlir/test/Conversion/GPUCommon/lower-wait-to-gpu-runtime-calls.mlir | ||
6 | Please remember to add a test with a gpu.launch once that supports async launches. |
Can you add a comment here why getOperands is used. This is somewhat special and not obvious.