Verify that gpu.func can only live inside a gpu.module.
Update tests accordingly.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/test/Dialect/GPU/promotion.mlir | ||
---|---|---|
100 | You can remove all of the empty module wrappers, module is implicitly the top-level operation. | |
mlir/test/lib/Transforms/TestGpuMemoryPromotion.cpp | ||
28 ↗ | (On Diff #258911) | Can you just change the pipeline specification in the tests instead of changing this pass? |
30 ↗ | (On Diff #258911) | MLIR uses camelCase for variable names. |
Nice!
mlir/test/Dialect/GPU/promotion.mlir | ||
---|---|---|
1–3 | Maybe something like // RUN: mlir-opt -allow-unregistered-dialect -pass-pipeline='module(gpu.module(gpu.func(test-gpu-memory-promotion)))' -split-input-file %s | FileCheck %s is enough to direct the test pass to the functions through the extra wrapping modules. | |
55 | Why this change? | |
mlir/test/lib/Transforms/TestGpuMemoryPromotion.cpp | ||
28 ↗ | (On Diff #258911) | See my above comment for how to do this. You can then revert these changes, |
@ftynse , @herhut is this giving any added semantic advantage? I have been trying to see if there is a way to target gpu.func directly without going through gpu.launch + outlining. Having the requirement that a gpu.func always live in a gpu.module seems too restrictive. Just want to understand the advantage apart from the fact that all gpu.func need to live in a module "separate" from the host side.
We always had a requirement that gpu.funcs live in a somehow special module. Originally, it was a module with a kernel_module attribute. This is necessary to pick out the modules that should be translated to device-specific dialects. Later, gpu.module was introduced, but the old condition with the attribute was not removed. This change basically completes the transition.
I do see some value in having a dedicated gpu.module to clearly separate what is intended for device and what is not. Your issue with generating a gpu.func without gpu.launch_func seems mostly orthogonal: this is a matter of writing the conversion, not of the rules the resulting IR must follow. If you have another conversion pass (or just IR building code) that creates a gpu.func, it can equally create a gpu.module first. Not having a gpu.module would put the semantics in a weird state when, for example, the same function is called from both gpu.func and from std.func or if there is some global state mutation.
Sorry for the delay. Let's land this as is. If I have a more concrete issue/suggestion on this aspect, I can revisit this.
Maybe something like
is enough to direct the test pass to the functions through the extra wrapping modules.