This patch implements an early outlining transform of omp.target operations in flang. The pass is needed because optimizations may cross target op region boundaries, but with the outlining the resulting functions only contain a single omp.target op plus a func.return, so there should not be any opportunity to optimize across region boundaries.
The patch also adds an interface to be able to store and retrieve the parent function name of the original target operation. This is needed to be able to create correct kernel function names when lowering to LLVM-IR.
Details
Diff Detail
Event Timeline
flang/test/Lower/OpenMP/omp-target-early-outlining.f90 | ||
---|---|---|
6 | Nit: Could you add some CHECK-NOT llvm-lit directives to ensure that unwanted mlir operations are not present? (For example: check if there are no instructions from the second target region) Nit: could you make your checks more flexible? We do not need to check the exact names of constants. IMO these checks are more future proof: !CHECK: %[[CONSTANT_VALUE:.*]] = arith.constant 10 : i32 !CHECK fir.store %[[CONSTANT_VALUE]] to %arg0 : !fir.ref<i32> |
I had a quick look. Some comments and Nits inline.
The build is failing with the following error.
FrontendActions.cpp:309:29: error: no member named 'getIsDevice' in 'mlir::omp::OffloadModuleInterface' isDevice = offloadMod.getIsDevice();
Can you add a test that demonstrates the issue that the outlined function solves?
We probably have a test that lists the passes that are run in order. Didn't you have to update that to get tests to pass?
flang/include/flang/Optimizer/Transforms/Passes.td | ||
---|---|---|
303 | Can you add some additional info on where this pass is called (immediate after lowering) and only required for the device flow? | |
flang/lib/Frontend/FrontendActions.cpp | ||
303–313 | Can this be added in a common place so that the bbc driver also gets this? Would eraseDeadCodeAndBlocks in Bridge.cpp be a good place to add this? That function eliminates dead-code and is a pre-requisite for running other MLIR passes. | |
flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp | ||
16–17 | Can these two MLIR includes be moved up? | |
39–41 | Nit: braces not required. | |
flang/test/Fir/external-mangling.fir | ||
101 | Nit: newline | |
flang/test/Lower/OpenMP/omp-target-early-outlining.f90 | ||
2 | Nit: Could you add one other target triple? | |
mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td | ||
292 | Nit: Something is missing at the end of this sentence. | |
mlir/test/Target/LLVMIR/omptarget-region-llvm-target-device.mlir | ||
3 | Nit: Could you add a comment on what the test is testing? | |
4 | Nit: What is this comment for? | |
35–38 | Nit: remove extra lines. |
flang/lib/Frontend/FrontendActions.cpp | ||
---|---|---|
303–313 |
I'm not exactly sure how to add it in there since the entire generated function should be deleted after outlining. I was looking to see if it would be easy to add it to a function higher up in the call chain, but there seems to be some implicit state that is used to generate the function. I will take a closer look to see if it is doable. Otherwise, should this pass added to the bbc driver? | |
flang/test/Lower/OpenMP/omp-target-early-outlining.f90 | ||
6 |
That should not be possible since the functions are created before any optimizations. I suppose there could be a bug in the frond-end, but that is not what this test is checking.
Done.
| |
mlir/test/Target/LLVMIR/omptarget-region-llvm-target-device.mlir | ||
3 |
It should only tests that the resulting name of the generated function is correct and use the parent name that is stored in the attribute. I removed the extra checks since they are really not useful. | |
4 |
Junk, removed. |
Rebased and fixed.
Can you add a test that demonstrates the issue that the outlined function solves?
Yes, where do we add tests that compile from Fortran + optimizations? It requires both the front-end and optimizations.
We probably have a test that lists the passes that are run in order. Didn't you have to update that to get tests to pass?
No, I made no changes to existing tests. This is something that happens to not be triggered by the existing tests. I have an example in the abandoned patch that was supposed to fix CSE.
Can you add a test that demonstrates the issue that the outlined function solves?
Yes, where do we add tests that compile from Fortran + optimizations? It requires both the front-end and optimizations.
You can add the test in the Driver test directory.
LG. Please wait for another acceptance.
flang/lib/Frontend/FrontendActions.cpp | ||
---|---|---|
303–313 | If it is too difficult then no need to add. Also, please verify that the code/passes in eraseDeadCodeAndBlocks do not cause any unnecessary transformation that violates what you achieve by outlining. | |
flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp | ||
51 | Should the visibility be set here to setVisibility(mlir::SymbolTable::Visibility::Private)? |
flang/lib/Frontend/FrontendActions.cpp | ||
---|---|---|
303–313 |
The passes in eraseDeadCodeAndBlocks are fine. I added a warning here as well so that people are aware to not add the wrong kind of optimization. | |
flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp | ||
51 |
I just checked if this was doable, but since there are no uses of these functions and setting the visibility to Private causes them to be deleted prematurely. They should be deleted once the code generation is done for the omp.target op (outlined into a kernel function) in the module translation to LLVM IR,. |
LGTM
mlir/test/Target/LLVMIR/omptarget-region-llvm-target-device.mlir | ||
---|---|---|
3 | nit typo: stoared |
Can you add some additional info on where this pass is called (immediate after lowering) and only required for the device flow?