This patch adds an operation definition for the OpenMP 'teams' construct.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
232 | To be picky. No! For modern OpenMP, I use the teams construct in shared memory. |
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
232 | Thank you for pointing this out, so I understand then that this is not a device-only construct? I'll work on removing this restriction and updating the operation description. |
Changes from 4.5 to 5.0:
The teams construct (see Section 10.2) was extended to support execution on the host device without an enclosing target construct (see Section 13.8).
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
227 | I think this will force the teams region to have a single block only. I guess that is not the intention here. | |
232 | The standard says the following. What is your usage @tschuett ? | |
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | ||
882–884 | You can use the AllTypesMatch trait for this. For reference omp.wsloop op. |
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
232 | I read the same sentence from the standard. |
Thank you for the comments, I just pushed some changes and posted a couple questions inline.
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
227 | You're right, thank you. I misinterpreted here "single block" to mean "single region". | |
232 | I focused on the part about being strictly nested within a target region, but didn't quite get the "implicit parallel region that surrounds the whole OpenMP program" when I read that paragraph prior to implementing this. I made some changes to check that the enclosing target region doesn't have any other operations, but I'm not sure what exactly to check to make sure it's nested within the implicit global parallel region. Should we just check that omp.teams does not have an omp.parallel parent or is having any OpenMP dialect parent operation (except from omp.target) already unsupported use of omp.teams? | |
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | ||
882–884 | Unfortunately I can't seem to be able to do this, since both arguments are optional. If any of them is not specified and I set that trait, the TeamsOp::verifyInvariantsImpl() segfaults due to trying to access the type of these undefined arguments. Is there a known workaround for this? |
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
232 | I think, we can have a simple check here. If the immediate OpenMP dialect parent is any operation other than omp.target then it is an error. Otherwise do not signal any error.
Would that exclude the following? omp.target { %c1 = arith.constant 1 : i32 %c10 = arith.constant 10: i32 omp.teams num_teams(%c1 : i32 to %c10 : i32) { omp.terminator } } | |
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | ||
882–884 | OK then leave it as is. It probably requires the operands to be always present together. |
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
232 |
That is what I implemented the first time around, but as @tschuett pointed out, the teams construct can appear outside of target regions, by binding to the implicit parallel region surrounding the program. I think we could check that its direct parent is an omp.target or it doesn't have any direct or indirect parent from the OpenMP dialect. I think that second check follows the definition of being strictly nested within that implicit parallel region. The other alternative is to just check the first, as you say, and put a TODO mentioning the other case where omp.teams should be eventually allowed.
Yes, it would. Now I realize the issue with that. The spec (version 5.2, section 10.2) does state "If a teams region is nested inside a target region, the corresponding target construct must not contain any statements, declarations or directives outside of the corresponding teams construct." But obviously the arguments for the construct's clauses must come from somewhere, and I understood that the region inside of an omp.target couldn't access what was outside of it. I suppose that the way this works is it should have access to device-mapped variables, but I'm not sure how these can be accessed in MLIR. If it is by their SSA value outside of the target region, then I suppose that means values from outside omp.target can actually be used inside (they would only fail at runtime if they weren't mapped properly), so the constants in your example could be hoisted out and the restriction kept, assuming constants are always mapped to all devices. I realize there are many considerations here, so maybe @jsjodin, @agozillon, @domada, @TIFitis have other concerns they'd like to bring up related to the visibility of SSA values across target boundaries and data mapping. For the time being, I don't mind lifting the restriction and allowing other operations to coexist with omp.teams inside of omp.target. |
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
232 |
This is alrite for now. The first check can also be simple. Only if the immediate OpenMP dialect parent is not an omp.target operation then error out. If on the device side, there is not omp.target operation as per the new design then this check might have to be skipped. |
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | ||
---|---|---|
232 |
Done. After speaking with @jsjodin, the current plan to represent outlined target regions will keep omp.target operations, so it won't need special handling here. |
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | ||
---|---|---|
879–880 | Just made some changes to make that check in place of the related TODO comment. I'll wait for your thumbs up before landing this |
I think this will force the teams region to have a single block only. I guess that is not the intention here.