This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][OpenMP] Add MLIR operation for OpenMP teams
ClosedPublic

Authored by skatrak on Jul 4 2023, 6:26 AM.

Diff Detail

Event Timeline

skatrak created this revision.Jul 4 2023, 6:26 AM
skatrak requested review of this revision.Jul 4 2023, 6:26 AM
tschuett added inline comments.
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
232

To be picky. No! For modern OpenMP, I use the teams construct in shared memory.

skatrak added inline comments.Jul 4 2023, 6:43 AM
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
232

Thank you for pointing this out, so I understand then that this is not a device-only construct? I'll work on removing this restriction and updating the operation description.

Changes from 4.5 to 5.0:

The teams construct (see Section 10.2) was extended to support execution on the host device
without an enclosing target construct (see Section 13.8).
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
227

I think this will force the teams region to have a single block only. I guess that is not the intention here.

232

The standard says the following. What is your usage @tschuett ?
"A teams region can only be strictly nested within the implicit parallel region or a target region. If a teams construct is nested within a target construct, that target construct must contain no statements, declarations or directives outside of the teams construct."

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
882–884

You can use the AllTypesMatch trait for this. For reference omp.wsloop op.

tschuett added inline comments.Jul 4 2023, 7:37 AM
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
232

I read the same sentence from the standard.

skatrak updated this revision to Diff 537124.Jul 4 2023, 10:12 AM
skatrak marked 2 inline comments as done.

Address reviewer's comments. Lift restriction of being nested inside of omp.target.

Thank you for the comments, I just pushed some changes and posted a couple questions inline.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
227

You're right, thank you. I misinterpreted here "single block" to mean "single region".

232

I focused on the part about being strictly nested within a target region, but didn't quite get the "implicit parallel region that surrounds the whole OpenMP program" when I read that paragraph prior to implementing this.

I made some changes to check that the enclosing target region doesn't have any other operations, but I'm not sure what exactly to check to make sure it's nested within the implicit global parallel region. Should we just check that omp.teams does not have an omp.parallel parent or is having any OpenMP dialect parent operation (except from omp.target) already unsupported use of omp.teams?

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
882–884

Unfortunately I can't seem to be able to do this, since both arguments are optional. If any of them is not specified and I set that trait, the TeamsOp::verifyInvariantsImpl() segfaults due to trying to access the type of these undefined arguments.

Is there a known workaround for this?

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
232

I think, we can have a simple check here. If the immediate OpenMP dialect parent is any operation other than omp.target then it is an error. Otherwise do not signal any error.

I made some changes to check that the enclosing target region doesn't have any other operations

Would that exclude the following?

omp.target {
  %c1 = arith.constant 1 : i32
  %c10 = arith.constant 10: i32
  omp.teams num_teams(%c1 : i32 to %c10 : i32) {
    omp.terminator
  }
}
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
882–884

OK then leave it as is. It probably requires the operands to be always present together.

skatrak added inline comments.Jul 5 2023, 3:17 AM
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
232

I think, we can have a simple check here. If the immediate OpenMP dialect parent is any operation other than omp.target then it is an error. Otherwise do not signal any error.

That is what I implemented the first time around, but as @tschuett pointed out, the teams construct can appear outside of target regions, by binding to the implicit parallel region surrounding the program. I think we could check that its direct parent is an omp.target or it doesn't have any direct or indirect parent from the OpenMP dialect. I think that second check follows the definition of being strictly nested within that implicit parallel region. The other alternative is to just check the first, as you say, and put a TODO mentioning the other case where omp.teams should be eventually allowed.

Would that exclude the following?

omp.target {
  %c1 = arith.constant 1 : i32
  %c10 = arith.constant 10: i32
  omp.teams num_teams(%c1 : i32 to %c10 : i32) {
    omp.terminator
  }
}

Yes, it would. Now I realize the issue with that. The spec (version 5.2, section 10.2) does state "If a teams region is nested inside a target region, the corresponding target construct must not contain any statements, declarations or directives outside of the corresponding teams construct." But obviously the arguments for the construct's clauses must come from somewhere, and I understood that the region inside of an omp.target couldn't access what was outside of it.

I suppose that the way this works is it should have access to device-mapped variables, but I'm not sure how these can be accessed in MLIR. If it is by their SSA value outside of the target region, then I suppose that means values from outside omp.target can actually be used inside (they would only fail at runtime if they weren't mapped properly), so the constants in your example could be hoisted out and the restriction kept, assuming constants are always mapped to all devices.

I realize there are many considerations here, so maybe @jsjodin, @agozillon, @domada, @TIFitis have other concerns they'd like to bring up related to the visibility of SSA values across target boundaries and data mapping. For the time being, I don't mind lifting the restriction and allowing other operations to coexist with omp.teams inside of omp.target.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
232

For the time being, I don't mind lifting the restriction and allowing other operations to coexist with omp.teams inside of omp.target.

This is alrite for now.

The first check can also be simple. Only if the immediate OpenMP dialect parent is not an omp.target operation then error out. If on the device side, there is not omp.target operation as per the new design then this check might have to be skipped.

skatrak updated this revision to Diff 537756.Jul 6 2023, 8:57 AM
skatrak marked 6 inline comments as done.

Update verifier to follow reviewer's recommendations.

skatrak added inline comments.Jul 6 2023, 8:59 AM
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
232

The first check can also be simple. Only if the immediate OpenMP dialect parent is not an omp.target operation then error out. If on the device side, there is not omp.target operation as per the new design then this check might have to be skipped.

Done. After speaking with @jsjodin, the current plan to represent outlined target regions will keep omp.target operations, so it won't need special handling here.

kiranchandramohan accepted this revision.Jul 6 2023, 3:02 PM

LG.

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
879–880

The alternative is to only issue this error if the parent operation is an OpenMP dialect operation and not a TargetOp.

883–884

Nit: Can this type be spelled out?

This revision is now accepted and ready to land.Jul 6 2023, 3:02 PM
skatrak updated this revision to Diff 538116.Jul 7 2023, 6:19 AM
skatrak marked an inline comment as done.

Address comments.

skatrak marked an inline comment as done.Jul 7 2023, 6:20 AM
skatrak added inline comments.
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
879–880

Just made some changes to make that check in place of the related TODO comment. I'll wait for your thumbs up before landing this

This revision was automatically updated to reflect the committed changes.
skatrak marked an inline comment as done.