This is an archive of the discontinued LLVM Phabricator instance.

OpenMPOpt: Check nested parallelism in target region
ClosedPublic

Authored by randreshg on Jan 4 2023, 12:02 PM.

Details

Summary

Analysis that determines if a parallel region can reach another parallel region in any target region of the TU.
A new global var is emitted with the name of the kernel + "_nested_parallelism", which is either 0 or 1 depending on the result.

Diff Detail

Event Timeline

randreshg created this revision.Jan 4 2023, 12:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2023, 12:02 PM
randreshg requested review of this revision.Jan 4 2023, 12:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2023, 12:02 PM
josemonsalve2 added inline comments.Jan 4 2023, 12:26 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
4369

Any reason you want to keep this variable? Why not add the condition directly into the ID?

josemonsalve2 added inline comments.Jan 4 2023, 12:30 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
4369

Or probably better make NesterParallelism equal you the condition.

tianshilei1992 added inline comments.Jan 5 2023, 9:14 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
602

Is it good enough to just have two states? Is it possible that we can't tell if a parallel region is nested or not, or do we just conservatively take it as true?

randreshg added inline comments.Jan 5 2023, 11:44 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
602

What others states we might possibly have? The only way we can't tell if a parallel region is nested or not is when we have functions with parallel regions in another TU, for this case we conservatively take it as false, for all the other cases this approach should work.

4369

Not really, we can remove the variable and add it to the IF statement

tianshilei1992 added inline comments.Jan 5 2023, 11:50 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
602

If a function containing a parallel region can be reached from different paths, and the parallel region may or may not be nested, that would be the case right? What's more, I suppose we could do more aggressive optimization if we do not have nested parallelism, which means the conservative way would be to take it as a true. Did I misunderstand anything here?

randreshg added inline comments.Jan 5 2023, 12:06 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
602

The case is summarized in the remark omp133. This is the case when a function inside the target region may contain parallel regions but since we don't have information about it, we assume it has no nested parallelism

jdoerfert added inline comments.Jan 5 2023, 12:33 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
602

We start optimistic, so assuming no nested parallelism is right.

689

Also known as
NestedParallelism |= KIS.NestedParallelism

3344

Use an int8 type, int1 is always tricky.

4370

Don't use magic constants (5, above), that's why we have these variables (WrapperFunctionArgNo). Also, start at the wrapper, so ParallelRegion. The DepClassTy should be optional.
Style for mayCon..., probably just remove the variable:
NestedParallelism |= !FnAA...

llvm/test/Transforms/OpenMP/nested_parallelism.ll
4

Put the C code as a comment in here so it's easier to read the IR.

342

A lot of this code is really not needed. Maybe start with a simpler example or simplify the IR.

417

Remove all attributes.

440

Remove the TBAA metadata and everything we don't need. !0 - !3 and !6 and !7 are needed. Adjust the offload_info and nvvm annotations, and module.flags accordingly.

josemonsalve2 added inline comments.Jan 5 2023, 12:54 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
602

Shilei, are you referring to something like this:

void foo () {
  #pragma omp parallel
  {}
}

void bar() {
  #target
      foo();
}

void zoo() {
  #target parallel
      foo();
}

I believe the optimization is not from the perspective of the inner parallel region, but the kernel itself. So this would be consider nested.

Rafa, Correct me if I am wrong.

tianshilei1992 added inline comments.Jan 5 2023, 1:53 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
602

Okay then that makes sense.

randreshg updated this revision to Diff 487039.Jan 6 2023, 7:54 PM

This new patch addresses comments from reviewers

randreshg marked 14 inline comments as done.Jan 6 2023, 7:56 PM
josemonsalve2 added inline comments.Jan 6 2023, 10:17 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
4364
NestedParallelism = !FnAA.ReachedKnownParallelRegions.empty() || !FnAA.ReachedUnknownParallelRegions.empty()

I think this is pretty good. Minor nits. I need to check when I'm less sleepy again.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
3344

Unused variable NP.

4364

If you change it, then |=, the current way is OK too.

One thing to add though: !FnAA.getState().isValidState() || ...

randreshg updated this revision to Diff 487080.Jan 7 2023, 4:06 AM

Minor changes

randreshg marked 3 inline comments as done.Jan 7 2023, 4:08 AM
josemonsalve2 added inline comments.Jan 7 2023, 7:10 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
4364

Rafa. I believe this comment by johannes has not been solved. The the if statement before the assignment. It can also be added to this line.

randreshg added inline comments.Jan 7 2023, 7:49 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
4364

I think what he meant was to check if the AA was a valid state before reading
ReachedKnownParallelRegions and
ReachedUnknownParallelRegions. I can also add it to the same line like this:

NestedParallelism |= (!FnAA.ReachedKnownParallelRegions.empty() || !FnAA.ReachedUnknownParallelRegions.empty()) && FnAA.getState().isValidState()

Johannes, please correct me if I’m wrong.

jdoerfert added inline comments.Jan 7 2023, 11:09 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
4364

You got the default wrong. If the state of FnAA is invalid we need to assume the worst. Right now you ignore invalid AAs, which is bad. So if Invalid, or if the other condition, then we assume nested parallelism.

randreshg updated this revision to Diff 487112.Jan 7 2023, 11:40 AM

Invalid state of FnAA is now considered

randreshg marked 2 inline comments as done.Jan 7 2023, 11:41 AM
This revision is now accepted and ready to land.Jan 8 2023, 12:20 PM