This is an archive of the discontinued LLVM Phabricator instance.

[Flang][OpenMP][MLIR] Add lowering from parse tree to MLIR support for Declare Target for functions, subroutines and global data
AbandonedPublic

Authored by agozillon on Mar 14 2023, 9:30 AM.

Details

Summary

This adds lowering support to Flang for OpenMP Declare Target
tagged functions, subroutines and global data in the form of
GlobalOps.

This patch adds an attribute which contains data
provided to the device_type clause (any | host | nohost) and
what I've currently called the "capture" clause which is the
To (Enter) and Link clauses. This attribute is then applied
to functions, subroutines and global data (which are
represented as builtin.FuncOp's and fir.GlobalOp's in
MLIR).

An additional semantic analysis pass is added to Flang
which adds functions used within declare target functions
to the original declare target specifier, effectively marking
them as declare target, with the same clauses as their
parent/caller. This is a form of implicit capture, this appears
to occur within Clang (albeit in a different way) and from my
current reading and understanding of the specification is described
there as well.

This also adds several tests testing current global data, functions and
subroutine support.

The idea is that this attribute can be picked up by amendOperation or
similar infrastructure later during lowering of the LLVM Dialect +
OpenMP dialect to LLVM-IR and have specialised lowering based
on the contained attribute data. This can be seen in:
https://reviews.llvm.org/D149368 which lowers GlobalOp's marked
with a declare target attribute to LLVM-IR.

Filtering of functions/global data that is deemed unneccesary for a
module is currently not done in this patch. The approach currently is
to do this later in the pipeline during the lowering to LLVM-IR as shown
for functions here: https://reviews.llvm.org/D147641 Global data can
likely be done similarly.

This patch has the following related patches on-going:

Function filtering: https://reviews.llvm.org/D147641
Initial lowering of declare target to LLVM-IR for global data: https://reviews.llvm.org/D149368
Movement of OMP IR Builder functionality for lowering declare target data: https://reviews.llvm.org/D149162
Loading in of host module data to properly populate device metadata for declare target: https://reviews.llvm.org/D148370

With these four patches and this patch, initial support for declare target will exist for Flang+OpenMP, however,
further testing and extension for all edge cases will be required and something I will continue to work on. Frontend
semantic analysis support also needs a little look into as well, there's a lot of cases where things compile without
error when they shouldn't and vice versa still. But these patches should yield some initial support.

Diff Detail

Event Timeline

agozillon created this revision.Mar 14 2023, 9:30 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
agozillon requested review of this revision.Mar 14 2023, 9:30 AM
kiranchandramohan added a comment.EditedMar 20 2023, 6:54 AM

The problem with doing this in two stages : i.e during lowering and a pass are that there are two representations for the same thing in the IR. One where the nested functions are not marked, and the other where they are marked.

Would it be better to do this in

  1. Semantics + lowering?

or

  1. PFT + lowering?

So that there is only one representation in the IR.

I believe I understand your concerns.

I think I've tried to do number 2 already, if I understand correctly and the PFT + lowering phase comprises the handleDeclareTarget function inside of OpenMP.cpp (and the other lowering steps to the various MLIR dialects from the PFT). I initially tried to do it while processing the initial declare target lowering inside of handleDeclareTarget, however, you run into the issue that the function the declare target is nested inside hasn't been fully lowered yet, so you cannot tell what other functions are invoked by the declare target function from the currently existing MLIR as it's yet to be generated! The only way I can think of circumventing this is by doing some sort of delayed evaluation that occurs once the function is completed, but I'm unsure how ideal that is.

After running into this issue with the original method I also thought about doing it in the semantic check + PFT generation phase as I believe you suggest as option number 1 (but do please correct me if my understanding is wrong), by pushing the functions (to be marked implicitly declare target) invoked within the declare target function onto the originally generated OmpDeclareTargetSpecifier's To or Link list, and then the handleDeclareTarget function would just treat them as normally. I decided to forgo this and go with the MLIR pass as it seemed like the simpler option that was most likely to work and I wasn't sure if there was a precedent for doing this sort of thing at the semantic level.

I am happy to look into either of these if you think they're better than the current solution! Is there one approach you prefer over another?

As a related aside I was considering doing another pass similarly to this, which removes unrequired declare target functions from the module (e.g. remove device only declare target functions if it's a host module), with the intent of removing unnecessary functions as early as possible (as requested in a prior discussion I believe). The main reason for doing this as an end module pass as opposed to removing them directly in the handleDeclareTarget function is that it is going to run into the same problem, as I believe you need to know if other OpenMP offload code exists in the function before you discard it, e.g. if a declare target host function has a device target region within it, it can't be discarded inside of an OpenMP device module early as we'd lose the target region. Perhaps there's some OpenMP specification rules I'm unaware of that prevent a target region inside of a declare target function, although I fear a similar issue may occur with data mapping operations as well in any case. Would a pass like this be more reasonable than the pass in this patch?

agozillon updated this revision to Diff 510768.Apr 4 2023, 6:11 AM
  • [Flang][OpenMP][MLIR] Remove optimisation passes
  • [Flang][OpenMP] Add Parse Tree Rewrite for Declare Target
agozillon updated this revision to Diff 510769.Apr 4 2023, 6:19 AM
  • [Flang Fix left over artifacts from pass removal
agozillon updated this revision to Diff 510775.Apr 4 2023, 6:24 AM
  • [Flang] Hidden space re-addition

First of all, apologies everyone for the spam, had some issues re-aligning some of the previously changed files back to their old form, teaches me to just revert in future!

This is now updated to do the implicit marking during the PFT rewrite phase, this seemed the most "correct" place to alter the parse tree as it's already done here for the ambiguities that arise from parsing certain Fortran syntaxes (from what I understand at least). However, it could be moved into the semantic phase as well I think by simply moving what I've done in the rewrite phase into the semantic phase, but this seems like it would be the first case of the semantic phase altering the parse tree. One side-affect of having it done in the rewrite phase is that the semantic analysis will be executed on the altered declare target specifier, verifying it as if it were exactly as a user wrote.

I would appreciate another review when you have a spare moment @kiranchandramohan I'm aware you're time is likely stretched thin with the current amount of Fortran-Offload related (and other Fortran/OpenMP related) phabricator patches however!

agozillon added inline comments.Apr 4 2023, 6:48 AM
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
32

I kept these external to the ModuleInterface as I'm unsure if we wish to put FuncOp+fir.GlobalOp related getter and setters into the module interface, have separate interfaces (e.g. an OffloadFuncOp interface for these getters/setters) or leave them as OpenMPDialect static functions. I am open to all of these options.

Sorry for the delay here. I had typed a message but failed to post it. I havent gone through the new changes in details. But can you move the parse-tree rewrite to flang/lib/Semantics/canonicalize-omp.cpp?

Sorry for the delay here. I had typed a message but failed to post it. I havent gone through the new changes in details. But can you move the parse-tree rewrite to flang/lib/Semantics/canonicalize-omp.cpp?

That's no problem Kiran, thank you for picking up the review again. I have now looked into this and unfortunately I cannot do this, as this needs to occur after the ResolveNames pass has been executed (and fills in function symbols correctly) otherwise the Parser::Name's aren't complete and it produces semantic errors and the canonocalize-omp pass occurs prior to the ResolveNames pass. I could move the infrastructure into a separate pass within canoncalize-omp.cpp/h (or a different file entirely) and then create a separate access/invocation function for it that can be invoked after the ResolveNames pass has been executed? This would make the pass for canoncalization and implicit declare target gathering seperate passes invoked at different times but allow for names to be properly resolved before the pass has begun execution The alternative would be to teach ResolveNames to go back and update all of the declare target specifiers, but I don't know how simple this would be and it would be adding additional rather unnecessary work for the compiler to do and perhaps bloat the ResolveNames pass needlessly.

Please let me know which of these options seems most reasonable.

Sorry for the delay here. I had typed a message but failed to post it. I havent gone through the new changes in details. But can you move the parse-tree rewrite to flang/lib/Semantics/canonicalize-omp.cpp?

That's no problem Kiran, thank you for picking up the review again. I have now looked into this and unfortunately I cannot do this, as this needs to occur after the ResolveNames pass has been executed (and fills in function symbols correctly) otherwise the Parser::Name's aren't complete and it produces semantic errors and the canonocalize-omp pass occurs prior to the ResolveNames pass. I could move the infrastructure into a separate pass within canoncalize-omp.cpp/h (or a different file entirely) and then create a separate access/invocation function for it that can be invoked after the ResolveNames pass has been executed? This would make the pass for canoncalization and implicit declare target gathering seperate passes invoked at different times but allow for names to be properly resolved before the pass has begun execution The alternative would be to teach ResolveNames to go back and update all of the declare target specifiers, but I don't know how simple this would be and it would be adding additional rather unnecessary work for the compiler to do and perhaps bloat the ResolveNames pass needlessly.

Please let me know which of these options seems most reasonable.

There is a resolve-directives.cpp file and it has a pass that resolves the names in OpenMP. But i guess it would be better to go for a separate pass in a separate file. Usually adding a pass is costly. But there are a few cases in my mind where it will be useful.

agozillon updated this revision to Diff 511381.Apr 6 2023, 5:52 AM
  • [Flang][Parser] Move to canonicalize-omp
  • [Flang][Semantics] Add finalize-omp files, move ImplicitDeclareTargetCapture pass here
agozillon updated this revision to Diff 511383.Apr 6 2023, 5:58 AM
  • [Flang][NFC] Try to remove changes to canonicalize-omp.cpp
agozillon updated this revision to Diff 511384.Apr 6 2023, 6:00 AM
  • [Flang][NFC] Add missing newline to end of file

Sorry for the delay here. I had typed a message but failed to post it. I havent gone through the new changes in details. But can you move the parse-tree rewrite to flang/lib/Semantics/canonicalize-omp.cpp?

That's no problem Kiran, thank you for picking up the review again. I have now looked into this and unfortunately I cannot do this, as this needs to occur after the ResolveNames pass has been executed (and fills in function symbols correctly) otherwise the Parser::Name's aren't complete and it produces semantic errors and the canonocalize-omp pass occurs prior to the ResolveNames pass. I could move the infrastructure into a separate pass within canoncalize-omp.cpp/h (or a different file entirely) and then create a separate access/invocation function for it that can be invoked after the ResolveNames pass has been executed? This would make the pass for canoncalization and implicit declare target gathering seperate passes invoked at different times but allow for names to be properly resolved before the pass has begun execution The alternative would be to teach ResolveNames to go back and update all of the declare target specifiers, but I don't know how simple this would be and it would be adding additional rather unnecessary work for the compiler to do and perhaps bloat the ResolveNames pass needlessly.

Please let me know which of these options seems most reasonable.

There is a resolve-directives.cpp file and it has a pass that resolves the names in OpenMP. But i guess it would be better to go for a separate pass in a separate file. Usually adding a pass is costly. But there are a few cases in my mind where it will be useful.

I've added a new file set called finalize-omp.h/cpp which now contains the pass, I'm happy to change the names if they seem incorrect to you.

Just a small ping for reviewer attention please, if at all possible, to push this patch a little further along the pipeline!

Can you write a few tests where I can see the modifications made by the finalize-omp pass? You can add the test in flang/test/Semantics/OpenMP.

agozillon updated this revision to Diff 513264.Apr 13 2023, 8:52 AM
  • [Flang][OpenMP][MLIR] Remove optimisation passes
  • [Flang][OpenMP] Add Parse Tree Rewrite for Declare Target
  • [Flang][Parser] Move to canonicalize-omp
  • [Flang][Semantics] Add finalize-omp files, move ImplicitDeclareTargetCapture pass here
  • [Flang][NFC] Try to remove changes to canonicalize-omp.cpp
  • [Flang][NFC] Add missing newline to end of file
  • [Flang][OpenMP] Add declare-target-implicit-capture-rewrite.f90 test
agozillon updated this revision to Diff 513267.Apr 13 2023, 8:56 AM
  • Fix alterations made from rebasing
agozillon added a comment.EditedApr 13 2023, 9:23 AM

Can you write a few tests where I can see the modifications made by the finalize-omp pass? You can add the test in flang/test/Semantics/OpenMP.

I've added the flang/test/Semantics/OpenMP/declare-target-implicit-capture-rewrite.f90 test in the most recent commit, it checks the -fdebug-dump-parse-tree meets the expected modifications. Hopefully it's what you were looking for! Still working my way around the appropriate testing infrastructure for each component I change so I'm sorry for missed tests.

Just a small ping to keep this patch on peoples radar and to try and get some reviewer attention (I am aware you're all quite busy, so thank you very much for any time you can spare) for more review feedback to push this patch further along the pipeline!

skatrak added inline comments.Apr 24 2023, 4:59 AM
flang/lib/Lower/OpenMP.cpp
2476

It might be preferable to use llvm::SmallVector<Fortran::semantics::Symbol, 0> here, according to this: https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h.

2550

Could this condition be simplified by something like this?

if (currentDeclTar != deviceType)
  mlir::omp::OpenMPDialect::setDeclareTarget(op, mlir::omp::DeclareTargetDeviceType::any);
2561

Nit: Maybe it's just a matter of personal preference, but I think a switch statement here would be slightly more readable. Feel free to ignore my comment if you don't agree.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
32

For consistency with the way that module-level OpenMP attributes are accessed, I think it would be a good idea to do the same for these new attributes attached to functions and globals. However, I'd be interested to hear what others think about this. One question that comes up is: would we want a DeclareTargetInterface (maybe with a more descriptive/generic name) for both cases, or an OffloadFunctionInterface plus an OffloadGlobalInterface (both with the same methods at first)?

agozillon updated this revision to Diff 516393.Apr 24 2023, 6:55 AM
agozillon marked 3 inline comments as done.
  • [Flang][OpenMP][MLIR] Remove optimisation passes
  • [Flang][OpenMP] Add Parse Tree Rewrite for Declare Target
  • [Flang][Parser] Move to canonicalize-omp
  • [Flang][Semantics] Add finalize-omp files, move ImplicitDeclareTargetCapture pass here
  • [Flang][NFC] Try to remove changes to canonicalize-omp.cpp
  • [Flang][NFC] Add missing newline to end of file
  • [Flang][OpenMP] Add declare-target-implicit-capture-rewrite.f90 test
  • Fix alterations made from rebasing
  • Tidy up handleDeclareTarget
agozillon updated this revision to Diff 516396.Apr 24 2023, 7:02 AM
  • readd space I removed on rebase.
agozillon updated this revision to Diff 516401.Apr 24 2023, 7:12 AM
  • Try to re-align cmakefile after rebase

Updated based on @skatrak's feedback! sorry for the spam, rebased and had some issue re-adding the exact formatting of the CMakefile

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
32

Yes, perhaps @kiranchandramohan has some preference, I would lean more towards DeclareTargetInterface, if we have no other Offload features currently for Global/Functions!

agozillon updated this revision to Diff 517927.Apr 28 2023, 8:04 AM
  • [Flang][OpenMP][MLIR] Update Declare Target for initial GlobalOps support
agozillon retitled this revision from [Flang][OpenMP][MLIR] Add lowering from parse tree to MLIR support for Declare Target for functions and subroutines to [Flang][OpenMP][MLIR] Add lowering from parse tree to MLIR support for Declare Target for functions, subroutines and global data.Apr 28 2023, 8:23 AM
agozillon edited the summary of this revision. (Show Details)
agozillon added a comment.EditedApr 28 2023, 8:32 AM

Recent update has extended support in this patch from functions and subroutines to global data in the form of fir.GlobalOps. There is likely some other types that need consideration and future extension of this work in the future as well, such as allocatable's which I've yet to look into (but plan too).

flang/test/Lower/OpenMP/omp-declare-target-data.f90
15

These tests will fail the build-bot here, because the dependent patches aren't accepted upstream just yet, but a Flang -> LLVM-IR test is the only way I can think of testing the device portion of this test, however, it can be in another seperate file. I also think the OpenMP support for offloading in Flang could benefit greatly from a test suite that goes directly from Fortran -> LLVM-IR, alongside the already existent tests that filter down in stages.

If this patch is accepted and able to be submitted before the dependent portions, I'll remove this failing segment (then wait on the phabricator buildbot to be happy) to upstream it.

agozillon updated this revision to Diff 519205.May 3 2023, 12:29 PM
agozillon edited the summary of this revision. (Show Details)
  • [Flang][OpenMP][MLIR] Update Declare Target for initial GlobalOps support
  • [Flang][OpenMP][MLIR] Add DeclareTargetInterface in place of Declare Target Dialect functions
agozillon updated this revision to Diff 519206.May 3 2023, 12:29 PM
  • [Flang][OpenMP][MLIR] Add DeclareTargetInterface in place of Declare Target Dialect functions

The recent update has moved all of the dialect functions relating to the declare target attribute into a new interface that is applied to LLVM::FuncOp/LLVM::GlobalOp/Fir::GlobalOp/Func::FuncOp. Essentially anything that currently gets marked declare target in the Flang pipeline, although as it's an interface it's easy to apply to other operations as necessary.

Can you add a test case with a recursive call?

agozillon updated this revision to Diff 519831.May 5 2023, 6:13 AM
  • [Flang][OpenMP][MLIR] Update Declare Target for initial GlobalOps support
  • [Flang][OpenMP][MLIR] Add DeclareTargetInterface in place of Declare Target Dialect functions
  • [Flang][OpenMP][Test] Remove the intentional fails that were for discussion purposes
  • [Flang][OpenMP][Test] Add recursive tests to make sure there's no infinity run from the compiler

Most recent commit adds recursive test cases at the end of omp-declare-target-func-and-subr.f90 and declare-target-implicit-capture-rewrite.f90, the pass handles these appropriately it appears, so no further changes were required. I have also removed the (intentionally) failing test checks to see how the buildbot handles the patch after a rebase.

agozillon updated this revision to Diff 519900.May 5 2023, 9:37 AM
  • [Flang][OpenMP][Test] Remove illegal link tests (no semantic check for it yet, but there will be eventually)
jsjodin added inline comments.May 8 2023, 10:39 AM
flang/lib/Lower/OpenMP.cpp
2476

It might be preferable to use llvm::SmallVector<Fortran::semantics::Symbol, 0> here, according to this: https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h.

Or to not specify any value at all:

In the absence of a well-motivated choice for the number of inlined elements N, it is recommended to use SmallVector<T> (that is, omitting the N). This will choose a default number of inlined elements reasonable for allocation on the stack (for example, trying to keep sizeof(SmallVector<T>) around 64 bytes).
flang/lib/Semantics/finalize-omp.cpp
29

I think it would be helpful to have a short description how this works.

flang/test/Lower/OpenMP/omp-declare-target-func-and-subr.f90
112

Nit: inconjunction -> in conjunction (multiple places)

agozillon updated this revision to Diff 520658.May 9 2023, 4:56 AM
  • [Flang][OpenMP] Made minor alterations based on reviewer comments
agozillon marked 2 inline comments as done.May 9 2023, 5:01 AM

Made an attempt to address @jsjodin's review comments in the last patch

flang/lib/Lower/OpenMP.cpp
2476

Unfortunately, eliding the size isn't possible with this SmallVector, perhaps because of the change to using a std::pair, it hits a static_assert! So providing 0 appears to be a good middle ground.

skatrak added inline comments.May 9 2023, 5:09 AM
flang/lib/Lower/OpenMP.cpp
2476

I think the problem might not even be the std::pair. I tried this with Symbol and it failed to compile as well, and thought it might be just that it's a large structure so it refuses to provide a default number of elements to place in the stack.

jsjodin accepted this revision.May 9 2023, 7:51 AM

LGTM

This revision is now accepted and ready to land.May 9 2023, 7:51 AM

I will have a look at this soon (in a day or two).
Are the declares handled, if inside a module?

I will have a look at this soon (in a day or two).

That's no problem I was going to leave it a few days and leave a ping here in-case you wished to have a look over it before I up-streamed it!

Are the declares handled, if inside a module?

The data tests are all within a module, and work, although they do not need the semantic pass. So I will add another small set of tests with functions and subroutines in a module just in-case as the more tests the better (so if you have any other ideas, please do mention them), however, in this case I am reasonably certain it will work without issue.

flang/lib/Lower/OpenMP.cpp
2476

That's probably the case then, it did seem size related and I'd assumed it was perhaps the coupling of the two in a pair!

This patch is rather large and touches both Flang and MLIR. Would you be OK to move the semantics part and its test to a separate patch?

This patch is rather large and touches both Flang and MLIR. Would you be OK to move the semantics part and its test to a separate patch?

I would be happy too, this patch has grown a fair chunk over time! The flang/test/Lower/OpenMP/omp-declare-target-func-and-subr.f90 also depends on the semantic pass as well (some use of implicit capturing, but testing the IR generates okay, rather than the parse tree), so I will move that into the semantic portion of the patch set too if that is okay.

This patch is rather large and touches both Flang and MLIR. Would you be OK to move the semantics part and its test to a separate patch?

I would be happy too, this patch has grown a fair chunk over time! The flang/test/Lower/OpenMP/omp-declare-target-func-and-subr.f90 also depends on the semantic pass as well (some use of implicit capturing, but testing the IR generates okay, rather than the parse tree), so I will move that into the semantic portion of the patch set too if that is okay.

If it is only the lowering test without any change in the lowering code then that is fine. Otherwise that can come in a later patch.

Also update the relevant section of the document flang/docs/OpenMP-semantics.md.

I have split this patch into 3 rather than 2, to make it more bite-sized.

The attribute and interface segment: https://reviews.llvm.org/D150328
The semantic pass segment: https://reviews.llvm.org/D150323
The PFT -> MLIR lowering segment: https://reviews.llvm.org/D150329

I will keep this patch open until the other three are closed. This is for record, reference and historical comment purposes (so.. mainly for me, but it might be of use to others as well). But future discussion can go to their representative segment.

agozillon abandoned this revision.Jun 13 2023, 9:40 AM

Going to abandon this patch, as 2/3 of the components are upstreamed now and the only patch open is the implicit marking via semantic analysis pass and it's diverged quite a bit from this patch.

flang/test/Fir/omp-declare-target-data.fir