Page MenuHomePhabricator

[Flang][OpenMP][MLIR] Add lowering from parse tree to MLIR support for Declare Target for functions and subroutines
Needs ReviewPublic

Authored by agozillon on Tue, Mar 14, 9:30 AM.

Details

Summary

This adds lowering support to Flang for OpenMP Declare Target
tagged functions and subroutines, it currently supports only the
to and device_type clauses. Although, it is very likely the current
infrastructure is easily extendable to link (findFuncAndVarSyms
should work with ease), this has yet to be tested with global
data, but this will be a future extension of this work.

This patch adds an attribute which contains data
provided to the device_type clause (any | host | nohost) and
marks MLIR FuncOps with this attribute where neccesary.

An additional pass is added to Flang which marks functions used
within declare target functions as declare target with their
callers device_type (or a combination if they are marked
multiple times). This is a form of implicit capture, this appears
to occur within Clang (albeit in a different way) and from my
current reading and understanding of the specification is described
there as well. However, perhaps this is not the right way to go
about this. Unfortunately the Fir.CallOps within
FIR make the current form of the pass a
little difficult to generalise to an OpenMP dialect pass
within the MLIR infrastructure as it would add Flang
dependencies to the MLIR project.

This also adds omp-declare-target.f90 and attr.mlir as tests to test
the attribute and the current lowering support.

The idea is that this attribute can be picked up by amendOperation or
similar infrastructure later during lowering of the LLVM Dialect +
OpenMP dialect to LLVM-IR and have specialised lowering based
on the contained device_type clause.

Naive filtering/removal of declare target functions was attempted
within the initial lowering inside of the new handleDeclareTarget
function based on the omp.is_device attribute tied to the
module. However, this cannot be done at this stage as the
function itself may contain a target region that has yet to be
lowered. However, I believe this filtering could be done after
the initial module lowering with a supplementary pass similar
to the one added within this patch. But I have yet to investigate
this direction.

Diff Detail

Event Timeline

agozillon created this revision.Tue, Mar 14, 9:30 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
agozillon requested review of this revision.Tue, Mar 14, 9:30 AM
kiranchandramohan added a comment.EditedMon, Mar 20, 6:54 AM

The problem with doing this in two stages : i.e during lowering and a pass are that there are two representations for the same thing in the IR. One where the nested functions are not marked, and the other where they are marked.

Would it be better to do this in

  1. Semantics + lowering?

or

  1. PFT + lowering?

So that there is only one representation in the IR.

I believe I understand your concerns.

I think I've tried to do number 2 already, if I understand correctly and the PFT + lowering phase comprises the handleDeclareTarget function inside of OpenMP.cpp (and the other lowering steps to the various MLIR dialects from the PFT). I initially tried to do it while processing the initial declare target lowering inside of handleDeclareTarget, however, you run into the issue that the function the declare target is nested inside hasn't been fully lowered yet, so you cannot tell what other functions are invoked by the declare target function from the currently existing MLIR as it's yet to be generated! The only way I can think of circumventing this is by doing some sort of delayed evaluation that occurs once the function is completed, but I'm unsure how ideal that is.

After running into this issue with the original method I also thought about doing it in the semantic check + PFT generation phase as I believe you suggest as option number 1 (but do please correct me if my understanding is wrong), by pushing the functions (to be marked implicitly declare target) invoked within the declare target function onto the originally generated OmpDeclareTargetSpecifier's To or Link list, and then the handleDeclareTarget function would just treat them as normally. I decided to forgo this and go with the MLIR pass as it seemed like the simpler option that was most likely to work and I wasn't sure if there was a precedent for doing this sort of thing at the semantic level.

I am happy to look into either of these if you think they're better than the current solution! Is there one approach you prefer over another?

As a related aside I was considering doing another pass similarly to this, which removes unrequired declare target functions from the module (e.g. remove device only declare target functions if it's a host module), with the intent of removing unnecessary functions as early as possible (as requested in a prior discussion I believe). The main reason for doing this as an end module pass as opposed to removing them directly in the handleDeclareTarget function is that it is going to run into the same problem, as I believe you need to know if other OpenMP offload code exists in the function before you discard it, e.g. if a declare target host function has a device target region within it, it can't be discarded inside of an OpenMP device module early as we'd lose the target region. Perhaps there's some OpenMP specification rules I'm unaware of that prevent a target region inside of a declare target function, although I fear a similar issue may occur with data mapping operations as well in any case. Would a pass like this be more reasonable than the pass in this patch?