This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][OpenMP] Initial Lowering of Declare Target for Data
ClosedPublic

Authored by agozillon on Apr 27 2023, 11:45 AM.

Details

Summary

This patch adds initial lowering for DeclareTargetAttr on
GlobalOp's utilising registerTargetGlobalVariable
and getAddrOfDeclareTargetVar from the
OMPIRBuilder.

DeclareTargetAttr encapsulates in MLIR a declare
target declaration applied to a piece of global data
or a function and indicates it should undergo
special handling.

The patch currently does not factor in filtering
based on device_type clauses (e.g. no emission of
globals for device if host specified), this will come in
a future iteration.

Diff Detail

Event Timeline

agozillon created this revision.Apr 27 2023, 11:45 AM
agozillon requested review of this revision.Apr 27 2023, 11:45 AM
agozillon added a comment.EditedApr 27 2023, 11:50 AM

This patch depends on https://reviews.llvm.org/D149162 and https://reviews.llvm.org/D148370 alongside https://reviews.llvm.org/D146063 which will undergo some further updates tomorrow to support this lowering patch (changes to the Attribute for retaining more clause information and increased PST -> MLIR lowering support).

  • The traditional (for me) missing end of file newline
agozillon added inline comments.Apr 27 2023, 12:03 PM
mlir/test/Target/LLVMIR/openmp-llvm.mlir
2500 ↗(On Diff #517666)

There's a slightly more frontend (Flang) oriented test that I'll add to https://reviews.llvm.org/D146063 which tests device lowering via Fortran -> LLVM-IR checking.

I am not too sure it's possible to test the device lowering from here as it requires a pre-made .bc file that matches the test to be specified as an attribute which isn't ideal, but I am also not too sure how to incorporate a filepath that can change based on location of the test into MLIR either e.g. filepath="someusersfolderpath/test.bc"

agozillon updated this revision to Diff 534575.Jun 26 2023, 8:39 AM
  • [Flang] Fix functions after rebase
  • Add load generation for device and ref swap for host, and add device test
Herald added a project: Restricted Project. · View Herald TranscriptJun 26 2023, 8:39 AM
agozillon updated this revision to Diff 534581.Jun 26 2023, 8:43 AM
  • Remove the rebase artifact and add a newline at end of test
agozillon added a comment.EditedJun 26 2023, 9:46 AM

If someone has time to review this patch it would be much appreciated, it's at a stage where the current changes can likely go in, if everyone is happy with them of course and it passes review. However, if we wish to wait a little longer until more kernel launch code is in that's also perfectly fine, but I'd still love to get this to a stage where we're happy with the way it works and it's ready to go in when the rest lands!

The most recent update completes the generation of the IR that allows declare target link and target op's map to be used in conjunction to pass data to and from the device (for simple cases at least, int's and arrays of ints, possibly other basic types, but I have yet to test with allocatables or complex types). It requires some additional glue code for target op that can be added in a seperate patch after https://reviews.llvm.org/D151035 has landed (the glue code is just generic combined info data generation that will need to be done for all map clauses). The update adds generation of additional Loads for device arguments, further maptype modification for map arguments based on if it's a declare target map item and swapping the base pointer to the appropriate reference pointer when the mapped variable is a declare target link variable. This is in addition to the generation of ref pointers and relevant metadata that was done by the patch prior to the recent update.

The current test depends on: https://reviews.llvm.org/D153781 as it's in a test that currently makes use of split-input-file (although, this is for neatness, not necessity, it can likely be altered to not use it or be in a seperate file)

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1465

This currently isn't tested in patch (I've tested it downstream though of course), it alters the host mapping, it requires the dependent patch here: https://reviews.llvm.org/D151035 and then possibly another small additional patch to link kernel argument generation up in OpenMPToLLVMIRtranslation.cpp

However, this patch can likely land without the dependents, this little segment just needs to have an additional test added later, which can be added alongside some proper execution tests when we have more of the kernel launch and argument generation in upstream.

I would appreciate very much if a reviewer could get a spare minute to give this patch an initial lookover (the buildbot break is related to https://reviews.llvm.org/D153781 and can be worked around by removing the split-input-file option from the mlir-translate tool), thank you very much for your time

Thanks. I have started going through the patch. I have one question about a deletion, see inline.

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1879–1887

Not clear whether I understand this. Haven't you used the global that was generated by MLIR llvm dialect translation for registering and getting the address of? So why does this need a deletion here?

mlir/test/Target/LLVMIR/omptarget-declare-target-llvm.mlir
5

Is this an LLVM IR attribute or an MLIR attribute that you are talking about here?

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1828

Nit: expand the type here.

1835–1836

Nit: expand type here.

1867

Nit: Could you tag the parameters with the name if they are constants or nullptr?

1878

Nit: Could you tag the parameters with the name if they are constants or nullptr?

Thanks. I have started going through the patch. I have one question about a deletion, see inline.

Thank you very much Kiran. I'll get to fixing the Nits hopefully tomorrow or at the beginning of next week. I've done my best to reply to your questions so far!

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1879–1887

For device this is not required and is effectively just dead code when this case is reached as the "real" global will not be used, but has unfortunately already been generated when we get to this stage. We could just not delete it and let some later llvm optimisation pass clean it up, if we wish to be very safe, or add it to a list for removal at the end of the module lowering.

On device for cases such as declare target link(some_int), some_int is effectively just a global null pointer that the OpenMP runtime will "magically" read/write the data into as necessary and the device can access it through, this global being deleted is seperate from it (this is just my understanding, someone with more OpenMP knowledge can possibly elaborate further on the details or disprove my understanding). The call to getAddrOfDeclareTargetVar will generate something like the following in this case:

@_some_int_decl_tgt_ref_ptr = weak global ptr nullptr

And then the device kernel will use the pointer to access the data, we simply need the globals name, to keep it synchronized across device and host and so we know it refers to the same data!

mlir/test/Target/LLVMIR/omptarget-declare-target-llvm.mlir
5

In this case it's an MLIR attribute, the host-ir-file-path MLIR Attr applied to the module, it's required by the device pass and passed from the host pass after it's generated the .bc file for the host segment of the end binary. It's required to populate the OpenMPIRBuilder with the necessary information for it to generate the OpenMP metadata for kernels and declare target and perhaps some other things so that the runtime can appropriately manage data and kernels (at least from my understanding so far). It's primarily required because we must generate synchronized metadata across both the host and device pass, Clang does the same thing at the moment a side affect of multiple passes I believe as SYCL does something similar.

In theory you could have a seperate device version of this test, that will compile this host test from an mlir file to .bc and then use the generated .bc file in the test, but I don't think it'd work as we have to then generate the MLIR attribute that holds the file path on the fly, unless it's possible to always guarantee where the file will be placed (can't really predict a users directory structure for example, although, perhaps you can just specify it relative from the test file though) as then you can just hard-code the attribute with the fixed filepath.

agozillon added inline comments.Jul 6 2023, 9:34 AM
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1879–1887

Slight amend: Some other information than the name is needed from the global as well, such as llvm type etc. to generate the ref ptr for device I believe, so it is required up until this point I think as we don't have the same access to this information as Clang. But perhaps it's entirely retrievable from the MLIR type and value.

agozillon marked 4 inline comments as done.
  • Apply the nit comments from reviewer

I tried to address your Nit's in the last commit!

agozillon updated this revision to Diff 556737.Sep 13 2023, 5:09 PM

Rebase, move flag changing from PFT lowering, to MLIR -> LLVM-IR lowering, teach early outliner pass to handle declare target globals, add a new runtime test now that enough is in place for one

Herald added a project: Restricted Project. · View Herald Transcript

A small ping for a review of this patch, breathing some new life into this as there's now enough upstream for the patch to fully function and have a runtime test. Some minor changes to the patch mentioned briefly in the previous message, so hopefully we can get this to a state we're happy with before the end of the month and the full swap to Github!

flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp
100 ↗(On Diff #556737)

These changes are a slightly regressed and simplified version of those in: https://reviews.llvm.org/D158735 depending on which lands first a rebase will be required, but it shouldn't be a problem.

I wish/wished to keep this patch self reliant and out of the more complex stack of patches as there's enough upstream to support this and it's a very different piece of functionality and landing all 5 in tandem could become rather complex if there's buildbot failures!

A small ping on this patch for some attention please if possible, it'd be nice to push it closer to acceptance if at all feasible!

@skatrak could you help review this?

Thank you Andrew for this work! I've got a few comments, but most of them are just style-related or pointing out typos.

flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp
32 ↗(On Diff #556737)

Maybe rename this to isAddressOfGlobalDeclareTarget or similar, pass in the mlir::Value and check for the defining op inside this function. It should reduce some code duplication among the two calls to this function and also make its purpose clearer.

33 ↗(On Diff #556737)

Nit: Perhaps for consistency I would swap out fir::AddrOfOp and fir::GlobalOp in the declared variable names here for auto, and replace auto for mlir::Value in the two new loops below over inputs and targetOp.getMapOperands(), so that you only use auto where the type is already spelled out in the initialization.

94 ↗(On Diff #556737)
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1412

Nit: It should be fine to pass by value here.

1417
1418

The two calls to getDefiningOp I think can be simplified like this.

1421

This should be inside of an 'if' statement, since the returned value is accessed right after with no presence checks.

1431

I think this check is redundant. By comparing the result of declareTargetGlobal.getDeclareTargetCaptureClause() against some non-null values we're already making sure that the condition won't be satisfied if there is no omp.declare_target attribute.

1437

Can you refactor this suffix generation code into a function? I think it would help with readability here.

1475

Nit: It might be more readable to structure it as:

if (refPtr)
  mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_PTR_AND_OBJ;
else if (isTargetParams)
  mapFlag |= llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TARGET_PARAM;

But feel free to ignore if you disagree.

1692

I think llvm::ArrayRef<mlir::Value> is better suited here.

1695
1699
1701
1706
1734

According to the programmer's manual here, it is preferred to avoid hardcoding the "small" size of small vectors when used as output parameters.

1736

I think it's more readable to negate the condition rather than adding a continue statement here, but feel free to ignore this comment if you disagree.

1779
1786–1895

Is there any reason to make this function visible outside the translation unit?

1809
1813

This seems like it would crash if targetTripleAttr was not present or a StringAttr, since the output of dyn_cast_or_null is dereferenced without checking first.

1817

Can you provide a default to fail gracefully if loc turns out to be null? Or throw out an error earlier when it's initialized, if that's not possible.

1823

Please add comments with the parameter names for the ones set to false or nullptr here and in the call below to getAddrOfDeclareTargetVar().

agozillon updated this revision to Diff 557086.Sep 19 2023, 5:04 PM
agozillon marked 23 inline comments as done.
  • Update based on reviewer comments and rebasing
agozillon added a comment.EditedSep 19 2023, 5:05 PM

Made an attempt at addressing all review points made by @skatrak in the prior patch as well as rebase on most recent upstream changes. Let me know what you think when you have a spare moment @skatrak! Thank you very much for your review!

flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp
33 ↗(On Diff #556737)

rebased on the recent map changes I upstreamed, so the changeset here isn't required anymore! but I've taken into account these review points and done my best to apply them none the less, as they're good suggestions for tidying it up a bit more.

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1786–1895

minor rebase issue on my end from when you added the function upstream before me I think! no reason for it not to be static thank you for the catch.

1817

The call to getTargetEntryUniqueInfo that uses it emits an error when it can't generate a unique id from the file data given, so I've attempted to emit some empty information in this case so that it fails gracefully (and silently) here and the error from the OMPIRBuilder function is emitted.

agozillon updated this revision to Diff 557087.Sep 19 2023, 8:09 PM
  • Update test to use only explicit variables (implicits don't work quite yet)
agozillon added inline comments.Sep 19 2023, 8:20 PM
openmp/libomptarget/test/offloading/fortran/declare-target-array-in-target-region.f90
19 ↗(On Diff #557087)

I've changed this test to only use explicitly mapped variables, as implicitly mapped variables don't work quite yet, a little awkward looking but it works similarly to the previous iteration. I believe check-openmp/runtimes may have been picking up my flang build with downstream changes previously and giving a little bit of a false pass for the previous iteration, after rebasing it's picked up the right one though.

The implicit map handling should come soon, how we handle it depends on how the IsolatedFromAbove changes that @TIFitis is working on goes, as I think his current work will supersede the small pass I worked on for it and retire the EarlyOutlinigPass.

Thanks Andrew for the changes. Just a couple of nits, but overall LGTM.

flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp
44 ↗(On Diff #557087)

Just like in the other spot I suggested the same change, this line and the next can be simplified like this.

180 ↗(On Diff #557087)
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1432

I think we can remove const here: https://mlir.llvm.org/docs/Rationale/UsageOfConst/

1472

If we can assert at this point that mapTypes can only contain IntegerAttr at this point, it's better to do mapTypes[index].cast<IntegerAttr>. Otherwise, we should check the result of the dyn_cast before calling getUInt on it to avoid a crash.

1831

Seems like clang-format missed breaking this line

agozillon updated this revision to Diff 557113.Sep 20 2023, 7:09 AM
agozillon marked 5 inline comments as done.
  • Fix reviewer comments

I think I've covered all the review points you made in the last update, please let me know if I'm missing anything though!

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
1472

It can only contain an IntegerAttr, a UInt to be precise in this case, otherwise there'd be complaints somewhere else in the compiler about mismatched types! So in this case I've swapped it to a cast.

skatrak accepted this revision.Sep 20 2023, 7:28 AM

Thank you, LGTM!

This revision is now accepted and ready to land.Sep 20 2023, 7:28 AM

Thank you for the review and help @skatrak!

@kiranchandramohan would you like some time to review and approve the change set?

Thank you for the review and help @skatrak!

@kiranchandramohan would you like some time to review and approve the change set?

@skatrak has given a great review. Thanks @skatrak.

Please go ahead.

openmp/libomptarget/test/offloading/fortran/declare-target-array-in-target-region.f90
1 ↗(On Diff #556737)

Nice test.

@kiranchandramohan thank you! I'll land the patch in the next couple of hours then provided no one has any further feedback or input during that time period.

openmp/libomptarget/test/offloading/fortran/declare-target-array-in-target-region.f90
1 ↗(On Diff #556737)

Thank you, it's a team effort, and it will hopefully be less convoluted to make tests in the near future after Akash's work.

Fortran has a lot of hidden generation of variables/constants that are generated outside of the target region but used inside of it, at least for the moment, that creates some hidden implicit map dependencies unfortunately e.g. a do loop used inside of a target region will create some implicit dependencies on the index variable, and loop constants defining its range (the former you can explicitly map, the latter not so much, although perhaps there's a Fortran syntax for a do loop that can allow for it that I don't know!). It'd be reasonably easy to teach the lowering to handle it better, but it'd be an interim work around, so I think it's better to wait for @TIFitis IsolatedFromAbove work so it can be handled correctly and concisely.

Thank you, I'm looking into it. Any insights into it are appreciated :-)

vzakhari added inline comments.
flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp
137 ↗(On Diff #557133)

FWIW, doesn't this invalidate the underlying iterator?

agozillon added inline comments.Sep 20 2023, 4:44 PM
flang/lib/Optimizer/Transforms/OMPEarlyOutlining.cpp
137 ↗(On Diff #557133)

it might actually, thank you for the catch.

found the issue now working on a fix for it.

should be fixed with this commit: https://github.com/llvm/llvm-project/commit/171d8c40288a995f7c34b0479f31241132af67a5 which tries to address @vzakhari's comment and the memory leak I inadvertently added (my apologies, and sorry for the annoying unrelated to your commits buildbot messages for the past few hours)