This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][MLIR] Refactor and extend current map support by adding MapInfoOp and DataBoundsOp operations to the OpenMP Dialect
ClosedPublic

Authored by agozillon on Aug 24 2023, 6:43 AM.

Details

Summary

This patch adds two new operations:

The first is the DataBoundsOp, which is based on OpenACC's DataBoundsOp,
which holds stride, index, extent, lower bound and upper bounds
which will be used in future follow up patches to perform initial
array sectioning of mapped arrays, and Fortran pointer and
allocatable mapping. Similarly to OpenACC, this new OpenMP
DataBoundsOp also comes with a new OpenMP type, which
helps to restrict operations to accepting only
DataBoundsOp as an input or output where necessary
(or other related operations that implement this type as
a return).

The patch also adds the MapInfoOp which rolls up some of
the old map information stored in target
operations into this new operation, and adds new
information that will be utilised in the lowering of mapped
variables, e.g. the aforementioned DataBoundsOp, but also a
new ByCapture OpenMP MLIR attribute, and isImplicit boolean
attribute. Both the ByCapture and isImplicit arguments will
affect the lowering from the OpenMP dialect to LLVM-IR in
minor but important ways, such as shifting the final maptype
or generating different load/store combinations to maintain
semantics with the OpenMP standard and alignment with the
current Clang OpenMP output as best as possible.

This MapInfoOp operation is slightly based on OpenACC's
DataEntryOp, the main difference other than some slightly
different fields (e,g, isImplicit/MapType/ByCapture) is that
OpenACC's data operations "inherit" (the MLIR ODS
equivalent) from this operation, whereas in OpenMP operations
that utilise MapInfoOp's are composed of/contain them.

A series of these MapInfoOp (one per map clause list item) is
now held by target operations that represent OpenMP
directives that utilise map clauses, e.g. TargetOp. MapInfoOp's
do not have their own specialised lowering to LLVM-IR, instead
the lowering is dependent on the particular container of the
MapInfoOp's, e.g. TargetOp has its own lowering to LLVM-IR
which utilised the information stored inside of MapInfoOp's to
affect it's lowering and the end result of the LLVM-IR generated,
which in turn can differ for host and device.

This patch contains these operations, minor changes to the
printing and parsing to support them, changes to tests (only
those relevant to this segment of the patch, other test
additions and changes are in other dependent
patches in this series) and some alterations to the OpenMPToLLVM
rewriter to support the new OpenMP type and operations.

This patch is one in a series that are dependent on each
other:

https://reviews.llvm.org/D158734
https://reviews.llvm.org/D158735
https://reviews.llvm.org/D158737

Diff Detail

Event Timeline

agozillon created this revision.Aug 24 2023, 6:43 AM
Herald added a project: Restricted Project. · View Herald Transcript
agozillon requested review of this revision.Aug 24 2023, 6:43 AM

Trying to take a page from @skatrak's playbook and build a patch stack (a bit late to be learning this considering we're swapping to LLVM PR's soon I believe, but it felt useful with these patches). So this is patch 1/4 in a series that will depend on each other, with the intent that the 4th will pass CI via phabricators "Depends on" support, if I don't manage to make it work, it is still possible to review each patch separately and I can open up a larger patch that contains all of the segments to test that it passes CI before final commit once we are happy with each individual component. As unfortunately this patch series is very interlinked and rather large.

Patch 1/4 of the map changes from the patch series made by: https://reviews.llvm.org/D158732, https://reviews.llvm.org/D158735, https://reviews.llvm.org/D158737 and https://reviews.llvm.org/D158734 that aims to expand the current map support of the OpenMP dialect and lower from Fortran -> MLIR -> LLVM IR, future support on expanding the lowering from the OpenMP dialect to LLVM IR for map on TargetOp (omp target) will be forthcoming for declare target and explicit map variables hopefully in the near future (implicit likely a little while after that).

This patch should not pass CI on its own, patch 4: https://reviews.llvm.org/D158737, should pass provided the dependencies are setup correctly.

A small ping for some attention on this patch if a reviewer can spare some time, thank you very much!

Please write an RFC in the MLIR discourse channel for this update.

agozillon added a comment.EditedAug 31 2023, 6:42 AM

Please write an RFC in the MLIR discourse channel for this update.

No problem @kiranchandramohan, I have created one here:

https://discourse.llvm.org/t/rfc-updating-extending-the-openmp-dialects-map-to-support-array-boundaries-capture-type-and-a-refactoring/73149

Please do add to it if you wish to add further discussion points.

clementval added inline comments.
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1038

Would it make sense to have a common interface (BoundLikeOp) so lowering can be shared more easily.

razvanlupusoru added inline comments.
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1098

Are the variable operand methods used anywhere?

mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir
196–204

It would improve readability if map_type_value was also printed as a string here.

199

I cannot tell if both the target operation and the map_entry operation hold the same information about the mapping flags. If yes, it might be a good idea to relegate it to a single spot - namely the map_entry operation.

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1038

I personally agree with Valentin's comment. I am in the process of creating a new folder for sharing interfaces between OpenMP and OpenACC. And the interface BoundLikeOp likely will find a good home there. By sharing the interfaces we are not coupling the two dialects and we can still have separate implementation with different semantics.

I also don't think this suggestion needs anything done in this patch - I am OK with omp dialect introducing its own DataBoundsOp at this time and attaching the interface later.

Thank you both very much for your feedback, I appreciate it a lot, I believe I've answered all of the comments inline!

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1038

I'd certainly be happy to try and share as much as possible across the two implementations.

Although, as @razvanlupusoru mentions I would prefer to do this in a future patch if at all possible after the initial patch to land the existing shared folder lands (or if someone else would like to take up the task of creating a shared interface that'd be perfectly fine as well)!

1098

Yes, inside of OpenMPToLLVM.cpp's lowering process, the call RegionLessOpWithVarOperandsConversion uses these to generically iterate over any operation that implements these functions operands. It's likely possible to create a specialised lowering, but the lowering at the moment is just the default, i.e. it's effectively not lowered to LLVM just updated with alterations to the operands from my current understanding at least.

mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir
196–204

I am a little hesitant to do so. Primarily because; as you've noticed in the comment below, the information is the same as the one displayed in the primary holding operation, however, the stored information is actually a lot more than just from/to and friends, the textual pretty printing only shows a small subset of the underlying information stored in the number. It's a bit-shifted integer composed of these flags: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Frontend/OpenMP/OMPConstants.h#L193C38-L193C38 and the number presented here is the number that'll inevitably be in the LLVM-IR, so it makes it simple at a glance to compare the MLIR to the LLVM-IR. But perhaps that's just me and I've spent too long debugging map entries in LLVM-IR!

But yes, if we wanted to pretty print it fully and be able to retain proper MLIR parsing for these map entries I think we'd need to fully represent every flag in that enumerator which the current map implementation does not do. For example it's possible for something to be OMP_MAP_TARGET_PARAM | OMP_MAP_LITERAL | OMP_MAP_IMPLICIT, which would read something like (param, literal, implicit -> map_var) and there's a fair number of other combinations from what I can gather from Clang, which we only scratch the surface on at the moment in Fortran. What the current pretty printing seems to do is filter out the implementation specific details and present the reader with the specification defined wording for things.

So in this case I'd prefer to keep it as is, but if we do wish to change it I am also happy to do so.

199

They do currently, albeit this one filters out a lot of the implementation details stored in the information to present the specification defined wording for readers to understand a little easier. The other is just a bit-shifted series of flags, some that I believe are not specification defined and are implementation details. I gave a little more background in the prior comment.

However, aside from that it's tied to the operations printing as I tried to maintain the status quo of the original map pretty printing which uses the current operations type to pretty print some of it's values e.g. if it's a target_enter_data vs a target_exit_data we'll print the same information differently, e.g. release vs alloc. We can get around this by not making this operation specific distinction anymore in the pretty printing though and have some kind of shared printing e.g. release_or_alloc, but I'm not sure if that'd be confusing?

razvanlupusoru accepted this revision.Sep 6 2023, 12:54 PM
razvanlupusoru added inline comments.
mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir
196–204

OK - I understand! I think that ideally only the map entry operations hold the semantics of what needs done in the map operation. So that means eventually one would want this operation to also print with words the bit fields. I think it is OK to have both the numerical value and the words. Would you mind doing this in follow-up work?

199

From a design perspective, it would be better if the mapping operation was in a single spot - otherwise we run into possibility of mismatch.

This revision is now accepted and ready to land.Sep 6 2023, 12:54 PM
clementval added inline comments.Sep 6 2023, 1:00 PM
mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir
199

As @razvanlupusoru pointed out, this design feels weird. The information is duplicated now and it doesn't feel right.

agozillon added inline comments.Sep 8 2023, 10:14 AM
mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir
199

I'm happy to do it in this patch, I'll move the pretty printing to be only part of the map_entry, remove the numeric value and only have a string representation for it, this will involve introducing more wording to represent the entirety of the bitfield and we'll have to combine the release and alloc keywords into a single keyword, perhaps exit_release_or_enter_alloc?

I like the number but that's more from an it's easier to debug things perspective, but I understand the opportunity for mismatches in the written IR make it a little problematic, so I'm fine with a single string representation of it if we're happy with that!

This change will likely result in some other alterations across the patch series as a lot of the tests will need some alteration (and perhaps a subsequent accept), so I apologies for the noise ahead of time.

agozillon updated this revision to Diff 556314.Sep 8 2023, 3:27 PM

Make there be only one source of truth for map type on the map_entry operations and stringify all elements of the bitfield, adjust tests accordingly (tests will also be adjusted on other patches in the series)

razvanlupusoru accepted this revision.Sep 11 2023, 8:11 AM

Thank you!

@razvanlupusoru @clementval thank you both very much for your reviews!

I will leave the patch series open until Thursday to give time for further comments/feedback from other reviewers (in particular @kiranchandramohan and @jsjodin) at that point if no further comments/change requests are received I will push it upstream!

Thanks @agozillon for this update.

The patch also adds a new MapEntryOp operation which rolls up some of the old map information stored in target operations into this new operation, and adds new information that will be utilised in the lowering of mapped variables, e.g. the aforementioned BoundsOp, but also a new ByCapture OpenMP MLIR attribute

Is this Operation only present for collecting information? Or does it also manage the mapping of the data? I believe OpenACC has separate operations for creating copies both into and out of an OpenACC region (Small snippet from the proposal below). If it is only for collecting the information, would it be better to call it MapInfo instead of MapEntry?

Data clause operations which have semantics at both entry and exit of region will be decomposed:
acc copy => acc.copyin (before region) + acc.copyout (after region)

An alternative proposal was to continue to take in a reference to the FIR descriptor, and collect the relevant information from the descriptor while lowering from FIR to LLVM. Did you try this?
https://github.com/flang-compiler/f18-llvm-project/pull/915

I will leave the patch series open until Thursday to give time for further comments/feedback from other reviewers (in particular @kiranchandramohan and @jsjodin) at that point if no further comments/change requests are received I will push it upstream!

We will need a review from @TIFitis (who did the initial work on the map clause).

mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1038

It is probably better to define this in terms of the terminology used in the OpenMP standard. I guess this will be Array Sections in OpenMP.
https://www.openmp.org/spec-html/5.0/openmpsu21.html

1121

The description does not say what this represents or how this will be used.
Can you add a summary as well?

mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
1555

Nit: newline

agozillon added a comment.EditedSep 12 2023, 8:42 AM

Thanks @agozillon for this update.

No problem! Thank you for the review.

The patch also adds a new MapEntryOp operation which rolls up some of the old map information stored in target operations into this new operation, and adds new information that will be utilised in the lowering of mapped variables, e.g. the aforementioned BoundsOp, but also a new ByCapture OpenMP MLIR attribute

Is this Operation only present for collecting information? Or does it also manage the mapping of the data? I believe OpenACC has separate operations for creating copies both into and out of an OpenACC region (Small snippet from the proposal below). If it is only for collecting the information, would it be better to call it MapInfo instead of MapEntry?

Data clause operations which have semantics at both entry and exit of region will be decomposed:
acc copy => acc.copyin (before region) + acc.copyout (after region)

It would probably be better to call it MapInfo in that case, as the Target related operations are in charge of the mapping via processMapInfo/genMapInfo and utilise these for information rather than a direct lowering.

An alternative proposal was to continue to take in a reference to the FIR descriptor, and collect the relevant information from the descriptor while lowering from FIR to LLVM. Did you try this?
https://github.com/flang-compiler/f18-llvm-project/pull/915

I've not, at least not for array sectioning, the initial bounds work by OpenACC seemed more apt/appealing, especially to maintain some similarities across the implementations.

However, I am currently looking into a reasonable way to access the descriptors for lowering pointers and pointer contained in a derived type in the OMPIRBuilder, so perhaps it'd be useful for that, so thank you very much!

For pointers it appears you can still use the bounds information as it's generated (and it just accesses the descriptor), but for pointers contained within a derived type you cannot; at least out-of-the-box it seems, but perhaps I just need to teach the initial PFT lowering of the map to handle derived-types a bit differently (although I'm not sure the original intention for bounds was to extend further than array sectioning), but that means implicit captures of derived types won't be able to map contained pointers implicitly I think (at least trivially). So I am still leaning towards indicating to the OMPIRBuilder in someway that it's lowering a Fortran pointer with a descriptor so it can directly access related information for offloading, but I am not sure of the best way to do that unfortunately, I have a few ideas that I am thinking about, but I am not sure what the best or most reasonable way to do it would be. So if anyone has any feedback or thoughts on this I'd love to hear about it.

I will leave the patch series open until Thursday to give time for further comments/feedback from other reviewers (in particular @kiranchandramohan and @jsjodin) at that point if no further comments/change requests are received I will push it upstream!

We will need a review from @TIFitis (who did the initial work on the map clause).

Sounds good, more than happy to wait for @TIFitis to review the series! I'll wait for @TIFitis's feedback as well before I address the renaming of MapEntry to MapInfo and the addition of improved/more OpenMP specific documentation!

I've not, at least not for array sectioning, the initial bounds work by OpenACC seemed more apt/appealing, especially to maintain some similarities across the implementations.

I have given a lot of thought about Fortran array slices in the context of OpenACC. :) Which is why in the end I ended up deciding to use a data bounds operation. I believe the same will apply to OpenMP.

I ended up explaining a few of the points around the need of the bounds operation here (in the 3rd part of my long response):
https://discourse.llvm.org/t/rfc-openacc-dialect-data-operation-improvements/69825/6?u=razvan.lupusoru
Here are the relevant pieces of the points I made there:

  • OpenACC slice semantics are not the same as slice semantics on the source language.
  • In both cases, the varPtr is the loaded address from the same descriptor.
  • The FIR inside the region is unaffected by the slicing operation. The original descriptor is still used.

So why is this relevant? Because the question was asked why not obtain the information from the FIR descriptor? Well, array slicing operations in directive models is not the same as slicing operation in the source language. So we cannot always use descriptor. Thus the bounds operation adds a uniform way to create data operations.

So I am still leaning towards indicating to the OMPIRBuilder in someway that it's lowering a Fortran pointer with a descriptor so it can directly access related information for offloading,

I am not convinced this additional complexity is needed. We can decompose the descriptor data action and the data action into two mapping operations. And the data action can indeed use bounds operation that loads from descriptor.

For pointers it appears you can still use the bounds information as it's generated (and it just accesses the descriptor), but for pointers contained within a derived type you cannot;

To me this seems like an implementation limitation, not a design one. Is this the case you had in mind?

program test_derived_ptr
  type container
    integer :: spacer
    integer, pointer, dimension(:) :: ptrvar
  end type
  
  type(container) :: base
  integer, pointer, dimension(:) :: secondptrvar
  integer, dimension(10), target :: array
  
  base%ptrvar => array
  secondptrvar => array
  
  !$acc data copy(base%ptrvar,secondptrvar)
  !$acc end data
end program

In this case, I indeed see that one of the operations has bounds and one does not:

%26 = acc.copyin varPtr(%25 : !fir.ptr<!fir.array<?xi32>>)   -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "base%ptrvar"}
...
%32 = acc.bounds   lowerbound(%c0_3 : index) upperbound(%31 : index) stride(%30#2 : index) startIdx(%29#0 : index) {strideInBytes = true}
%33 = fir.box_addr %27 : (!fir.box<!fir.ptr<!fir.array<?xi32>>>) -> !fir.ptr<!fir.array<?xi32>>
%34 = acc.copyin varPtr(%33 : !fir.ptr<!fir.array<?xi32>>)   bounds(%32) -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "secondptrvar"}

I don't see though why we couldn't generate a bounds operation for the derived type case. @clementval

I've not, at least not for array sectioning, the initial bounds work by OpenACC seemed more apt/appealing, especially to maintain some similarities across the implementations.

I have given a lot of thought about Fortran array slices in the context of OpenACC. :) Which is why in the end I ended up deciding to use a data bounds operation. I believe the same will apply to OpenMP.

I ended up explaining a few of the points around the need of the bounds operation here (in the 3rd part of my long response):
https://discourse.llvm.org/t/rfc-openacc-dialect-data-operation-improvements/69825/6?u=razvan.lupusoru
Here are the relevant pieces of the points I made there:

  • OpenACC slice semantics are not the same as slice semantics on the source language.
  • In both cases, the varPtr is the loaded address from the same descriptor.
  • The FIR inside the region is unaffected by the slicing operation. The original descriptor is still used.

So why is this relevant? Because the question was asked why not obtain the information from the FIR descriptor? Well, array slicing operations in directive models is not the same as slicing operation in the source language. So we cannot always use descriptor. Thus the bounds operation adds a uniform way to create data operations.

So I am still leaning towards indicating to the OMPIRBuilder in someway that it's lowering a Fortran pointer with a descriptor so it can directly access related information for offloading,

I am not convinced this additional complexity is needed. We can decompose the descriptor data action and the data action into two mapping operations. And the data action can indeed use bounds operation that loads from descriptor.

For pointers it appears you can still use the bounds information as it's generated (and it just accesses the descriptor), but for pointers contained within a derived type you cannot;

To me this seems like an implementation limitation, not a design one. Is this the case you had in mind?

program test_derived_ptr
  type container
    integer :: spacer
    integer, pointer, dimension(:) :: ptrvar
  end type
  
  type(container) :: base
  integer, pointer, dimension(:) :: secondptrvar
  integer, dimension(10), target :: array
  
  base%ptrvar => array
  secondptrvar => array
  
  !$acc data copy(base%ptrvar,secondptrvar)
  !$acc end data
end program

In this case, I indeed see that one of the operations has bounds and one does not:

%26 = acc.copyin varPtr(%25 : !fir.ptr<!fir.array<?xi32>>)   -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "base%ptrvar"}
...
%32 = acc.bounds   lowerbound(%c0_3 : index) upperbound(%31 : index) stride(%30#2 : index) startIdx(%29#0 : index) {strideInBytes = true}
%33 = fir.box_addr %27 : (!fir.box<!fir.ptr<!fir.array<?xi32>>>) -> !fir.ptr<!fir.array<?xi32>>
%34 = acc.copyin varPtr(%33 : !fir.ptr<!fir.array<?xi32>>)   bounds(%32) -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "secondptrvar"}

I don't see though why we couldn't generate a bounds operation for the derived type case. @clementval

That's probably a missing case in the code generating the bound op. I'll look into this.

I've not, at least not for array sectioning, the initial bounds work by OpenACC seemed more apt/appealing, especially to maintain some similarities across the implementations.

I have given a lot of thought about Fortran array slices in the context of OpenACC. :) Which is why in the end I ended up deciding to use a data bounds operation. I believe the same will apply to OpenMP.

I ended up explaining a few of the points around the need of the bounds operation here (in the 3rd part of my long response):
https://discourse.llvm.org/t/rfc-openacc-dialect-data-operation-improvements/69825/6?u=razvan.lupusoru
Here are the relevant pieces of the points I made there:

  • OpenACC slice semantics are not the same as slice semantics on the source language.
  • In both cases, the varPtr is the loaded address from the same descriptor.
  • The FIR inside the region is unaffected by the slicing operation. The original descriptor is still used.

So why is this relevant? Because the question was asked why not obtain the information from the FIR descriptor? Well, array slicing operations in directive models is not the same as slicing operation in the source language. So we cannot always use descriptor. Thus the bounds operation adds a uniform way to create data operations.

So I am still leaning towards indicating to the OMPIRBuilder in someway that it's lowering a Fortran pointer with a descriptor so it can directly access related information for offloading,

I am not convinced this additional complexity is needed. We can decompose the descriptor data action and the data action into two mapping operations. And the data action can indeed use bounds operation that loads from descriptor.

For pointers it appears you can still use the bounds information as it's generated (and it just accesses the descriptor), but for pointers contained within a derived type you cannot;

To me this seems like an implementation limitation, not a design one. Is this the case you had in mind?

program test_derived_ptr
  type container
    integer :: spacer
    integer, pointer, dimension(:) :: ptrvar
  end type
  
  type(container) :: base
  integer, pointer, dimension(:) :: secondptrvar
  integer, dimension(10), target :: array
  
  base%ptrvar => array
  secondptrvar => array
  
  !$acc data copy(base%ptrvar,secondptrvar)
  !$acc end data
end program

In this case, I indeed see that one of the operations has bounds and one does not:

%26 = acc.copyin varPtr(%25 : !fir.ptr<!fir.array<?xi32>>)   -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "base%ptrvar"}
...
%32 = acc.bounds   lowerbound(%c0_3 : index) upperbound(%31 : index) stride(%30#2 : index) startIdx(%29#0 : index) {strideInBytes = true}
%33 = fir.box_addr %27 : (!fir.box<!fir.ptr<!fir.array<?xi32>>>) -> !fir.ptr<!fir.array<?xi32>>
%34 = acc.copyin varPtr(%33 : !fir.ptr<!fir.array<?xi32>>)   bounds(%32) -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "secondptrvar"}

I don't see though why we couldn't generate a bounds operation for the derived type case. @clementval

That's probably a missing case in the code generating the bound op. I'll look into this.

I think there actually is a semi-related TODO in there, as I was wondering if it works for pointers, how hard would it be to extend it for derived types, and I was just looking into seeing if I could extend it myself! :-) But as you're the expert in this area, I'll just leave looking into this particular mapping case for further down the line and focus on just individual pointer mapping for the moment until you've had time to look into it further (although, I'm happy to look into it instead, if desired). As I am more than happy to utilise the bounds once it supports it, if it was intended/is fine to support this use-case.

I've not, at least not for array sectioning, the initial bounds work by OpenACC seemed more apt/appealing, especially to maintain some similarities across the implementations.

I have given a lot of thought about Fortran array slices in the context of OpenACC. :) Which is why in the end I ended up deciding to use a data bounds operation. I believe the same will apply to OpenMP.

I ended up explaining a few of the points around the need of the bounds operation here (in the 3rd part of my long response):
https://discourse.llvm.org/t/rfc-openacc-dialect-data-operation-improvements/69825/6?u=razvan.lupusoru
Here are the relevant pieces of the points I made there:

  • OpenACC slice semantics are not the same as slice semantics on the source language.
  • In both cases, the varPtr is the loaded address from the same descriptor.
  • The FIR inside the region is unaffected by the slicing operation. The original descriptor is still used.

So why is this relevant? Because the question was asked why not obtain the information from the FIR descriptor? Well, array slicing operations in directive models is not the same as slicing operation in the source language. So we cannot always use descriptor. Thus the bounds operation adds a uniform way to create data operations.

Thank you very much for providing further context behind the original decision! I believe OpenMP has similar concerns as you mention.

So I am still leaning towards indicating to the OMPIRBuilder in someway that it's lowering a Fortran pointer with a descriptor so it can directly access related information for offloading,

I am not convinced this additional complexity is needed. We can decompose the descriptor data action and the data action into two mapping operations. And the data action can indeed use bounds operation that loads from descriptor.

You're very likely correct, it is additional unnecessary complexity if the bounds operation can support this case.

For pointers it appears you can still use the bounds information as it's generated (and it just accesses the descriptor), but for pointers contained within a derived type you cannot;

To me this seems like an implementation limitation, not a design one. Is this the case you had in mind?

program test_derived_ptr
  type container
    integer :: spacer
    integer, pointer, dimension(:) :: ptrvar
  end type
  
  type(container) :: base
  integer, pointer, dimension(:) :: secondptrvar
  integer, dimension(10), target :: array
  
  base%ptrvar => array
  secondptrvar => array
  
  !$acc data copy(base%ptrvar,secondptrvar)
  !$acc end data
end program

That is one of the cases I was thinking of, but I was also thinking of the case (which I now think after some further reading is illegal) where the clause specifies just the derived type so:

$acc data copy(base)

and then the underlying pointers in the derived type (ptrvar in this case) would be implicitly mapped, similarly to the member spacer. However, on further re-reading of the OpenMP specification (hence the long time for a reply, sorry for that) I don't think that's actually legal in OpenMP. I think you'd have to specify the derived type and the pointer member to map individually, so something like:

!$acc data copy(base, base%ptrvar)

Although, I think with Flang's descriptors you could technically implicitly map the pointers contained in the structure, but that's against the OpenMP specification it seems (but on the off-chance I misunderstood again, if someone knows it's legal please do mention it)! Perhaps because it'd result in more non-user specified data mapping than is necessary in a lot of cases, or because C/C++ lacks descriptors or other complex reasons I've not thought about :-)

In this case, I indeed see that one of the operations has bounds and one does not:

%26 = acc.copyin varPtr(%25 : !fir.ptr<!fir.array<?xi32>>)   -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "base%ptrvar"}
...
%32 = acc.bounds   lowerbound(%c0_3 : index) upperbound(%31 : index) stride(%30#2 : index) startIdx(%29#0 : index) {strideInBytes = true}
%33 = fir.box_addr %27 : (!fir.box<!fir.ptr<!fir.array<?xi32>>>) -> !fir.ptr<!fir.array<?xi32>>
%34 = acc.copyin varPtr(%33 : !fir.ptr<!fir.array<?xi32>>)   bounds(%32) -> !fir.ptr<!fir.array<?xi32>> {dataClause = #acc<data_clause acc_copy>, name = "secondptrvar"}

I don't see though why we couldn't generate a bounds operation for the derived type case. @clementval

That would be very useful, as I'd very much like to be able to use bounds to map pointer elements of derived types if at all possible! (when explicitly mapped :-) )

I've not, at least not for array sectioning, the initial bounds work by OpenACC seemed more apt/appealing, especially to maintain some similarities across the implementations.

I have given a lot of thought about Fortran array slices in the context of OpenACC. :) Which is why in the end I ended up deciding to use a data bounds operation. I believe the same will apply to OpenMP.

I ended up explaining a few of the points around the need of the bounds operation here (in the 3rd part of my long response):
https://discourse.llvm.org/t/rfc-openacc-dialect-data-operation-improvements/69825/6?u=razvan.lupusoru
Here are the relevant pieces of the points I made there:

  • OpenACC slice semantics are not the same as slice semantics on the source language.
  • In both cases, the varPtr is the loaded address from the same descriptor.
  • The FIR inside the region is unaffected by the slicing operation. The original descriptor is still used.

So why is this relevant? Because the question was asked why not obtain the information from the FIR descriptor? Well, array slicing operations in directive models is not the same as slicing operation in the source language. So we cannot always use descriptor. Thus the bounds operation adds a uniform way to create data operations.

Thanks for the explanation.
-> Is your point based on the assumption that we will use the Array Section Box in the region? Rather if we just reuse the FIR representation for Array Sections (i.e a new FIR Box B2 (for array(5:10)) created from the box for the whole input array B1 (for array) for only the map clause (or MLIR operand), I am assuming we can extract the relevant info (base_ptr, lower-bound, etc) from it.
-> Is capturing the base_ptr required, if we are using the same descriptor for the Array?
Note: I am not asking for a change but to understand the issues in the other representation.

agozillon updated this revision to Diff 556777.Sep 14 2023, 5:44 AM
agozillon marked 3 inline comments as done.

Update patch to address reviewer comments, more doucmentation for ops,
changed name of MapEntry->MapInfo, regress the clause parsing/printing to (almost) original state and fixed tests

agozillon added inline comments.Sep 14 2023, 5:52 AM
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
1038

It's likely it will be used for a little more than just array sectioning, but that is one of its uses, it can also be utilised for pointer mapping and hopefully pointers that are member of derived-types/structs (the former I have an initial implementation of downstream, the latter I'm waiting on some future adjustments to the initial data bounds generation).

In at least Fortran for the latter, the bounds gives access to the relevant sections of the lowered descriptor information but hopefully it'll give other dialects (or languages if/when the C++ MLIR layer becomes available) the ability to lower this type of information as well for OpenMP.

Since the intent is to cover slightly more cases than just array sectioning I've stuck with the DataBoundsOp name, but I've attempted to clarify the operation a little more and make it a little more OpenMP specific. I am happy to change the name though if you think something else is better suited!

LG. Nice work extracting the bits from the OpenACC proposal. Also thanks @razvanlupusoru @clementval.

Could you update the Summary and title as well? It might also be good to document in the Summary where it differs from OpenACC, and say that it is the target and target data operations that manages the mapping.

LG. Nice work extracting the bits from the OpenACC proposal. Also thanks @razvanlupusoru @clementval.

Thank you very much. And yes it's awesome work from @razvanlupusoru @clementval so thank you again (and thank you for the reviews as well).

Could you update the Summary and title as well? It might also be good to document in the Summary where it differs from OpenACC, and say that it is the target and target data operations that manages the mapping.

No problem, happy to do so.

Would it be possible to get an acceptance on the other patch segments in the series if you're happy with there current state as well (sorry for the bother)? After that I'll wait for @TIFitis to give his seal of approval as well and then I'll perhaps give a day or twos grace for final input before I land the patch series if that sounds reasonable?

agozillon retitled this revision from [OpenMP][MLIR] Refactor and extend current map support by adding MapEntry and Bounds operations to the OpenMP Dialect to [OpenMP][MLIR] Refactor and extend current map support by adding MapInfoOp and DataBoundsOp operations to the OpenMP Dialect.Sep 14 2023, 9:55 AM
agozillon edited the summary of this revision. (Show Details)
TIFitis accepted this revision.Sep 18 2023, 5:21 AM
agozillon updated this revision to Diff 556989.Sep 18 2023, 8:04 PM

rebase to keep level with the other patches in the series

This revision was landed with ongoing or failed builds.Sep 19 2023, 6:26 AM
This revision was automatically updated to reflect the committed changes.