This is an archive of the discontinued LLVM Phabricator instance.

[fir] Add fir.array_access op
ClosedPublic

Authored by clementval on Oct 25 2021, 5:49 AM.

Details

Summary

The array_access provides a reference to a single element from an array
value. This is *not* a view in the immutable array, otherwise it couldn't
be stored to. It can be see as a logical copy of the element and its
position in the array. This reference can be written to and modified without
changing the original array.

The array_access operation is used to fetch the memory reference of an
element in an array value.

fortran
      real :: a(n,m)
      ...
      ... a ...
      ... a(r,s+1) ...

One can use fir.array_access to recover the implied memory reference to
the element a(i,j) in an array expression a as shown above. It can also
be used to recover the reference element a(r,s+1) in the second
expression.

mlir
      %s = fir.shape %n, %m : (index, index) -> !fir.shape<2>
      // load the entire array 'a'
      %v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>
      // fetch the value of one of the array value's elements
      %1 = fir.array_access %v, %i, %j : (!fir.array<?x?xf32>, index, index) -> !fir.ref<f32>

More information about array_access and other array operations can be
found in flang/docs/FIRArrayOperations.md.

This patch is part of the upstreaming effort from fir-dev branch.

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

Diff Detail

Event Timeline

clementval created this revision.Oct 25 2021, 5:49 AM
clementval requested review of this revision.Oct 25 2021, 5:49 AM
mehdi_amini requested changes to this revision.Oct 25 2021, 8:32 PM

I am a bit concerned by the semantics here: fir.array_load is supposed to provide an immutable SSA value that correspond to an array. The fir.array_access presented here seems to get some sort of (mutable?) memory address "after the fact".
This really breaks the SSA immutable value-type model associated with fir.array.

This revision now requires changes to proceed.Oct 25 2021, 8:32 PM

This likely requires a longer design discussion, maybe on the mailing list. But:

  • If we have an immutable copy of an array in an SSA value, I would think that we'd get a single element by extracting it.
  • If we want a reference to the element in memory, then we shouldn't try to get it from such an SSA value but from the reference: basically an operation performing in-memory array-indexing / GEP-style manipulation.

This likely requires a longer design discussion, maybe on the mailing list. But:

  • If we have an immutable copy of an array in an SSA value, I would think that we'd get a single element by extracting it.
  • If we want a reference to the element in memory, then we shouldn't try to get it from such an SSA value but from the reference: basically an operation performing in-memory array-indexing / GEP-style manipulation.

I'll let @schweitz answer here since he is the one who designed the array operations and will have deeper explanation on that.

FIR array ops are meant to deal with Fortran's copy-in/copy-out semantics. They introduce a separation of concerns between lowering the parse trees and doing dependence analysis in the front end or resorting to brute-force introduction of ubiquitous array copies.

These operations do not break SSA properties. An array value is preserved and immutable. If an array is modified, then there are array operations that are used to modify the array and the SSA value is required to be threaded like any other newly created value.

Not all types in the IR are first-class with operators in the value domain. Lowering treats these types intuitively as objects in memory with properties that are only known at run-time. To advance the Fortran compiler in a timely manner, a mechanism to access these dynamic elements using memory operations is required. These ops give copy-in/copy-out semantics on arrays under both by-value and by-reference models.

It is fully understood that other solutions exist. Some were considered at length. This one meets the project schedule.

FIR array ops are meant to deal with Fortran's copy-in/copy-out semantics. They introduce a separation of concerns between lowering the parse trees and doing dependence analysis in the front end or resorting to brute-force introduction of ubiquitous array copies.

Is there a documentation on this design? Maybe a design discussion on the mailing-list you could point me to?

Let's see if I understand correctly, starting from:

 %s = fir.shape %n, %m : (index, index) -> !fir.shape<2>
// load the entire array 'a'
%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>
// update the value of one of the array value's elements
// %r_{ij} = %f  if (i,j) = (%i,%j),   %v_{ij} otherwise
%r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32>
fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>>

Then adding an array and a load:

 %s = fir.shape %n, %m : (index, index) -> !fir.shape<2>
// load the entire array 'a'
%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>
// update the value of one of the array value's elements
// %r_{ij} = %f  if (i,j) = (%i,%j),   %v_{ij} otherwise
%r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32>
fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>>
%p = fir.array_access %v, %I, %j : (!fir.array<?x?xf32>, index) -> !fir.ref<f32>
%l = fir.load %p : !fir.ref<f32>

Is the same as the following where the array_access is moved before the store:

 %s = fir.shape %n, %m : (index, index) -> !fir.shape<2>
// load the entire array 'a'
%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>
%p = fir.array_access %v, %i, %j : (!fir.array<?x?xf32>, index) -> !fir.ref<f32>
// update the value of one of the array value's elements
// %r_{ij} = %f  if (i,j) = (%i,%j),   %v_{ij} otherwise
%r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32>
fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>>
%l = fir.load %p : !fir.ref<f32>

Is the same as the following where the load is also moved before the store:

 %s = fir.shape %n, %m : (index, index) -> !fir.shape<2>
// load the entire array 'a'
%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>
%p = fir.array_access %v, %i, %j : (!fir.array<?x?xf32>, index) -> !fir.ref<f32>
%l = fir.load %p : !fir.ref<f32>
// update the value of one of the array value's elements
// %r_{ij} = %f  if (i,j) = (%i,%j),   %v_{ij} otherwise
%r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32>
fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>>

Something I am not clear on, is am I allow to store to this ref or is it "constant"?
For example is this legal IR:

%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>
%p = fir.array_access %v, %i, %j : (!fir.array<?x?xf32>, index) -> !fir.ref<f32>
fir.store %scalar to %p : !fir.ref<f32> // store to the

It is fully understood that other solutions exist. Some were considered at length. This one meets the project schedule.

Sure, but when you're proposing to make tradeoff like that, it is useful to make it explicit and document them carefully (I. Mentioning the alternatives, etc. is also useful. In general I invite you to conduct such discussion on the mailing-list (or Discourse I don't know what's preferred).

I'm also in general looking to anticipate not getting into situations like the discussion in https://reviews.llvm.org/D109579

The array ops are limited use only. They model copy-in/copy-out semantics over a Fortran statement. The Fortran statement boundary can be a bit slippery and hard to define precisely in terms of lists of Ops. Effectively, the array_load(s) and array_merge_store are a pairing that brackets the lifetime of the array copies.

array_fetch and array_update are defined to work as getter/setter pairs on values of elements from loaded array copies. These have "GEP-like" syntax and semantics.

array_access and array_amend are defined to work as getter/setter pairs on references to elements in loaded array copies. array_access has "GEP-like" syntax. array_amend annotates which loaded array copy is being written to. It is invalid to update an array copy without array_amend; doing so will result in undefined behavior.

The documentation is sparse here and we'll add some more examples.

A small (unedited) example that may be helpful.

subroutine s(a,l,u)
  type t
    integer m
  end type t
  type(t) :: a(:)
  integer :: l, u
  forall (i=l:u)
    a(i) = a(u-i+1)
  end forall
end

which is translated to FIR as

func @_QPs(%arg0: !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, %arg1: !fir.ref<i32>, %arg2: !fir.ref<i32>) {
  %0 = fir.alloca i32 {adapt.valuebyref, bindc_name = "i"}
  %1 = fir.load %arg1 : !fir.ref<i32>
  %2 = fir.convert %1 : (i32) -> index
  %3 = fir.load %arg2 : !fir.ref<i32>
  %4 = fir.convert %3 : (i32) -> index
  %c1 = arith.constant 1 : index
  %5 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>
  %6 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>
  %7 = fir.do_loop %arg3 = %2 to %4 step %c1 unordered iter_args(%arg4 = %5) -> (!fir.array<?x!fir.type<_QFsTt{m:i32}>>) {
    %8 = fir.convert %arg3 : (index) -> i32
    fir.store %8 to %0 : !fir.ref<i32>
    %c1_i32 = arith.constant 1 : i32
    %9 = fir.load %arg2 : !fir.ref<i32>
    %10 = fir.load %0 : !fir.ref<i32>
    %11 = arith.subi %9, %10 : i32
    %12 = arith.addi %11, %c1_i32 : i32
    %13 = fir.convert %12 : (i32) -> i64
    %14 = fir.convert %13 : (i64) -> index
    %15 = fir.array_access %6, %14 {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>>
    %16 = fir.load %0 : !fir.ref<i32>
    %17 = fir.convert %16 : (i32) -> i64
    %18 = fir.convert %17 : (i64) -> index
    %19 = fir.array_access %arg4, %18 {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>>
    %20 = fir.load %15 : !fir.ref<!fir.type<_QFsTt{m:i32}>>
    fir.store %20 to %19 : !fir.ref<!fir.type<_QFsTt{m:i32}>>
    %21 = fir.array_amend %arg4, %19 : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.ref<!fir.type<_QFsTt{m:i32}>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>
    fir.result %21 : !fir.array<?x!fir.type<_QFsTt{m:i32}>>
  }
  fir.array_merge_store %5, %7 to %arg0 : !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>
  return
}

The the extra documentation here would be really helpful. In particular, from my 30,000 ft. view, it is not always clear when something is driven by a lower-level design choice tradeoff versus a goal of maintaining/capturing aspects of Fortran-level semantics that are necessary/beneficial for FIR-specific analysis and optimization.

Thanks for the example, that's very helpful!

Just trying to lay down how I understand this, it seems like there should be a reasonable abstract model allowing to reason about each of these ops/types in isolation:

  1. !fir.array is a value type, so it is immutable. It is a multi-dimensional array of value, but also keeps track of which values are "updated", where "updated" just means that it comes from the right-hand side of an array_amend or updated with a fir.store.
  2. fir.array_load loads a !fir.array` from memory, no value is marked as "updated".
  3. fir.array_access provides a reference to a single element from an array. This is *not* a view in the immutable array, otherwise it couldn't be stored to. It can be thought of as a copy of the array element + the coordinate in the array. On its own this reference can be written to and modified without changing the array it comes from. Something unclear with this abstract model would be the lifetime of this new "object": but we could consider that this has the effect of an "alloca" and dies with the current scope.
  4. fir.store on a reference obtained by fir.array_access is storing a new value to this reference, nothing special here.
  5. fir.amend updates an array with a reference: the reference contains a value and the location where to update the value. The location in the resulting array is marked as "updated".
  6. fir.array_merge_store updates the value on its LHS with the "updated value" from its RHS and store the result.

Is this consistent with the semantics model you have? (I don't have a complete picture of your implementation downstream, so it'd be useful to describe this and write it down, happy to help with this if needed)

(An alternative model to this "array_access" that produces a writeable location could be to produce a new array instead, taking inspiration of a similar value type array: the builtin.tensor type (see https://mlir.llvm.org/docs/Dialects/TensorOps/#tensorinsert-mlirtensorinsertop ). But that has other issues.)

(as a side-note, there is something a bit hard to grasp in your example in terms of SSA semantics when combining a parallel loop with an iterarg, but that's also an active topic of discussion in MLIR in general: we just looked at it recently again in the context of parallel tiling on tensors)

  1. fir.array_access provides a reference to a single element from an array. This is *not* a view in the immutable array, otherwise it couldn't be stored to. It can be thought of as a copy of the array element + the coordinate in the array. On its own this reference can be written to and modified without changing the array it comes from. Something unclear with this abstract model would be the lifetime of this new "object": but we could consider that this has the effect of an "alloca" and dies with the current scope.

To clarify how I see this, in pseudo C++ %p = fir.array_access %v, %i, %j : (!fir.array<?x?xf32>, index) -> !fir.ref<f32>

struct 2d_float_array_access_ref {
  float value;
  int64_t coord[2]
};
#define ARRAY_ACCESS(array, i, j)  \
  2d_float_array_access_ref *p = (2d_float_array_access_ref)alloca(sizeof(2d_float_array_access_ref)); \
  p->value = array[i][j]; \
  p->coord[0] = i; \
  p->coord[1] = j; \

#define STORE_REF(value, ref)  ref->value = value;
#define LOAD_REF(value, ref)  ref->value;

Yes, I believe that's the essence of it. Making logical copies, keeping a transaction log on an updated array, and then merging the final result back to the original. The purpose being to do the dependence analysis and elide copies where possible as an MLIR pass.

Not sure if this was mentioned before, but a value-based element update has preference. (i.e., fir.array_fetch and fir.array_update.) However, that's not entirely sufficient as certain types of values in Fortran are implicitly memory-bound, such as the trivial derived type example. In full generality, the size and layout of these values can be statically non-determinable.

Is there a place where to write this all down? flang/docs/... (either a new section in an existing doc or a new doc?)

Is there a place where to write this all down? flang/docs/... (either a new section in an existing doc or a new doc?)

I think either a section in flang/docs/FortranIR.md or just a new doc would be fine since it's likely to take lots of place with example and explanation.

Update op description after FIRArrayOperations.md documentation
Rebase

clementval edited the summary of this revision. (Show Details)Jan 21 2022, 1:51 AM
schweitz accepted this revision.Jan 21 2022, 8:55 AM

@mehdi_amini Are we good to go here now that we have the initial version of the documentation online? I updated the description here to reflect the discussion we had here and in the doc review.

mehdi_amini resigned from this revision.Jan 22 2022, 11:02 AM

Thanks for updating the description, I'll move out of your way here, I won't have time to page-in the doc again and re-review here! Thanks

This revision is now accepted and ready to land.Jan 22 2022, 11:02 AM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

Can this be verified in the verifier? Or is it too costly due to the transitive look up?

clementval added inline comments.Jan 24 2022, 5:22 AM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

I think that it can be done. Do you mind if we do a separate patch for this?

I have a few Nit comments. Otherwise LGTM.

flang/include/flang/Optimizer/Dialect/FIROps.td
1570

The reason I asked is that sometimes the verification checks which are expensive are in a pass.

That is OK with me. If no immediate plans then please create a ticket or let me know I will. It might be a good task for a beginner.

flang/lib/Optimizer/Dialect/FIROps.cpp
505

Nit: Remove auto

511

Nit: Remove auto

513

Nit: Can you add a test for this check?

mehdi_amini added inline comments.Jan 24 2022, 6:19 PM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

It's not clear to me that having such a structural requirement on the IR is a good idea in general: this makes it fragile to transformations in general and is quite concerning.

clementval marked 2 inline comments as done.Jan 24 2022, 11:26 PM
clementval added inline comments.
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

This was discussed extensively in the array operations documentation patch. Just to make sure I understand your comment in the right way. Do you mean we should not verify such constraint or drop it? If we drop it, I guess the transformation that requires it will check for it and fails if it cannot verify this.

mehdi_amini added inline comments.Jan 24 2022, 11:30 PM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

I think requiring this in the IR isn't a good design in itself and should be revisited. Anytime you make an SSA value not "standalone" but requiring to be structurally produced with specific constraints is breaking the very core definition of what is an SSA value.
Relying on such pattern to match for optimization purpose is one thing, but forcing it for correctness in the compiler is an entirely another can of worms.

We touched on it in the document patch: I still object to this. I don't think the design itself has really been properly reviewed and discussed upstream (as in: an RFC on discourse or on the development mailing-list).
(whatever happened downstream isn't very relevant to me).

Remove auto
Add invalid test

clementval added inline comments.Jan 25 2022, 4:04 AM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

How different is than imposing a specific type? We could have a specific type derived from the array type we have currently and change the op definition to use this type instead of accepting all fir_SequenceType. In the end the restriction is pretty much similar or maybe I miss smth.
I agree that having such "restriction" hard-coded would make transformation quite difficult. In practice, right now, this is kind of an implied restriction since there is no other way to construct an meaninginful and "invalid" IR with those operations.

clementval marked an inline comment as done.Jan 25 2022, 6:00 AM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

The array operations are for restricted use and only for modelling the Fortran Array semantics. It helps lower the Array Expressions (to MLIR) and inserts temporaries wherever required. From what I understand the Array Value Copy pass will replace these array operations with other FIR operations and I believe this is one of the first passes (correct me If I am wrong) that will run. Hence I think any issue with the current design is localized to the pass and the array operations. So I was thinking it is OK to go ahead with the current design unless there are cases that will lead to incorrect code being generated.

I agree that we have not had RFC discussions about these topics. This is due to a lack of expertise (outside the FIR core team) and the development happening in the fir-dev repo. Since Fortran Array expressions are very important, if we pursue alternative designs without accepting the current design we will continue to have the unfortunate situation of two repos since I imagine this discussion will take a lot of time.

Since @mehdi_amini and @clementval have spent a lot of time reviewing and getting the patch ready, I am feeling uncomfortable submitting without further action or requesting action without submitting.

I guess one action would be for @clementval and @schweitz to document the alternative approaches considered as well.

mehdi_amini added inline comments.Jan 25 2022, 10:06 AM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

How different is than imposing a specific type? We could have a specific type derived from the array type we have currently and change the op definition to use this type instead of accepting all fir_SequenceType. In the end the restriction is pretty much similar or maybe I miss smth.

It is very different: using a type does not require structurally that you can find the operation that produces it. For example it does not forbid to outline the code, or split the basic block and make it a basic block arguments.
IIRC there has been issues with generic canonicalization for fir because of this kind of issues.

clementval added inline comments.Jan 25 2022, 10:15 AM
flang/include/flang/Optimizer/Dialect/FIROps.td
1570

Right. In this particular case the restriction does not prevent to split the block and make it a basic argument but it's through we had issue on another op.

So what do we do here? I think we are back to where we were before doing the array operations documentation. I would advocate for this to go through like this and when the full picture is available for everyone (not only people working on fir-dev) we could reopen the discussion and see if we find a better design.
Regarding the restriction written in the description we can also remove it as it is not enforced. Just talking about FIR itself it is not required to have it but only the translation will look for such construction.

@kiranchandramohan @mehdi_amini @schweitz

clementval edited the summary of this revision. (Show Details)Feb 2 2022, 12:57 PM
This comment was removed by mehdi_amini.

So what do we do here? I think we are back to where we were before doing the array operations documentation. I would advocate for this to go through like this and when the full picture is available for everyone (not only people working on fir-dev) we could reopen the discussion and see if we find a better design.

Right: it is worth finishing upstreaming fir-dev, with a TODO about revisiting the design and the underlying principles about this part of FIR.
(I had removed any blocker on this revision earlier to signal that you could move forward from my point of view, sorry if I wasn't clear!)

clementval added a comment.EditedFeb 2 2022, 1:06 PM

So what do we do here? I think we are back to where we were before doing the array operations documentation. I would advocate for this to go through like this and when the full picture is available for everyone (not only people working on fir-dev) we could reopen the discussion and see if we find a better design.

Right: it is worth finishing upstreaming fir-dev, with a TODO about revisiting the design and the underlying principles about this part of FIR.
(I had removed any blocker on this revision earlier to signal that you could move forward from my point of view, sorry if I wasn't clear!)

Thanks Mehdi for commenting back on that. I'll add a proper TODO here before moving on.

This revision was automatically updated to reflect the committed changes.