This is an archive of the discontinued LLVM Phabricator instance.

flang/docs/FIRArrayOperations.md
19–20	Can we link to a doc explaining this? (assuming this exists on the web, otherwise it would be worth an introduction)
40
95	It's not clear to me in this example why array_merge_store needs %v? Why isn't it just `fir.array_store %r to %a` enough? Seems like the array_update should have all the information here.
123	Why this restriction?
139	Shouldn't the expression in Fortran above shows i and j somehow?
158
159	I'm not comfortable with this description: from our previous discussion as it is described it looks like this creates a mutable reference to an element in the immutable array. We converged last time on a semantics where this implicitly creates a new memory location with automatic-allocation scope and copy the value there, as well as remembering the original coordinates this value came from. But this is a bit iffy as well: it seems like a `Ref<i32>` can't be universally defined where it comes from an `array_access` or from a `fir.alloca`. This seems a bit loosely defined to me right now.
181	Same question: why do we need this restriction?
185	You need to update the description of the "array type" to mention what this means.
196	I wonder why array_amend is needed at all actually. When can't it just be replaced with a load of the `ref<T>` and an `array_update`?
275	It'd help if this code was simplified, for example the alloca for the loop induction variable seems spurious and just adds noise, similarly for the redundant conversions. The SSA value could get name for readability of the example as well. And finally some inline comment in the code to describe it and map it to the original source code would really help I think. func @_QPs(%arg0: !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, %arg1: !fir.ref<i32>, %arg2: !fir.ref<i32>) { %l = fir.load %arg1 : !fir.ref<i32> %l_index = fir.convert %1 : (i32) -> index %u = fir.load %arg2 : !fir.ref<i32> %u_index = fir.convert %u : (i32) -> index %c1 = arith.constant 1 : index // This is the "copy-in" array used on the RHS of the expression. It will be indexed into and loaded at each iteration. %array_a_src = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> // This is the "seed" for the "copy-out" array on the LHS. It'll flow from iteration to iteration and gets // updated at each iteration. %array_a_dest_init = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> %array_a_final = fir.do_loop %i = %l_index to %u_index step %c1 unordered iter_args(%array_a_dest = % array_a_dest_init) -> (!fir.array<?x!fir.type<_QFsTt{m:i32}>>) { // Compute indexing for the RHS and array the element. %u_minus_i = arith.subi %u, %I : index // u-i %u_minus_i_plus_one = arith.addi %u_minus_i, %c1: index // u-i+1 %a_src_ref = fir.array_access %array_a_src, %u_minus_i_plus_one {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> %a_src_elt = fir.load %a_src_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>> // Get the reference to the element in the array on the LHS %a_dst_ref = fir.array_access %array_a_dest, %i {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> // Store the value, and update the array fir.store %a_src_elt to %a_dst_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>> %updated_array_a = fir.array_amend %array_a, %a_dst_ref : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.ref<!fir.type<_QFsTt{m:i32}>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> // Forward the current updated array to the next iteration. fir.result %updated_array_a : !fir.array<?x!fir.type<_QFsTt{m:i32}>> } // Store back the result by merging the initial value loaded before the loop // with the final one produced by the loop. fir.array_merge_store %array_a_dest_init, %array_a_final to %arg0 : !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>> return } This is an interesting example, and before even getting into the optimization section, it could be useful to explain how this works, and only then get into how it is optimized. You introduced each operation individually, but seeing it all fit together is another story.

thanks for adding the doc! That's gonna be very helpful for onboarding people I think

rovka added a subscriber: rovka.Dec 6 2021, 12:58 AM

rovka added inline comments.

flang/docs/FIRArrayOperations.md
22	Nit: you might want to rephrase this so we don't have to change the number if we ever have to add or remove one of these.
66
71
79
200
203
203	So, IIUC, the 'array-value-copy' is both an analysis pass and a transformation pass, right? I think it would be good to clarify that up front, because I wasn't 100% certain of that until I saw there's a before/after example below. Also explain exactly what the pass does, since it seems to be more than just copy elision, which was what I first understood when I read this text.
336

Fix typos

flang/docs/FIRArrayOperations.md
203	The transformation needs an analysis first to be performed. More description of the pass should go directly in the pass file itself. The purpose of having an example here is just to show some array operations in use in a bigger example.

Harbormaster completed remote builds in B137613: Diff 391997.Dec 6 2021, 2:13 AM

clementval marked 3 inline comments as done.Dec 7 2021, 11:26 AM

clementval added inline comments.

flang/docs/FIRArrayOperations.md

In this example it has everything but it's not always the case like in the example below:

func @conversion_with_temporary(%arr0 : !fir.ref<!fir.array<10xi32>>) {
   %c10 = arith.constant 10 : index
   %1 = fir.shape %c10 : (index) -> !fir.shape<1>
   %2 = fir.array_load %arr0(%1) : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>) -> !fir.array<10xi32>
   %c10_i64 = arith.constant 10 : i64
   %3 = fir.convert %c10_i64 : (i64) -> index
   %c1_i64 = arith.constant 1 : i64
   %c-1_i64 = arith.constant -1 : i64
   %4 = fir.shape %c10 : (index) -> !fir.shape<1>
   %5 = fir.slice %c10_i64, %c1_i64, %c-1_i64 : (i64, i64, i64) -> !fir.slice<1>
   %6 = fir.array_load %arr0(%4) [%5] : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>, !fir.slice<1>) -> !fir.array<10xi32>
   %c1 = arith.constant 1 : index
   %c0 = arith.constant 0 : index
   %7 = arith.subi %3, %c1 : index
   %8 = fir.do_loop %arg0 = %c0 to %7 step %c1 unordered iter_args(%arg1 = %2) -> (!fir.array<10xi32>) {
     %9 = fir.array_fetch %6, %arg0 : (!fir.array<10xi32>, index) -> i32
     %10 = fir.array_update %arg1, %9, %arg0 : (!fir.array<10xi32>, i32, index) -> !fir.array<10xi32>
     fir.result %10 : !fir.array<10xi32>
   }
   fir.array_merge_store %2, %8 to %arr0 : !fir.array<10xi32>, !fir.array<10xi32>, !fir.ref<!fir.array<10xi32>>
   return
 }

139

This Fortran example will generate an implied loop so the actual i and j are not present in the user code but will be generated.

clementval marked 2 inline comments as done.Dec 7 2021, 1:34 PM

clementval added inline comments.

flang/docs/FIRArrayOperations.md
123	The array "abstraction" in FIR is not meant to be used anywhere. Fortran itself doesn't have "array values" in any sane sense of that term. To avoid to introduce arrays universally in FIR we restrict the use of the abstraction to precisely where we want it and where it is useful.
181	See answer above.

clementval added inline comments.Dec 7 2021, 1:43 PM

flang/docs/FIRArrayOperations.md
19–20	We have this in the doc: https://github.com/flang-compiler/f18-llvm-project/blob/main/flang/docs/FortranForCProgrammers.md We can probably add a small introduction here.
275	We can simplify and add more explanation to the example for sure.

mehdi_amini added inline comments.Dec 7 2021, 8:48 PM

flang/docs/FIRArrayOperations.md
95	I don't quite get the different in this example? If I track the loop, you have the loaded array used a initial induction value, then you update one element, and then pass it along to the next iteration until the end. %8 seems to contain the array fully updated?
123	I don't feel this is a clear answer. What does "To avoid to introduce arrays universally" means for example? Restricting the use of array makes sense: that means that `fir.add` will not accept to add two arrays for example, but that does not explain clearly to me the link between fetch and load. In general: an SSA value is equal to another SSA value, what you're saying here is that inside the IR this isn't the case and a transformation that would make an SSA value a block argument for example isn't valid. This seems fragile and extremely uncommon (I don't know of any precent in any dialect I worked on), so I'd insist to look for any alternative to such a strong design constraint: the bar to justify this should be really high and we're not there yet IMO.

Address comments

flang/docs/FIRArrayOperations.md
95	Agreed you can track it back all the way to `fir.array_load`. Since we know it during lowering why make it harder to track it and not just having it right away?
123	I think the restriction is probably not phrased correctly. What we want to mean is that the value must trace back transitively to an `array_load` as the dominating source. We have places where those SSA value are block arguments. For example, this is totally fine FIR: %1 = array_load fir.loop ... iter_args (%2 = %1) ... %3 = array_fetch %2
159	Description updated.
185	Which "array type" are you referencing to? (`!fir.array`)?
196	In some cases we do not want to load derived types/characters because they may have non-compile time constant sizes (cannot be loaded/might be big aggregates for which load/store is problematic (cause long LLVM compile time + asserts at a certain point).
275	I update this section and kept just the example with annotation. I think it makes more sense to keep this documentation related to the array operations. Finer documentation on array-value-copy is probably better in the pass itself as comment or in a separate documentation.

mehdi_amini added inline comments.Dec 8 2021, 11:35 AM

flang/docs/FIRArrayOperations.md
95	My comment explaining how %8 should be enough isn't about the ability tracking it back to the fir.array_load, this is about the SSA value being self consistent in the semantics model. It isn't clear to me in the description of fir.array_merge_store what is the first operand useful for. I suspect it helps tying thing together in some lowering later potentially, but I'm not sure how (and it isn't documented).
123	For example, this is totally fine FIR: %1 = array_load fir.loop ... iter_args (%2 = %1) ... %3 = array_fetch %2 Seems like %2 after the first iteration will be the result of the yield of the previous iteration: what will produce this value? Unless it is an array_load, your condition does not seem met to me.
196	Can you document this with an example illustrating such a case?

Harbormaster completed remote builds in B138217: Diff 392845.Dec 8 2021, 11:50 AM

clementval added inline comments.Dec 9 2021, 9:33 AM

flang/docs/FIRArrayOperations.md

Array expressions work with a context (such as a statement). FIR does not model Fortran statements directly. Instead we "bound" a region of FIR with array_loads and an array_merge_store.
The analysis looks for these Ops as markers that define the region of IR to be considered for analysis.

123

This is a more complete example:

%7 = fir.array_load %arg0 [%6] : (!fir.box<!fir.array<?xf32>>, !fir.slice<1>) 
-> !fir.array<?xf32>
...
%18 = fir.do_loop %arg5 = %c0 to %17 step %c1 iter_args(%arg6 = %7) -> (!fir.array<?xf32>) {
  %19 = fir.array_fetch %arg6, %arg5 : (!fir.array<?xf32>, index) -> f32
  %20 = fir.array_update %arg6, %19, %arg5 : (!fir.array<?xf32>, f32, index) -> !fir.array<?xf32>
  fir.result %20 : !fir.array<?xf32>
}
fir.array_merge_store %7, %18 to %arg0[%6] : !fir.array<?xf32>, !fir.array<?xf32>, !fir.box<!fir.array<?xf32>>, !fir.slice<1>

196

Here is an example with character assignment.

Fortran

subroutine foo(c1, c2, n)
  integer(8) :: n
  character(n) :: c1(:), c2(:)
  c1 = c2
end subroutine

That would results in a cleaned-up FIR that looks like this:

func @_QPfoo(%arg0: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg1: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg2: !fir.ref<i64>) {
    %0 = fir.load %arg2 : !fir.ref<i64>
    %c0 = arith.constant 0 : index
    %1:3 = fir.box_dims %arg0, %c0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>, index) -> (index, index, index)
    %2 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>>
    %3 = fir.array_load %arg1 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>>
    %c1 = arith.constant 1 : index
    %4 = arith.subi %1#1, %c1 : index
    %5 = fir.do_loop %arg3 = %c0 to %4 step %c1 unordered iter_args(%arg4 = %2) -> (!fir.array<?x!fir.char<1,?>>) {
      %6 = fir.array_access %3, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>>
      %7 = fir.array_access %arg4, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>>
      %false = arith.constant false
      %8 = fir.convert %7 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
      %9 = fir.convert %6 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
      fir.call @llvm.memmove.p0i8.p0i8.i64(%8, %9, %0, %false) : (!fir.ref<i8>, !fir.ref<i8>, i64, i1) -> ()
      %10 = fir.array_amend %arg4, %7 : (!fir.array<?x!fir.char<1,?>>, !fir.ref<!fir.char<1,?>>) -> !fir.array<?x!fir.char<1,?>>
      fir.result %10 : !fir.array<?x!fir.char<1,?>>
    }
    fir.array_merge_store %2, %5 to %arg0 : !fir.array<?x!fir.char<1,?>>, !fir.array<?x!fir.char<1,?>>, !fir.box<!fir.array<?x!fir.char<1,?>>>
    return
  }
  func private @llvm.memmove.p0i8.p0i8.i64(!fir.ref<i8>, !fir.ref<i8>, i64, i1)
}

We do not want to start loading !fir.ref<!fir.char<1,?>> here because it has dynamic length, and even if constant, could be way too long.

mehdi_amini added inline comments.Dec 9 2021, 4:03 PM

flang/docs/FIRArrayOperations.md
95	I understand the "statement problem" in Fortran, but I don't quite make the direct connection in FIR. Basically you're saying "we're propagating the 'bound' of the statement here", but that does not really say why it is needed or useful?
123	Right, so here `array_fetch` will consume the result of the `array_update` after the first iteration and not the result from an `array_load`. It may be a problem of definition, when you write "that can be trace back transitively to an `array_load` " ; do you see `array_update` as "transparent" in here? If so what is the list of such "transparent" operations?
196	Uh, I'm not totally understanding all the pieces here, in particular you're converting the references to `!fir.char<1,?>` to simple i8 reference and use the N arguments for the memmove size, but at the same time you're saying that they are dynamic length. if constant, could be way too long. You're saying that basically `array_amend` is a fused form of `load`+`update` because the optimizer bails out and can't reconstruct the fused form later in the lowering? All of this may be fine: it just deserves a lot of documentation I think.

Update restrictions infos. Rest of the updates will come shortly.

flang/docs/FIRArrayOperations.md
95	It's useful because it bounds the second of the FIR that represents the statements like a `FORALL`. Statement have specific semantics and like I mentioned before it is useful for analysis for example to know the "bound" defined by array_load and array_merge_store.

Harbormaster completed remote builds in B139423: Diff 394545.Dec 15 2021, 6:37 AM

mehdi_amini added inline comments.Dec 15 2021, 1:14 PM

flang/docs/FIRArrayOperations.md
95	I feel we're going in rounds here, you keep telling it is useful and I keep asking you to elaborate on this: can you put it in the doc. A reader of the doc shouldn't have to "guess" the design and a design doc should go beyond explaining just what it does but also capture the "why" it is done this way. If you say "it is useful", then please demonstrate it, or illustrate with examples.

mehdi_amini added inline comments.Dec 15 2021, 1:16 PM

flang/docs/FIRArrayOperations.md
95	Another way to approach it is to analyze "what if this operand wasn't provided?" and answer the question with a description of how/where the optimizer or the codegen would be infeasible or very inconvenient.

Update the doc

flang/docs/FIRArrayOperations.md
196	`array_update` serves two purposes: it is an array access, and it denotes the array as being changed. The lhs is distinguished from everything on the rhs. Fortran arrays are implicitly memory bound as are some other Fortran type/kind entities. For entities that can be atomically promoted to the value domain, we use `array_fetch` and `array_update`. For those type/kinds that cannot be promoted to values, we must leave them in a memory reference domain, and we use `array_access` and `array_amend`. `array_access` and `array_amend` split the two purposes of `array_update` into two distinct ops. `array_access` captures the array access semantics and `array_amend` denotes which `array_access` is the lhs.

Harbormaster completed remote builds in B140440: Diff 395928.Dec 22 2021, 2:36 PM

Ping

mehdi_amini added inline comments.Jan 19 2022, 3:38 PM

flang/docs/FIRArrayOperations.md
95	You pinged the revision but I don't think you actually followed up here, or did I miss something?

clementval added inline comments.Jan 20 2022, 9:31 AM

flang/docs/FIRArrayOperations.md
95	We have added some documentation in the last paragraph just below.

schweitz accepted this revision.Jan 20 2022, 10:19 AM

schweitz added inline comments.

flang/docs/FIRArrayOperations.md
119	A more pejorative term.
122

This revision is now accepted and ready to land.Jan 20 2022, 10:19 AM

Include Eric's suggestions

clementval marked 2 inline comments as done.Jan 20 2022, 12:36 PM

clementval added a reviewer: kiranchandramohan.Jan 20 2022, 1:23 PM

Harbormaster completed remote builds in B144663: Diff 401749.Jan 20 2022, 2:30 PM

mehdi_amini added inline comments.Jan 20 2022, 10:00 PM

flang/docs/FIRArrayOperations.md
95	OK, thanks. I won't have bandwidth right now to page-in again all the context from back when I was actively trying to understand this in detail, so it's probably better if you just land this now.

clementval marked 7 inline comments as done.Jan 21 2022, 12:55 AM

clementval added inline comments.

flang/docs/FIRArrayOperations.md
95	Thanks for the reviews. The documentation can probably still be improved here and there and I guess it will be updated over time when other in the community get a chance to look at it.

Closed by commit rG69825f369302: [fir] Add array operations documentation (authored by clementval). · Explain WhyJan 21 2022, 12:57 AM

This revision was automatically updated to reflect the committed changes.

clementval marked an inline comment as done.

clementval added a commit: rG69825f369302: [fir] Add array operations documentation.

Revision Contents

Path

Size

flang/

docs/

FIRArrayOperations.md

342 lines

Diff 401889

flang/docs/FIRArrayOperations.md

This file was added.

<!--===- docs/FIRArrayOperations.md

Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

See https://llvm.org/LICENSE.txt for license information.

SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

# Design: FIR Array operations

```eval_rst

.. contents::

:local:

```

## General

The array operations in FIR model the copy-in/copy-out semantics over Fortran

statements.

mehdi_aminiUnsubmitted

Not Done

Can we link to a doc explaining this? (assuming this exists on the web, otherwise it would be worth an introduction)

mehdi_amini: Can we link to a doc explaining this? (assuming this exists on the web, otherwise it would be…

clementvalAuthorUnsubmitted

Done

We have this in the doc: https://github.com/flang-compiler/f18-llvm-project/blob/main/flang/docs/FortranForCProgrammers.md

We can probably add a small introduction here.

clementval: We have this in the doc: https://github.com/flang-compiler/f18-llvm…

Fortran language semantics sometimes require the compiler to make a temporary

copy of an array or array slice. Situations where this can occur include:

rovkaUnsubmitted

Done

statements.

- They are currently 6 array operations:

+ There are currently 6 array operations:

- `fir.array_load`

Nit: you might want to rephrase this so we don't have to change the number if we ever have to add or remove one of these.

rovka: Nit: you might want to rephrase this so we don't have to change the number if we ever have to…

* Passing a non-contiguous array to a procedure that does not declare it as

assumed-shape.

* Array expressions, especially those involving `RESHAPE`, `PACK`, and `MERGE`.

* Assignments of arrays where the array appears on both the left and right-hand

sides of the assignment.

* Assignments of `POINTER` arrays.

There are currently the following operations:

- `fir.array_load`

- `fir.array_merge_store`

- `fir.array_fetch`

- `fir.array_update`

- `fir.array_access`

- `fir.array_amend`

`array_load`(s) and `array_merge_store` are a pairing that brackets the lifetime

of the array copies.

mehdi_aminiUnsubmitted

Done

syntax. `array_amend` annotates which loaded array copy is being written to.

- It is invalid to update an array copy without array_amend; doing so will result

+ It is invalid to update an array copy without `array_amend`; doing so will result

in undefined behavior.

mehdi_amini:

`array_fetch` and `array_update` are defined to work as getter/setter pairs on

values of elements from loaded array copies. These have "GEP-like" syntax and

semantics.

Fortran arrays are implicitly memory bound as are some other Fortran type/kind

entities. For entities that can be atomically promoted to the value domain,

we use `array_fetch` and `array_update`.

`array_access` and `array_amend` are defined to work as getter/setter pairs on

references to elements in loaded array copies. `array_access` has "GEP-like"

syntax. `array_amend` annotates which loaded array copy is being written to.

It is invalid to update an array copy without `array_amend`; doing so will

result in undefined behavior.

For those type/kinds that cannot be promoted to values, we must leave them in a

memory reference domain, and we use `array_access` and `array_amend`.

## array_load

This operation taken with `array_merge_store` captures Fortran's

copy-in/copy-out semantics. One way to think of this is that array_load

creates a snapshot copy of the entire array. This copy can then be used

as the "original value" of the array while the array's new value is

computed. The `array_merge_store` operation is the copy-out semantics, which

merge the updates with the original array value to produce the final array

result. This abstracts the copy operations as opposed to always creating

rovkaUnsubmitted

Done

shown above. Subsequent changes to the memory containing the array do not

- alter its composite value. This operation let's one load an array as a

+ alter its composite value. This operation lets one load an array as a

value while applying a runtime shape, shift, or slice to the memory

rovka:

copies or requiring dependence analysis be performed on the syntax trees

and before lowering to the IR.

Load an entire array as a single SSA value.

rovkaUnsubmitted

Done

```mlir

- %s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shape<2>

+ %s = fir.shape_shift %lb1, %ex1, %lb2, %ex2 : (index, index, index, index) -> !fir.shape<2>

// load the entire array 'a'

rovka:

```fortran

real :: a(o:n,p:m)

...

... = ... a ...

```

One can use `fir.array_load` to produce an ssa-value that captures an

immutable value of the entire array `a`, as in the Fortran array expression

rovkaUnsubmitted

Done

# array_merge_store

- The `array_merge_store` operation store a merged array value to memory.

+ The `array_merge_store` operation stores a merged array value to memory.

```fortran

rovka:

shown above. Subsequent changes to the memory containing the array do not

alter its composite value. This operation lets one load an array as a

value while applying a runtime shape, shift, or slice to the memory

reference, and its semantics guarantee immutability.

```mlir

%s = fir.shape_shift %lb1, %ex1, %lb2, %ex2 : (index, index, index, index) -> !fir.shape<2>

// load the entire array 'a'

%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>

// a fir.store here into array %a does not change %v

```

# array_merge_store

The `array_merge_store` operation stores a merged array value to memory.

mehdi_aminiUnsubmitted

Done

It's not clear to me in this example why array_merge_store needs %v?
Why isn't it just fir.array_store %r to %a enough? Seems like the array_update should have all the information here.

mehdi_amini: It's not clear to me in this example why array_merge_store needs %v? Why isn't it just `fir.

clementvalAuthorUnsubmitted

Done

In this example it has everything but it's not always the case like in the example below:

func @conversion_with_temporary(%arr0 : !fir.ref<!fir.array<10xi32>>) {
   %c10 = arith.constant 10 : index
   %1 = fir.shape %c10 : (index) -> !fir.shape<1>
   %2 = fir.array_load %arr0(%1) : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>) -> !fir.array<10xi32>
   %c10_i64 = arith.constant 10 : i64
   %3 = fir.convert %c10_i64 : (i64) -> index
   %c1_i64 = arith.constant 1 : i64
   %c-1_i64 = arith.constant -1 : i64
   %4 = fir.shape %c10 : (index) -> !fir.shape<1>
   %5 = fir.slice %c10_i64, %c1_i64, %c-1_i64 : (i64, i64, i64) -> !fir.slice<1>
   %6 = fir.array_load %arr0(%4) [%5] : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>, !fir.slice<1>) -> !fir.array<10xi32>
   %c1 = arith.constant 1 : index
   %c0 = arith.constant 0 : index
   %7 = arith.subi %3, %c1 : index
   %8 = fir.do_loop %arg0 = %c0 to %7 step %c1 unordered iter_args(%arg1 = %2) -> (!fir.array<10xi32>) {
     %9 = fir.array_fetch %6, %arg0 : (!fir.array<10xi32>, index) -> i32
     %10 = fir.array_update %arg1, %9, %arg0 : (!fir.array<10xi32>, i32, index) -> !fir.array<10xi32>
     fir.result %10 : !fir.array<10xi32>
   }
   fir.array_merge_store %2, %8 to %arr0 : !fir.array<10xi32>, !fir.array<10xi32>, !fir.ref<!fir.array<10xi32>>
   return
 }

clementval: In this example it has everything but it's not always the case like in the example below: ```…

mehdi_aminiUnsubmitted

Done

I don't quite get the different in this example?
If I track the loop, you have the loaded array used a initial induction value, then you update one element, and then pass it along to the next iteration until the end. %8 seems to contain the array fully updated?

mehdi_amini: I don't quite get the different in this example? If I track the loop, you have the loaded array…

clementvalAuthorUnsubmitted

Done

Agreed you can track it back all the way to fir.array_load. Since we know it during lowering why make it harder to track it and not just having it right away?

clementval: Agreed you can track it back all the way to `fir.array_load`. Since we know it during lowering…

mehdi_aminiUnsubmitted

Done

My comment explaining how %8 should be enough isn't about the ability tracking it back to the fir.array_load, this is about the SSA value being self consistent in the semantics model.

It isn't clear to me in the description of fir.array_merge_store what is the first operand useful for.

I suspect it helps tying thing together in some lowering later potentially, but I'm not sure how (and it isn't documented).

mehdi_amini: My comment explaining how %8 should be enough isn't about the ability tracking it back to the…

clementvalAuthorUnsubmitted

Done

clementval: Array expressions work with a context (such as a statement). FIR does not model Fortran…

mehdi_aminiUnsubmitted

Done

I understand the "statement problem" in Fortran, but I don't quite make the direct connection in FIR. Basically you're saying "we're propagating the 'bound' of the statement here", but that does not really say why it is needed or useful?

mehdi_amini: I understand the "statement problem" in Fortran, but I don't quite make the direct connection…

clementvalAuthorUnsubmitted

Done

It's useful because it bounds the second of the FIR that represents the statements like a FORALL. Statement have specific semantics and like I mentioned before it is useful for analysis for example to know the "bound" defined by array_load and array_merge_store.

clementval: It's useful because it bounds the second of the FIR that represents the statements like a…

mehdi_aminiUnsubmitted

Done

I feel we're going in rounds here, you keep telling it is useful and I keep asking you to elaborate on this: can you put it in the doc.
A reader of the doc shouldn't have to "guess" the design and a design doc should go beyond explaining just what it does but also capture the "why" it is done this way.
If you say "it is useful", then please demonstrate it, or illustrate with examples.

mehdi_amini: I feel we're going in rounds here, you keep telling it is useful and I keep asking you to…

mehdi_aminiUnsubmitted

Done

Another way to approach it is to analyze "what if this operand wasn't provided?" and answer the question with a description of how/where the optimizer or the codegen would be infeasible or very inconvenient.

mehdi_amini: Another way to approach it is to analyze "what if this operand wasn't provided?" and answer the…

mehdi_aminiUnsubmitted

Done

You pinged the revision but I don't think you actually followed up here, or did I miss something?

mehdi_amini: You pinged the revision but I don't think you actually followed up here, or did I miss…

clementvalAuthorUnsubmitted

Done

We have added some documentation in the last paragraph just below.

clementval: We have added some documentation in the last paragraph just below.

mehdi_aminiUnsubmitted

Done

OK, thanks.

I won't have bandwidth right now to page-in again all the context from back when I was actively trying to understand this in detail, so it's probably better if you just land this now.

mehdi_amini: OK, thanks. I won't have bandwidth right now to page-in again all the context from back when I…

clementvalAuthorUnsubmitted

Done

Thanks for the reviews. The documentation can probably still be improved here and there and I guess it will be updated over time when other in the community get a chance to look at it.

clementval: Thanks for the reviews. The documentation can probably still be improved here and there and I…

```fortran

real :: a(n,m)

...

a = ...

```

One can use `fir.array_merge_store` to merge/copy the value of `a` in an

array expression as shown above.

```mlir

%v = fir.array_load %a(%shape) : ...

%r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32>

fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>>

```

This operation merges the original loaded array value, `%v`, with the

chained updates, `%r`, and stores the result to the array at address, `%a`.

This operation taken with `array_load`'s captures Fortran's

copy-in/copy-out semantics. The first operands of `array_merge_store` is the

result of the initial `array_load` operation. While this value could be

retrieved by reference chasiing through the different array operations it is

useful to have it on hand directly for analysis passes since this directly

schweitzUnsubmitted

Done

result of the initial `array_load` operation. While this value could be

- retrieved by going back through the different array operations it is useful to

+ retrieved by reference chasing through the different array operations it is useful to

have it on hand directly for analysis passes since this directly defines the

A more pejorative term.

schweitz: A more pejorative term.

defines the "bounds" of the Fortran statement represented by these operations.

The intention is to allow copy-in/copy-out regions to be easily delineated,

analyzed, and optimized.

schweitzUnsubmitted

Done

"bounds" of the Fortran statement represented by these operations. The intention

- is to allow easily copy-in/copy-out regions to be fused.

+ is to allow copy-in/copy-out regions to be easily delineated, analyzed, and optimized.

## array_fetch

schweitz:

mehdi_aminiUnsubmitted

Not Done

Why this restriction?

mehdi_amini: Why this restriction?

clementvalAuthorUnsubmitted

Done

The array "abstraction" in FIR is not meant to be used anywhere. Fortran itself doesn't have "array values" in any sane sense of that term. To avoid to introduce arrays universally in FIR we restrict the use of the abstraction to precisely where we want it and where it is useful.

clementval: The array "abstraction" in FIR is not meant to be used anywhere. Fortran itself doesn't have…

mehdi_aminiUnsubmitted

Not Done

I don't feel this is a clear answer. What does "To avoid to introduce arrays universally" means for example?
Restricting the use of array makes sense: that means that fir.add will not accept to add two arrays for example, but that does not explain clearly to me the link between fetch and load.

In general: an SSA value is equal to another SSA value, what you're saying here is that inside the IR this isn't the case and a transformation that would make an SSA value a block argument for example isn't valid.

This seems fragile and extremely uncommon (I don't know of any precent in any dialect I worked on), so I'd insist to look for any alternative to such a strong design constraint: the bar to justify this should be really high and we're not there yet IMO.

mehdi_amini: I don't feel this is a clear answer. What does "To avoid to introduce arrays universally" means…

clementvalAuthorUnsubmitted

Done

I think the restriction is probably not phrased correctly. What we want to mean is that the value must trace back transitively to an array_load as the dominating source.
We have places where those SSA value are block arguments.

For example, this is totally fine FIR:

%1 = array_load
fir.loop ... iter_args (%2 = %1) ...
  %3 = array_fetch %2

clementval: I think the restriction is probably not phrased correctly. What we want to mean is that the…

mehdi_aminiUnsubmitted

Not Done

For example, this is totally fine FIR:

%1 = array_load
fir.loop ... iter_args (%2 = %1) ...
  %3 = array_fetch %2

Seems like %2 after the first iteration will be the result of the yield of the previous iteration: what will produce this value?
Unless it is an array_load, your condition does not seem met to me.

mehdi_amini: ``` For example, this is totally fine FIR: %1 = array_load fir.loop ... iter_args (%2 = %1) ...

clementvalAuthorUnsubmitted

Done

This is a more complete example:

%7 = fir.array_load %arg0 [%6] : (!fir.box<!fir.array<?xf32>>, !fir.slice<1>) 
-> !fir.array<?xf32>
...
%18 = fir.do_loop %arg5 = %c0 to %17 step %c1 iter_args(%arg6 = %7) -> (!fir.array<?xf32>) {
  %19 = fir.array_fetch %arg6, %arg5 : (!fir.array<?xf32>, index) -> f32
  %20 = fir.array_update %arg6, %19, %arg5 : (!fir.array<?xf32>, f32, index) -> !fir.array<?xf32>
  fir.result %20 : !fir.array<?xf32>
}
fir.array_merge_store %7, %18 to %arg0[%6] : !fir.array<?xf32>, !fir.array<?xf32>, !fir.box<!fir.array<?xf32>>, !fir.slice<1>

clementval: This is a more complete example: ``` %7 = fir.array_load %arg0 [%6] : (!fir.box<!fir.array<?

mehdi_aminiUnsubmitted

Not Done

Right, so here array_fetch will consume the result of the array_update after the first iteration and not the result from an array_load.

It may be a problem of definition, when you write "that can be trace back transitively to an array_load " ; do you see array_update as "transparent" in here? If so what is the list of such "transparent" operations?

mehdi_amini: Right, so here `array_fetch` will consume the result of the `array_update` after the first…

## array_fetch

The `array_fetch` operation fetches the value of an element in an array value.

```fortran

real :: a(n,m)

...

... a ...

... a(r,s+1) ...

```

One can use `fir.array_fetch` to fetch the (implied) value of `a(i,j)` in

an array expression as shown above. It can also be used to extract the

element `a(r,s+1)` in the second expression.

```mlir

mehdi_aminiUnsubmitted

Done

Shouldn't the expression in Fortran above shows i and j somehow?

mehdi_amini: Shouldn't the expression in Fortran above shows i and j somehow?

clementvalAuthorUnsubmitted

Done

This Fortran example will generate an implied loop so the actual i and j are not present in the user code but will be generated.

clementval: This Fortran example will generate an implied loop so the actual `i` and `j` are not present in…

%s = fir.shape %n, %m : (index, index) -> !fir.shape<2>

// load the entire array 'a'

%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>

// fetch the value of one of the array value's elements

%1 = fir.array_fetch %v, %i, %j : (!fir.array<?x?xf32>, index, index) -> f32

```

It is only possible to use `array_fetch` on an `array_load` result value or a

value that can be trace back transitively to an `array_load` as the dominating

source. Other array operation such as `array_update` can be in between.

## array_update

The `array_update` operation is used to update the value of an element in an

array value. A new array value is returned where all element values of the input

array are identical except for the selected element which is the value passed in

the update.

```fortran

mehdi_aminiUnsubmitted

Done

## array_access

- The `array_access` operationis used to fetch the memory reference of an element

+ The `array_access` operation is used to fetch the memory reference of an element

in an array value.

mehdi_amini:

real :: a(n,m)

mehdi_aminiUnsubmitted

Done

I'm not comfortable with this description: from our previous discussion as it is described it looks like this creates a mutable reference to an element in the immutable array.
We converged last time on a semantics where this implicitly creates a new memory location with automatic-allocation scope and copy the value there, as well as remembering the original coordinates this value came from.
But this is a bit iffy as well: it seems like a Ref<i32> can't be universally defined where it comes from an array_access or from a fir.alloca.

This seems a bit loosely defined to me right now.

mehdi_amini: I'm not comfortable with this description: from our previous discussion as it is described it…

clementvalAuthorUnsubmitted

Done

Description updated.

clementval: Description updated.

...

a = ...

```

One can use `fir.array_update` to update the (implied) value of `a(i,j)`

in an array expression as shown above.

```mlir

%s = fir.shape %n, %m : (index, index) -> !fir.shape<2>

// load the entire array 'a'

%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>

// update the value of one of the array value's elements

// %r_{ij} = %f if (i,j) = (%i,%j), %v_{ij} otherwise

%r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32>

fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>>

```

An array value update behaves as if a mapping function from the indices

to the new value has been added, replacing the previous mapping. These

mappings can be added to the ssa-value, but will not be materialized in

memory until the `fir.array_merge_store` is performed.

`fir.array_update` can be seen as an array access with a notion that the array

mehdi_aminiUnsubmitted

Done

Same question: why do we need this restriction?

mehdi_amini: Same question: why do we need this restriction?

clementvalAuthorUnsubmitted

Done

See answer above.

clementval: See answer above.

will be changed at the accessed position when `fir.array_merge_store` is

performed.

## array_access

mehdi_aminiUnsubmitted

Not Done

You need to update the description of the "array type" to mention what this means.

mehdi_amini: You need to update the description of the "array type" to mention what this means.

clementvalAuthorUnsubmitted

Done

Which "array type" are you referencing to? (!fir.array)?

clementval: Which "array type" are you referencing to? (`!fir.array`)?

The `array_access` provides a reference to a single element from an array value.

This is *not* a view in the immutable array, otherwise it couldn't be stored to.

It can be see as a logical copy of the element and its position in the array.

Tis reference can be written to and modified withoiut changing the original

array.

The `array_access` operation is used to fetch the memory reference of an element

in an array value.

```fortran

mehdi_aminiUnsubmitted

Not Done

I wonder why array_amend is needed at all actually.
When can't it just be replaced with a load of the ref<T> and an array_update?

mehdi_amini: I wonder why array_amend is needed at all actually. When can't it just be replaced with a load…

clementvalAuthorUnsubmitted

Done

In some cases we do not want to load derived types/characters because they may have non-compile time constant sizes (cannot be loaded/might be big aggregates for which load/store is problematic (cause long LLVM compile time + asserts at a certain point).

clementval: In some cases we do not want to load derived types/characters because they may have non-compile…

mehdi_aminiUnsubmitted

Not Done

Can you document this with an example illustrating such a case?

mehdi_amini: Can you document this with an example illustrating such a case?

clementvalAuthorUnsubmitted

Done

Here is an example with character assignment.

Fortran

subroutine foo(c1, c2, n)
  integer(8) :: n
  character(n) :: c1(:), c2(:)
  c1 = c2
end subroutine

That would results in a cleaned-up FIR that looks like this:

func @_QPfoo(%arg0: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg1: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg2: !fir.ref<i64>) {
    %0 = fir.load %arg2 : !fir.ref<i64>
    %c0 = arith.constant 0 : index
    %1:3 = fir.box_dims %arg0, %c0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>, index) -> (index, index, index)
    %2 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>>
    %3 = fir.array_load %arg1 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>>
    %c1 = arith.constant 1 : index
    %4 = arith.subi %1#1, %c1 : index
    %5 = fir.do_loop %arg3 = %c0 to %4 step %c1 unordered iter_args(%arg4 = %2) -> (!fir.array<?x!fir.char<1,?>>) {
      %6 = fir.array_access %3, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>>
      %7 = fir.array_access %arg4, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>>
      %false = arith.constant false
      %8 = fir.convert %7 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
      %9 = fir.convert %6 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>
      fir.call @llvm.memmove.p0i8.p0i8.i64(%8, %9, %0, %false) : (!fir.ref<i8>, !fir.ref<i8>, i64, i1) -> ()
      %10 = fir.array_amend %arg4, %7 : (!fir.array<?x!fir.char<1,?>>, !fir.ref<!fir.char<1,?>>) -> !fir.array<?x!fir.char<1,?>>
      fir.result %10 : !fir.array<?x!fir.char<1,?>>
    }
    fir.array_merge_store %2, %5 to %arg0 : !fir.array<?x!fir.char<1,?>>, !fir.array<?x!fir.char<1,?>>, !fir.box<!fir.array<?x!fir.char<1,?>>>
    return
  }
  func private @llvm.memmove.p0i8.p0i8.i64(!fir.ref<i8>, !fir.ref<i8>, i64, i1)
}

We do not want to start loading !fir.ref<!fir.char<1,?>> here because it has dynamic length, and even if constant, could be way too long.

clementval: Here is an example with character assignment. Fortran ``` subroutine foo(c1, c2, n) integer…

mehdi_aminiUnsubmitted

Not Done

Uh, I'm not totally understanding all the pieces here, in particular you're converting the references to !fir.char<1,?> to simple i8 reference and use the N arguments for the memmove size, but at the same time you're saying that they are dynamic length.

if constant, could be way too long.

You're saying that basically array_amend is a fused form of load+update because the optimizer bails out and can't reconstruct the fused form later in the lowering?

All of this may be fine: it just deserves a lot of documentation I think.

mehdi_amini: Uh, I'm not totally understanding all the pieces here, in particular you're converting the…

clementvalAuthorUnsubmitted

Done

array_update serves two purposes: it is an array access, and it denotes the array as being changed. The lhs is distinguished from everything on the rhs.

Fortran arrays are implicitly memory bound as are some other Fortran type/kind entities. For entities that can be atomically promoted to the value domain, we use array_fetch and array_update. For those type/kinds that cannot be promoted to values, we must leave them in a memory reference domain, and we use array_access and array_amend.

array_access and array_amend split the two purposes of array_update into two distinct ops. array_access captures the array access semantics and array_amend denotes which array_access is the lhs.

clementval: `array_update` serves two purposes: it is an array access, and it denotes the array as being…

real :: a(n,m)

...

... a ...

... a(r,s+1) ...

rovkaUnsubmitted

Done

## Array value copy pass

- One of the main purpose of the array operations present in FIR is to be able to

+ One of the main purposes of the array operations present in FIR is to be able to

perform the dependence analysis and elide copies where possible with a MLIR

rovka:

```

One can use `fir.array_access` to recover the implied memory reference to

rovkaUnsubmitted

Done

pass. This pass is called the `array-value-copy` pass.

- The analysis detects if there are any conflicts. A conflicts is when one of the

+ The analysis detects if there are any conflicts. A conflict is when one of the

following cases occurs:

rovka:

rovkaUnsubmitted

Done

So, IIUC, the 'array-value-copy' is both an analysis pass and a transformation pass, right? I think it would be good to clarify that up front, because I wasn't 100% certain of that until I saw there's a before/after example below. Also explain exactly what the pass does, since it seems to be more than just copy elision, which was what I first understood when I read this text.

rovka: So, IIUC, the 'array-value-copy' is both an analysis pass and a transformation pass, right? I…

clementvalAuthorUnsubmitted

Done

The transformation needs an analysis first to be performed. More description of the pass should go directly in the pass file itself. The purpose of having an example here is just to show some array operations in use in a bigger example.

clementval: The transformation needs an analysis first to be performed. More description of the pass should…

the element `a(i,j)` in an array expression `a` as shown above. It can also

be used to recover the reference element `a(r,s+1)` in the second

expression.

```mlir

%s = fir.shape %n, %m : (index, index) -> !fir.shape<2>

// load the entire array 'a'

%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32>

// fetch the value of one of the array value's elements

%1 = fir.array_access %v, %i, %j : (!fir.array<?x?xf32>, index, index) -> !fir.ref<f32>

```

It is only possible to use `array_access` on an `array_load` result value or a

value that can be trace back transitively to an `array_load` as the dominating

source. Other array operation such as `array_amend` can be in between.

`array_access` if mainly used with `character`'s arrays and arrays of derived

types where because they might have a non-compile time sizes that would be

useless too load entirely or too big to load.

Here is a simple example with a `character` array assignment.

Fortran

```

subroutine foo(c1, c2, n)

integer(8) :: n

character(n) :: c1(:), c2(:)

c1 = c2

end subroutine

```

It results in this cleaned-up FIR:

```

func @_QPfoo(%arg0: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg1: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg2: !fir.ref<i64>) {

%0 = fir.load %arg2 : !fir.ref<i64>

%c0 = arith.constant 0 : index

%1:3 = fir.box_dims %arg0, %c0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>, index) -> (index, index, index)

%2 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>>

%3 = fir.array_load %arg1 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>>

%c1 = arith.constant 1 : index

%4 = arith.subi %1#1, %c1 : index

%5 = fir.do_loop %arg3 = %c0 to %4 step %c1 unordered iter_args(%arg4 = %2) -> (!fir.array<?x!fir.char<1,?>>) {

%6 = fir.array_access %3, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>>

%7 = fir.array_access %arg4, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>>

%false = arith.constant false

%8 = fir.convert %7 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>

%9 = fir.convert %6 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8>

fir.call @llvm.memmove.p0i8.p0i8.i64(%8, %9, %0, %false) : (!fir.ref<i8>, !fir.ref<i8>, i64, i1) -> ()

%10 = fir.array_amend %arg4, %7 : (!fir.array<?x!fir.char<1,?>>, !fir.ref<!fir.char<1,?>>) -> !fir.array<?x!fir.char<1,?>>

fir.result %10 : !fir.array<?x!fir.char<1,?>>

}

fir.array_merge_store %2, %5 to %arg0 : !fir.array<?x!fir.char<1,?>>, !fir.array<?x!fir.char<1,?>>, !fir.box<!fir.array<?x!fir.char<1,?>>>

return

}

func private @llvm.memmove.p0i8.p0i8.i64(!fir.ref<i8>, !fir.ref<i8>, i64, i1)

}

```

`fir.array_access` and `fir.array_amend` split the two purposes of

`fir.array_update` into two distinct operations to work on type/kind that must

reside in the memory reference domain. `fir.array_access` captures the array

access semantics and `fir.array_amend` denotes which `fir.array_access` is the

lhs.

We do not want to start loading the entire `!fir.ref<!fir.char<1,?>>` here since

it has dynamic length, and even if constant, could be too long to do so.

## array_amend

The `array_amend` operation marks an array value as having been changed via a

reference obtain by an `array_access`. It acts as a logical transaction log

that is used to merge the final result back with an `array_merge_store`

mehdi_aminiUnsubmitted

Done

It'd help if this code was simplified, for example the alloca for the loop induction variable seems spurious and just adds noise, similarly for the redundant conversions.
The SSA value could get name for readability of the example as well.
And finally some inline comment in the code to describe it and map it to the original source code would really help I think.

func @_QPs(%arg0: !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, %arg1: !fir.ref<i32>, %arg2: !fir.ref<i32>) {
  %l = fir.load %arg1 : !fir.ref<i32>
  %l_index = fir.convert %1 : (i32) -> index
  %u = fir.load %arg2 : !fir.ref<i32>
  %u_index = fir.convert %u : (i32) -> index
  %c1 = arith.constant 1 : index
  // This is the "copy-in" array used on the RHS of the expression. It will be indexed into and loaded at each iteration.
  %array_a_src = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>

  // This is the "seed" for the "copy-out" array on the LHS. It'll flow from iteration to iteration and gets
  // updated at each iteration.
  %array_a_dest_init = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>
  
  %array_a_final = fir.do_loop %i = %l_index to %u_index step %c1 unordered iter_args(%array_a_dest = % array_a_dest_init) -> (!fir.array<?x!fir.type<_QFsTt{m:i32}>>) {
    // Compute indexing for the RHS and array the element.
    %u_minus_i = arith.subi %u, %I : index // u-i
    %u_minus_i_plus_one = arith.addi %u_minus_i, %c1: index // u-i+1
    %a_src_ref = fir.array_access %array_a_src, %u_minus_i_plus_one {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>>
    %a_src_elt = fir.load %a_src_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>>

    // Get the reference to the element in the array on the LHS
    %a_dst_ref = fir.array_access %array_a_dest, %i {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>>

    // Store the value, and update the array
    fir.store %a_src_elt to %a_dst_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>>
    %updated_array_a = fir.array_amend %array_a, %a_dst_ref : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.ref<!fir.type<_QFsTt{m:i32}>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>

    // Forward the current updated array to the next iteration.
    fir.result %updated_array_a : !fir.array<?x!fir.type<_QFsTt{m:i32}>>
  }
  // Store back the result by merging the initial value loaded before the loop
  // with the final one produced by the loop.
  fir.array_merge_store %array_a_dest_init, %array_a_final to %arg0 : !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>
  return
}

This is an interesting example, and before even getting into the optimization section, it could be useful to explain how this works, and only then get into how it is optimized.
You introduced each operation individually, but seeing it all fit together is another story.

mehdi_amini: It'd help if this code was simplified, for example the alloca for the loop induction variable…

clementvalAuthorUnsubmitted

Done

We can simplify and add more explanation to the example for sure.

clementval: We can simplify and add more explanation to the example for sure.

clementvalAuthorUnsubmitted

Done

I update this section and kept just the example with annotation. I think it makes more sense to keep this documentation related to the array operations. Finer documentation on array-value-copy is probably better in the pass itself as comment or in a separate documentation.

clementval: I update this section and kept just the example with annotation. I think it makes more sense to…

operation.

```mlir

// fetch the value of one of the array value's elements

%1 = fir.array_access %v, %i, %j : (!fir.array<?x?xT>, index, index) -> !fir.ref<T>

// modify the element by storing data using %1 as a reference

%2 = ... %1 ...

// mark the array value

%new_v = fir.array_amend %v, %2 : (!fir.array<?x?xT>, !fir.ref<T>) -> !fir.array<?x?xT>

```

## Example

Here is an example of a FIR code using several array operations together. The

example below is a simplified version of the FIR code comiing from the

following Fortran code snippet.

```fortran

subroutine s(a,l,u)

type t

integer m

end type t

type(t) :: a(:)

integer :: l, u

forall (i=l:u)

a(i) = a(u-i+1)

end forall

end

```

func @_QPs(%arg0: !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, %arg1: !fir.ref<i32>, %arg2: !fir.ref<i32>) {

%l = fir.load %arg1 : !fir.ref<i32>

%l_index = fir.convert %l : (i32) -> index

%u = fir.load %arg2 : !fir.ref<i32>

%u_index = fir.convert %u : (i32) -> index

%c1 = arith.constant 1 : index

// This is the "copy-in" array used on the RHS of the expression. It will be indexed into and loaded at each iteration.

%array_a_src = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>

// This is the "seed" for the "copy-out" array on the LHS. It'll flow from iteration to iteration and gets

// updated at each iteration.

%array_a_dest_init = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>

%array_a_final = fir.do_loop %i = %l_index to %u_index step %c1 unordered iter_args(%array_a_dest = %array_a_dest_init) -> (!fir.array<?x!fir.type<_QFsTt{m:i32}>>) {

// Compute indexing for the RHS and array the element.

%u_minus_i = arith.subi %u_index, %i : index // u-i

%u_minus_i_plus_one = arith.addi %u_minus_i, %c1: index // u-i+1

%a_src_ref = fir.array_access %array_a_src, %u_minus_i_plus_one {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>>

%a_src_elt = fir.load %a_src_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>>

// Get the reference to the element in the array on the LHS

%a_dst_ref = fir.array_access %array_a_dest, %i {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>>

// Store the value, and update the array

fir.store %a_src_elt to %a_dst_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>>

%updated_array_a = fir.array_amend %array_a_dest, %a_dst_ref : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.ref<!fir.type<_QFsTt{m:i32}>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>>

// Forward the current updated array to the next iteration.

fir.result %updated_array_a : !fir.array<?x!fir.type<_QFsTt{m:i32}>>

}

rovkaUnsubmitted

Done

%14 = arith.subi %13, %c1_3 : index

- // Move tha value back from the copy to the original array

+ // Move the value back from the copy to the original array

fir.do_loop %arg3 = %c0_2 to %14 step %c1_3 {

rovka:

// Store back the result by merging the initial value loaded before the loop

// with the final one produced by the loop.

fir.array_merge_store %array_a_dest_init, %array_a_final to %arg0 : !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>

return

}

```

This is an archive of the discontinued LLVM Phabricator instance.

[fir] Add array operations documentationClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 401889

flang/docs/FIRArrayOperations.md

[fir] Add array operations documentation
ClosedPublic