This patch adds documentation on FIR array operations
and their usage.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
flang/docs/FIRArrayOperations.md | ||
---|---|---|
19–20 | Can we link to a doc explaining this? (assuming this exists on the web, otherwise it would be worth an introduction) | |
40 | ||
95 | It's not clear to me in this example why array_merge_store needs %v? | |
123 | Why this restriction? | |
139 | Shouldn't the expression in Fortran above shows i and j somehow? | |
158 | ||
159 | I'm not comfortable with this description: from our previous discussion as it is described it looks like this creates a mutable reference to an element in the immutable array. This seems a bit loosely defined to me right now. | |
181 | Same question: why do we need this restriction? | |
185 | You need to update the description of the "array type" to mention what this means. | |
196 | I wonder why array_amend is needed at all actually. | |
275 | It'd help if this code was simplified, for example the alloca for the loop induction variable seems spurious and just adds noise, similarly for the redundant conversions. func @_QPs(%arg0: !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, %arg1: !fir.ref<i32>, %arg2: !fir.ref<i32>) { %l = fir.load %arg1 : !fir.ref<i32> %l_index = fir.convert %1 : (i32) -> index %u = fir.load %arg2 : !fir.ref<i32> %u_index = fir.convert %u : (i32) -> index %c1 = arith.constant 1 : index // This is the "copy-in" array used on the RHS of the expression. It will be indexed into and loaded at each iteration. %array_a_src = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> // This is the "seed" for the "copy-out" array on the LHS. It'll flow from iteration to iteration and gets // updated at each iteration. %array_a_dest_init = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> %array_a_final = fir.do_loop %i = %l_index to %u_index step %c1 unordered iter_args(%array_a_dest = % array_a_dest_init) -> (!fir.array<?x!fir.type<_QFsTt{m:i32}>>) { // Compute indexing for the RHS and array the element. %u_minus_i = arith.subi %u, %I : index // u-i %u_minus_i_plus_one = arith.addi %u_minus_i, %c1: index // u-i+1 %a_src_ref = fir.array_access %array_a_src, %u_minus_i_plus_one {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> %a_src_elt = fir.load %a_src_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>> // Get the reference to the element in the array on the LHS %a_dst_ref = fir.array_access %array_a_dest, %i {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> // Store the value, and update the array fir.store %a_src_elt to %a_dst_ref : !fir.ref<!fir.type<_QFsTt{m:i32}>> %updated_array_a = fir.array_amend %array_a, %a_dst_ref : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.ref<!fir.type<_QFsTt{m:i32}>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> // Forward the current updated array to the next iteration. fir.result %updated_array_a : !fir.array<?x!fir.type<_QFsTt{m:i32}>> } // Store back the result by merging the initial value loaded before the loop // with the final one produced by the loop. fir.array_merge_store %array_a_dest_init, %array_a_final to %arg0 : !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>> return } This is an interesting example, and before even getting into the optimization section, it could be useful to explain how this works, and only then get into how it is optimized. |
thanks for adding the doc! That's gonna be very helpful for onboarding people I think
flang/docs/FIRArrayOperations.md | ||
---|---|---|
22 | Nit: you might want to rephrase this so we don't have to change the number if we ever have to add or remove one of these. | |
66 | ||
71 | ||
79 | ||
200 | ||
203 | ||
203 | So, IIUC, the 'array-value-copy' is both an analysis pass and a transformation pass, right? I think it would be good to clarify that up front, because I wasn't 100% certain of that until I saw there's a before/after example below. Also explain exactly what the pass does, since it seems to be more than just copy elision, which was what I first understood when I read this text. | |
336 |
Fix typos
flang/docs/FIRArrayOperations.md | ||
---|---|---|
203 | The transformation needs an analysis first to be performed. More description of the pass should go directly in the pass file itself. The purpose of having an example here is just to show some array operations in use in a bigger example. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | In this example it has everything but it's not always the case like in the example below: func @conversion_with_temporary(%arr0 : !fir.ref<!fir.array<10xi32>>) { %c10 = arith.constant 10 : index %1 = fir.shape %c10 : (index) -> !fir.shape<1> %2 = fir.array_load %arr0(%1) : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>) -> !fir.array<10xi32> %c10_i64 = arith.constant 10 : i64 %3 = fir.convert %c10_i64 : (i64) -> index %c1_i64 = arith.constant 1 : i64 %c-1_i64 = arith.constant -1 : i64 %4 = fir.shape %c10 : (index) -> !fir.shape<1> %5 = fir.slice %c10_i64, %c1_i64, %c-1_i64 : (i64, i64, i64) -> !fir.slice<1> %6 = fir.array_load %arr0(%4) [%5] : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>, !fir.slice<1>) -> !fir.array<10xi32> %c1 = arith.constant 1 : index %c0 = arith.constant 0 : index %7 = arith.subi %3, %c1 : index %8 = fir.do_loop %arg0 = %c0 to %7 step %c1 unordered iter_args(%arg1 = %2) -> (!fir.array<10xi32>) { %9 = fir.array_fetch %6, %arg0 : (!fir.array<10xi32>, index) -> i32 %10 = fir.array_update %arg1, %9, %arg0 : (!fir.array<10xi32>, i32, index) -> !fir.array<10xi32> fir.result %10 : !fir.array<10xi32> } fir.array_merge_store %2, %8 to %arr0 : !fir.array<10xi32>, !fir.array<10xi32>, !fir.ref<!fir.array<10xi32>> return } | |
139 | This Fortran example will generate an implied loop so the actual i and j are not present in the user code but will be generated. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
123 | The array "abstraction" in FIR is not meant to be used anywhere. Fortran itself doesn't have "array values" in any sane sense of that term. To avoid to introduce arrays universally in FIR we restrict the use of the abstraction to precisely where we want it and where it is useful. | |
181 | See answer above. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
19–20 | We have this in the doc: https://github.com/flang-compiler/f18-llvm-project/blob/main/flang/docs/FortranForCProgrammers.md We can probably add a small introduction here. | |
275 | We can simplify and add more explanation to the example for sure. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | I don't quite get the different in this example? | |
123 | I don't feel this is a clear answer. What does "To avoid to introduce arrays universally" means for example? In general: an SSA value is equal to another SSA value, what you're saying here is that inside the IR this isn't the case and a transformation that would make an SSA value a block argument for example isn't valid. This seems fragile and extremely uncommon (I don't know of any precent in any dialect I worked on), so I'd insist to look for any alternative to such a strong design constraint: the bar to justify this should be really high and we're not there yet IMO. |
Address comments
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | Agreed you can track it back all the way to fir.array_load. Since we know it during lowering why make it harder to track it and not just having it right away? | |
123 | I think the restriction is probably not phrased correctly. What we want to mean is that the value must trace back transitively to an array_load as the dominating source. For example, this is totally fine FIR: %1 = array_load fir.loop ... iter_args (%2 = %1) ... %3 = array_fetch %2 | |
159 | Description updated. | |
185 | Which "array type" are you referencing to? (!fir.array)? | |
196 | In some cases we do not want to load derived types/characters because they may have non-compile time constant sizes (cannot be loaded/might be big aggregates for which load/store is problematic (cause long LLVM compile time + asserts at a certain point). | |
275 | I update this section and kept just the example with annotation. I think it makes more sense to keep this documentation related to the array operations. Finer documentation on array-value-copy is probably better in the pass itself as comment or in a separate documentation. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | My comment explaining how %8 should be enough isn't about the ability tracking it back to the fir.array_load, this is about the SSA value being self consistent in the semantics model. It isn't clear to me in the description of fir.array_merge_store what is the first operand useful for. I suspect it helps tying thing together in some lowering later potentially, but I'm not sure how (and it isn't documented). | |
123 | For example, this is totally fine FIR: %1 = array_load fir.loop ... iter_args (%2 = %1) ... %3 = array_fetch %2 Seems like %2 after the first iteration will be the result of the yield of the previous iteration: what will produce this value? | |
196 | Can you document this with an example illustrating such a case? |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | Array expressions work with a context (such as a statement). FIR does not model Fortran statements directly. Instead we "bound" a region of FIR with array_loads and an array_merge_store. | |
123 | This is a more complete example: %7 = fir.array_load %arg0 [%6] : (!fir.box<!fir.array<?xf32>>, !fir.slice<1>) -> !fir.array<?xf32> ... %18 = fir.do_loop %arg5 = %c0 to %17 step %c1 iter_args(%arg6 = %7) -> (!fir.array<?xf32>) { %19 = fir.array_fetch %arg6, %arg5 : (!fir.array<?xf32>, index) -> f32 %20 = fir.array_update %arg6, %19, %arg5 : (!fir.array<?xf32>, f32, index) -> !fir.array<?xf32> fir.result %20 : !fir.array<?xf32> } fir.array_merge_store %7, %18 to %arg0[%6] : !fir.array<?xf32>, !fir.array<?xf32>, !fir.box<!fir.array<?xf32>>, !fir.slice<1> | |
196 | Here is an example with character assignment. Fortran subroutine foo(c1, c2, n) integer(8) :: n character(n) :: c1(:), c2(:) c1 = c2 end subroutine That would results in a cleaned-up FIR that looks like this: func @_QPfoo(%arg0: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg1: !fir.box<!fir.array<?x!fir.char<1,?>>>, %arg2: !fir.ref<i64>) { %0 = fir.load %arg2 : !fir.ref<i64> %c0 = arith.constant 0 : index %1:3 = fir.box_dims %arg0, %c0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>, index) -> (index, index, index) %2 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>> %3 = fir.array_load %arg1 : (!fir.box<!fir.array<?x!fir.char<1,?>>>) -> !fir.array<?x!fir.char<1,?>> %c1 = arith.constant 1 : index %4 = arith.subi %1#1, %c1 : index %5 = fir.do_loop %arg3 = %c0 to %4 step %c1 unordered iter_args(%arg4 = %2) -> (!fir.array<?x!fir.char<1,?>>) { %6 = fir.array_access %3, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>> %7 = fir.array_access %arg4, %arg3 : (!fir.array<?x!fir.char<1,?>>, index) -> !fir.ref<!fir.char<1,?>> %false = arith.constant false %8 = fir.convert %7 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8> %9 = fir.convert %6 : (!fir.ref<!fir.char<1,?>>) -> !fir.ref<i8> fir.call @llvm.memmove.p0i8.p0i8.i64(%8, %9, %0, %false) : (!fir.ref<i8>, !fir.ref<i8>, i64, i1) -> () %10 = fir.array_amend %arg4, %7 : (!fir.array<?x!fir.char<1,?>>, !fir.ref<!fir.char<1,?>>) -> !fir.array<?x!fir.char<1,?>> fir.result %10 : !fir.array<?x!fir.char<1,?>> } fir.array_merge_store %2, %5 to %arg0 : !fir.array<?x!fir.char<1,?>>, !fir.array<?x!fir.char<1,?>>, !fir.box<!fir.array<?x!fir.char<1,?>>> return } func private @llvm.memmove.p0i8.p0i8.i64(!fir.ref<i8>, !fir.ref<i8>, i64, i1) } We do not want to start loading !fir.ref<!fir.char<1,?>> here because it has dynamic length, and even if constant, could be way too long. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | I understand the "statement problem" in Fortran, but I don't quite make the direct connection in FIR. Basically you're saying "we're propagating the 'bound' of the statement here", but that does not really say why it is needed or useful? | |
123 | Right, so here array_fetch will consume the result of the array_update after the first iteration and not the result from an array_load. It may be a problem of definition, when you write "that can be trace back transitively to an array_load " ; do you see array_update as "transparent" in here? If so what is the list of such "transparent" operations? | |
196 | Uh, I'm not totally understanding all the pieces here, in particular you're converting the references to !fir.char<1,?> to simple i8 reference and use the N arguments for the memmove size, but at the same time you're saying that they are dynamic length.
You're saying that basically array_amend is a fused form of load+update because the optimizer bails out and can't reconstruct the fused form later in the lowering? All of this may be fine: it just deserves a lot of documentation I think. |
Update restrictions infos. Rest of the updates will come shortly.
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | It's useful because it bounds the second of the FIR that represents the statements like a FORALL. Statement have specific semantics and like I mentioned before it is useful for analysis for example to know the "bound" defined by array_load and array_merge_store. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | I feel we're going in rounds here, you keep telling it is useful and I keep asking you to elaborate on this: can you put it in the doc. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | Another way to approach it is to analyze "what if this operand wasn't provided?" and answer the question with a description of how/where the optimizer or the codegen would be infeasible or very inconvenient. |
Update the doc
flang/docs/FIRArrayOperations.md | ||
---|---|---|
196 | array_update serves two purposes: it is an array access, and it denotes the array as being changed. The lhs is distinguished from everything on the rhs. Fortran arrays are implicitly memory bound as are some other Fortran type/kind entities. For entities that can be atomically promoted to the value domain, we use array_fetch and array_update. For those type/kinds that cannot be promoted to values, we must leave them in a memory reference domain, and we use array_access and array_amend. array_access and array_amend split the two purposes of array_update into two distinct ops. array_access captures the array access semantics and array_amend denotes which array_access is the lhs. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | You pinged the revision but I don't think you actually followed up here, or did I miss something? |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | We have added some documentation in the last paragraph just below. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | OK, thanks. I won't have bandwidth right now to page-in again all the context from back when I was actively trying to understand this in detail, so it's probably better if you just land this now. |
flang/docs/FIRArrayOperations.md | ||
---|---|---|
95 | Thanks for the reviews. The documentation can probably still be improved here and there and I guess it will be updated over time when other in the community get a chance to look at it. |
Can we link to a doc explaining this? (assuming this exists on the web, otherwise it would be worth an introduction)