Support allocate statement with source in runtime version. The source
expression is evaluated only once for each allocate statement. When the
source expression has shape-spec, uses it for bounds. Otherwise, get
the bounds from the source expression. Get the length if the source
expression has deferred length parameter.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
- When I tried to support this feature using runtime, I found I need one temporary storage for non-contiguous variable, and that operation is actually one allocate-from-source-operation.
- This contains too many scenarios, and it is not easy to review from IR check. So should I add all the IR checks for the test cases in this patch. For now, I only add the IR check for scalar. Please notice this is RFC.
- I put the execution tests in https://github.com/llvm/llvm-project/issues/58927, and hope it can help the review process.
When I tried to support this feature using runtime, I found I need one temporary storage for non-contiguous variable, and that operation is actually one allocate-from-source-operation.
Could you be more specific here, which runtime did you try using, and why you need this copy exactly ? AllocatableAllocateSource signature is defined but not implemented, and that was the one I had in mind you would implement and use.
I am rather against having an inlined implementation for this using the current lowering/assignment lowering.
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
411 | The lower bounds should be the same as sourceExv: "When an ALLOCATE statement is executed for an array with no allocate-shape-spec-list, the bounds of source-expr determine the bounds of the array." (9.7.1.2 6). | |
489 | Won't this lead to re-evaluating the source-expr for each allocation object ? I think this is illegal if the source-expr may have side effects. |
Could you be more specific here, which runtime did you try using, and why you need this copy exactly ?
When the subscript-triplet has stride, I need to copy the array section to one contiguous storage since I did not find the helper function to add the stride info. Is it allowed to use sm (memory stride in bytes) of CFI_dim_t to add it to handle this case?
AllocatableAllocateSource signature is defined but not implemented, and that was the one I had in mind you would implement and use.
I tried to use AllocatableAllocateSource, but I haven't figured out how to handle the stride info. And another problem is what if the source expression is non-pointer non-allocatable variables and elem_len (element size in bytes) in CFI_cdesc_t would be unknown.
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
489 | Right, I shouldn't share the code of expression evaluating, and use ExvValue as argument. |
Why does it matters if the source expression has triplets, can you give an example ? I would expect you would use genExprBox() on the source expression and pass the descriptor value (fir.box) to the runtime regardless of what the expression is (they are likely some optimizations that one could think of depending on the what storage the source expression is evaluated in, e.g. in certain cases it could be moved into the allocate object, but let's keep this optimizations for when there is simple code able to yield correct code).
AllocatableAllocateSource signature is defined but not implemented, and that was the one I had in mind you would implement and use.
I tried to use AllocatableAllocateSource, but I haven't figured out how to handle the stride info. And another problem is what if the source expression is non-pointer non-allocatable variables and elem_len (element size in bytes) in CFI_cdesc_t would be unknown.
You do not really need to handle the stride info manually inside the runtime. I expect the code in AllocatableAllocateSource would be somehow similar to FortranAssign for the reallocation case (except in that for source allocation it should be an error if the allocatable/pointer is already allocated). See https://github.com/llvm/llvm-project/blob/main/flang/runtime/assign.cpp#L62.
Thanks a lot. I tried as you suggest. I don't need to handle the stride info manually. Using genExprBox should work for each case. Will change this PR with the runtime version of implementation later.
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
411 | I am confused about this description when I read this. Specifically, the "determine" is not clear in the standard. It does not clearly state that the allocate object have the same bounds of the source-expr. For the following case: program main integer, parameter :: n = 5 integer :: a(2:n+1) = [1, 2, 3, 4, 5] call test1(n, a) call test2(n, a) contains subroutine test1(n, a) integer, allocatable :: x1(:), x2(:), x3(:), x4(:) integer :: n, sss, a(n) allocate(x1, x2, source = a) allocate(x3, source = a(2:5:2)) allocate(x4, source = a(2:5), stat=sss) print *, "allocatable: ", lbound(x1), ubound(x1) print *, "allocatable: ", lbound(x2), ubound(x2) print *, "allocatable: ", lbound(x3), ubound(x3) print *, "allocatable: ", lbound(x4), ubound(x4) end subroutine test2(n, a) integer, pointer :: x1(:), x2(:), x3(:), x4(:) integer :: n, sss, a(n) allocate(x1, x2, source = a) allocate(x3, source = a(2:5:2)) allocate(x4, source = a(2:5), stat=sss) print *, "pointer: ", lbound(x1), ubound(x1) print *, "pointer: ", lbound(x2), ubound(x2) print *, "pointer: ", lbound(x3), ubound(x3) print *, "pointer: ", lbound(x4), ubound(x4) end end gfortran, ifort and classic-flang have the same output: allocatable: 1 5 allocatable: 1 5 allocatable: 1 2 allocatable: 1 4 pointer: 1 5 pointer: 1 5 pointer: 1 2 pointer: 1 4 |
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
411 | I read the "determine" in standard as telling "the new bounds are the one of the source expression". Here is an example where the lower bounds of the source expression are not ones: real :: src(2:11) real, allocatable :: x(:) allocate(x, source=src) print *, lbound(x), ubound(x) end gfortran, ifort and nvfortran all prints 2 11 as expected. (*): The program you gave above as an example contains two things that are counterintuitive in Fortran (IMHO):
|
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
411 | You are right. I realized I was wrong just now. Thanks for the clarification. For your test case, I used gcc 9.3.0, and it prints 1 10. It prints 2 11 when I use gcc 12. |
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
411 |
Funny, I only tried one of the latest version. They may have fixed it as a bug. |
flang/include/flang/Runtime/assign.h | ||
---|---|---|
31 | The comment doesn't adequately explain what this API does. What are the two Boolean arguments? What Fortran constructs would use this API instead of Assign()? Does this routine need to be in the API of the runtime support library at all, or is it only used internally? | |
flang/runtime/allocatable.cpp | ||
102 | Why is wasJustAllocated false? You just allocated it. |
Update runtime part to address the comments.
The comment doesn't adequately explain what this API does. What are the two Boolean arguments? What Fortran constructs would use this API instead of Assign()? Does this routine need to be in the API of the runtime support library at all, or is it only used internally?
The main difference of using Assign for allocate statement with source is that there is no realloc. I rethink about it. Add one more argument skipRealloc to Assign is more clear.
@peixin Are you planning more work on this patch or is this the final version to be reviewed?
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
525 | The lower bounds in the fir.box of a fir::BoxValue are unreliable, the code should use fir::factory::readLowerBound(builder, loc, sourceExv, i, one) to get it. | |
flang/runtime/assign.cpp | ||
167 | Is there code relying on doing assignments with mismatching element numbers ? That seems a bit wild, and although you added a std::min to make it well defined, it seems to me it most likely is a user error and I am not sure we should close the eyes in the runtime call. In an inline implementation, skipping the complexity of emitting the checking code makes sense to me, but here, adding complexity to optionally avoid the check just to match ifort and gfortran default blindness seems overkill to me, unless of course, there is a use case motivating this. |
flang/lib/Lower/Allocatable.cpp | ||
---|---|---|
525 | Thanks a lot. Will fix this in next update. | |
flang/runtime/assign.cpp | ||
167 | I hit this kind of errors in the test cases of classic-flang. I didn't capture one case in real workloads since this check is not enabled by gfortran/ifort/classic-flang. We can support the runtime check by default for now, and turn it back to the current settings when we hit the errors in real workloads. Do you prefer this? |
flang/include/flang/Runtime/assign.h | ||
---|---|---|
29–30 | If this is not an external API for the runtime, then this header file is not the right place to put it. | |
flang/runtime/allocatable.cpp | ||
103 | I don't think that the runtime check should ever be disabled, and certainly not disabled by default. | |
flang/runtime/assign.cpp | ||
196 | Don't recompute min() on every iteration. | |
205 | Don't recompute min() on every iteration. | |
215 | Don't recompute min() on every iteration. | |
224 | Don't recompute min() on every iteration. | |
258 | Don't recompute min() on every iteration. | |
flang/runtime/pointer.cpp | ||
146 | -fcheck-bounds is for indexing. This runtime check should always be enabled. |
Address the comments:
- Use fir::factory::readLowerBound to get the lower bound.
- Remove the runtime check option.
- Move the internal assignment to one new file.
flang/runtime/assign-object.h | ||
---|---|---|
1 ↗ | (On Diff #480882) | Why do you need a new header file and cpp file for this? |
flang/include/flang/Runtime/assign.h | ||
---|---|---|
29–30 | @clementval I add the new header file and cpp file because of this comment. I noticed that flang/include/flang/Runtime/ is where the external APIs belongs to. The internal API can be placed to flang/runtime/. Do I misunderstand this comment? |
flang/include/flang/Runtime/assign.h | ||
---|---|---|
29–30 | Why not keeping flang/runtime/assign.cpp and just adding flang/runtime/assign.h? |
flang/include/flang/Runtime/assign.h | ||
---|---|---|
29–30 | Thanks. Fixed. |
How is the scalar source handled? Here is what I see:
% cat sa.f90 program a integer n integer, allocatable :: i(:) n = 42 allocate(i(4), source=n) end program a % flang sa.f90 % a.out fatal Fortran runtime error(./sa.f90:5): Assign: mismatched ranks (1 != 0) in assignment to unallocated allocatable Aborted
Thanks @sscalpone for providing this case. The case of allocate array with scalar source is missed in this patch. Can I file a ticket for this case and support this in next patch?
If this is not an external API for the runtime, then this header file is not the right place to put it.