This is an archive of the discontinued LLVM Phabricator instance.

[flang] RFC: -fstack-arrays implementation
ClosedPublic

Authored by tblah on Dec 8 2022, 3:10 AM.

Details

Summary

From gfortran's man page:

Adding this option will make the Fortran compiler put all local arrays, even those of unknown size onto stack memory. If your program uses very large local arrays it is possible that you will have to extend your runtime limits for stack memory on some operating systems. This flag is enabled by default at optimization level -Ofast.

This RFC provides an evaluation of the scope of work required to support -fstack-arrays as a pass responsible for moving array heap allocations to the stack.

Github issue: https://github.com/llvm/llvm-project/issues/59231

The proposed design works in most cases, the exceptions are

    • Where blocks: the generated code is too complex for the pass to follow. There are multiple known bugs in the generated code so the easiest course of action is likely to be to fix this in codegen.
  • Array constructors: the generated code uses realloc() because it is not always possible to tell ahead of time how large the constructed array will be. This makes it difficult to rewrite the generated code to use alloca. gfortran is able to transform this case into a stack allocation. This more complex case will be left for future work.

For more information on these two cases see the section on ConvertExpr.cpp.

Diff Detail

Event Timeline

tblah created this revision.Dec 8 2022, 3:10 AM
Herald added a project: Restricted Project. · View Herald Transcript
tblah requested review of this revision.Dec 8 2022, 3:10 AM

Thanks for posting this RFC @tblah. Could you also comment on why a general pass like MemoryAllocationOpt that converts Stack Allocation to Heap Allocation is not suitable here? Please include this in the RFC as an alternative option that was considered.

flang/docs/fstack-arrays.md
20

Could you add a FIR code example of the state now and after your change?

43

Could you expand this a bit more, particularly why stack allocation cannot be used?

peixin added inline comments.Dec 8 2022, 4:10 AM
flang/docs/fstack-arrays.md
33

Why? Is it possible to transform fir.allocmem into fir.alloca in one place when the option is enabled in codegen?

How about adding one unit attribute to fir.allocmem to label that the data object cannot be allocated on stack?

43

+1

Could you also comment on why a general pass like MemoryAllocationOpt that converts Stack Allocation to Heap Allocation is not suitable here? Please include this in the RFC as an alternative option that was considered.

That would be good to know. Modifying the behavior in all the mentioned places sounds a bit hacky and error prone in the future. How are you going to ensure future code will follow what you propose here. I think a centralized pass is preferred even if it requires extra attributes or smth similar.

Note that lowering is being updated and that in the future, temporaries for expressions will be created in HLFIR bufferization pass: https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp.

tblah updated this revision to Diff 481355.Dec 8 2022, 10:27 AM
tblah edited the summary of this revision. (Show Details)

Updates:

  • Organize document according to the format recommended in DesignGuideline.md
  • Explain why a piecemeal approach was chosen instead of a single pass performing this conversion
  • Mention that most of the referenced code will be replaced by HLFIR
  • Include an example of the changes to array value copy
  • Provide a more detailed explanation of ConvertVariable.cpp
  • Add a testing plan
tblah added a comment.Dec 8 2022, 10:32 AM

Could you also comment on why a general pass like MemoryAllocationOpt that converts Stack Allocation to Heap Allocation is not suitable here? Please include this in the RFC as an alternative option that was considered.

That would be good to know. Modifying the behavior in all the mentioned places sounds a bit hacky and error prone in the future. How are you going to ensure future code will follow what you propose here. I think a centralized pass is preferred even if it requires extra attributes or smth similar.

Thanks for taking a look. I have explained my reasoning in the updated version of the patch. In short, it isn't easy to match up fir.allocmem and fir.free.

Note that lowering is being updated and that in the future, temporaries for expressions will be created in HLFIR bufferization pass: https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp.

Thanks for the heads up. I've added a mention of this to the document. Would it be okay to come back to HLFIR changes in a later patch to this RFC?

tblah marked 3 inline comments as done.Dec 8 2022, 10:34 AM
tblah added inline comments.
flang/docs/fstack-arrays.md
33

Do you mean an attribute meaning something like "this is a real allocate statement not something added during code generation"?

Please see my explanation for my reasoning in the updated version of the patch.

clementval added a comment.EditedDec 8 2022, 11:15 AM

Could you also comment on why a general pass like MemoryAllocationOpt that converts Stack Allocation to Heap Allocation is not suitable here? Please include this in the RFC as an alternative option that was considered.

That would be good to know. Modifying the behavior in all the mentioned places sounds a bit hacky and error prone in the future. How are you going to ensure future code will follow what you propose here. I think a centralized pass is preferred even if it requires extra attributes or smth similar.

Thanks for taking a look. I have explained my reasoning in the updated version of the patch. In short, it isn't easy to match up fir.allocmem and fir.free.

Ok but it can be made easy to match if necessary. I still think a centralized way of doing is preferable.

peixin added inline comments.Dec 8 2022, 5:42 PM
flang/docs/fstack-arrays.md
4

Can you add more descriptions about what kind of arrays will be allocated on the stack? It would be good to add the description of -fstack-arrays of LLVM Flang and maintain it each time the lowering/codegen code is changed.

16

Can you give one example that it is hard to match the fir.allocmem op with associated fir.freemem?

33

No. What I mean is to add one unit attribute, maybe named mustHeapAlloc. In lowering code, you won't need to add one if-else of fir.allocmem and fir.alloca. Just add the unit attribute if fstack-arrays is enabled.

I am thinking if we can also add one unit attribute to fir.freemem, too. Then, we can make the transformation in one pass or somewhere in codegen at one time.

  1. If the pattern of what can be transformed from heap allocation into stack sllocation can be captured, then no need to change anything in lowering code. Do it directly in the single-maintained place (in one pass or somewhere in codegen).
  2. For special cases, add the unit attirbute in lowering code, make the transformation by checking the attribute in the single-maintained place (in one pass or somewhere in codegen).
55

Do we use git diff style to show the FIR changes in doc files? Maybe removing the unrelated FIR is more clear such as follows.

temp-arrays.fir
%1 = fir.allocmem !fir.array<5xi32>
... // other FIR code including accessing %1
fir.freemem %1 : !fir.heap<!fir.array<5xi32>>

// Maybe describe what -fstack-arrays does to generate the following?

temp-arrays-stack-allocation.fir
%1 = fir.alloca !fir.array<5xi32>
... // other FIR code including accessing %1
// No fir.freemem for stack allocated %1
tblah added inline comments.Dec 9 2022, 3:50 AM
flang/docs/fstack-arrays.md
16

The simplest example would be when the fir.freemem is in another fir.func. But these couldn't be converted to stack allocations anyway.

Another example:

subroutine example(n, arr)
  integer, intent(in) :: n
  integer, dimension(:), allocatable, intent(out) :: arr

  if (n .lt. 1) then
    return
  end if

  allocate(arr(n))

  call doSomething(arr)

  if (isFinished(arr)) then
    deallocate(arr)
  end if

! arr might still be alive here
end subroutine

Data flow analysis would be needed to determine whether arr has a lifetime which outlives the fir.func in which is was first allocated (and so could not be moved onto the stack).

Thanks for the heads up. I've added a mention of this to the document. Would it be okay to come back to HLFIR changes in a later patch to this RFC?

Yes, that sounds reasonable to me.

peixin added inline comments.Dec 9 2022, 11:15 PM
flang/docs/fstack-arrays.md
16

Can -fstack-arrays be applied to allocatable arrays? I checked gfortran and classic-flang, both of which don't allocate memory on stack for this case when fstack-arrays is enabled.

tblah added inline comments.Dec 10 2022, 2:44 AM
flang/docs/fstack-arrays.md
16

Thanks for checking. My concern is that a general transformation will have to understand FIR for any situation where fir.allocmem and fir.freemem are generated and so will have to be able to distinguish, from the FIR, situations like this from those where arrays can be moved onto the stack.

If instead the codegen was modified, the context would be still available and so it would be easy to only try moving the right sort of allocations.

Another approach might to be add an attribute either saying "consider moving this heap allocation to the stack" or "never move this heap allocation to the stack". That could simplify things, but then we would have the same problem of having to modify all the right sections of code-gen, and keep that maintained going forward.

peixin added inline comments.Dec 10 2022, 7:42 PM
flang/docs/fstack-arrays.md
16

Sounds reasonable to me. Thanks.

Could you have an array builder/materialiser component for creating arrays? Despite the code being spread over the lowering, there is a central component in charge of arrays?!?

clementval added inline comments.Dec 14 2022, 12:30 AM
flang/docs/fstack-arrays.md
16

That could simplify things, but then we would have the same problem of having to modify all the right sections of code-gen, and keep that maintained going forward.

You mean lowering right? If an attribute is needed I thing it's a nicer solution than having an if-else at every place we create an allocmemop/freememop.

tblah updated this revision to Diff 482880.Dec 14 2022, 9:07 AM
tblah edited the summary of this revision. (Show Details)

Thank you everyone for the feedback so far.

Following the feedback, I have decided to change my approach and implement this as a pass responsible for moving array heap allocations to the stack.

Data flow analysis using the MLIR DFA framework will be used to find allocations where both the fir.allocmem and fir.freemem are in the same function, operating on the same SSA value, and the fir.freemem will always be executed before function return.

Suitable allocations will then be moved to the stack (if possible without introducing a fir.alloca inside of a loop).

fir.allocmem generated by an allocate statement in source code will not be converted, as the standard seems to require these remain on the heap (C.12.1.2).

tblah marked 4 inline comments as done.Dec 14 2022, 9:09 AM
tblah updated this revision to Diff 483934.Dec 19 2022, 5:57 AM

Add a section naming a new FIR attribute to mark allocations which should never be moved to the stack: fir.must_be_heap

tblah updated this revision to Diff 486207.Jan 4 2023, 2:33 AM
tblah edited the summary of this revision. (Show Details)
  • Added a paragraph explaining handling of fir.call operations during data flow analysis
  • Added information about the two known cases where the existing analysis is insufficient (see the section on ConvertExpr.cpp)

I would like to merge what exists already for the easy cases and return to the two edge cases once HLFIR work is more complete.

jeanPerier added inline comments.Jan 4 2023, 10:17 AM
flang/docs/fstack-arrays.md
69
peixin added inline comments.Jan 5 2023, 4:42 AM
flang/docs/fstack-arrays.md
8
  1. I think you are referring to Ubuntu Manpage: gfortran (https://manpages.ubuntu.com/manpages/trusty/man1/arm-linux-gnueabi-gfortran.1.html), instead of gfortran manual page (https://gcc.gnu.org/onlinedocs/gfortran/Code-Gen-Options.html).
  1. Is it allowed to copy it from the GCC manual page in LLVM doc files? I don't know.
tblah updated this revision to Diff 486538.Jan 5 2023, 5:45 AM
tblah marked 5 inline comments as done.
  • Remove quote from gfortran's man page
  • Fix typo
tblah marked an inline comment as done.Jan 5 2023, 5:46 AM
tblah added inline comments.
flang/docs/fstack-arrays.md
8

Thanks for pointing this out. I have replaced the direct quote with a rephrasing.

69

Thanks!

tblah updated this revision to Diff 488640.Jan 12 2023, 7:00 AM
tblah marked an inline comment as done.
  • Add mention of CFG loops, removed TODO for goto statements
  • Specified the conditions under which llvm.stacksave/llvm.stackrestore intrinsics are safe to use
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 7:00 AM
tblah added a comment.Feb 7 2023, 2:29 AM

Ping for review on this. The pass implementation is now merged.

jeanPerier accepted this revision.Feb 13 2023, 2:37 AM

Looks good, thanks for documenting this.

This revision is now accepted and ready to land.Feb 13 2023, 2:37 AM
This revision was automatically updated to reflect the committed changes.