This is an archive of the discontinued LLVM Phabricator instance.

[flang] Fix overallocation by fir-to-llvm-ir pass
ClosedPublic

Authored by rovka on Jan 10 2022, 3:50 AM.

Details

Summary

When converting a fir.alloca of an array to the LLVM dialect, we used to
multiply the allocated size by all the constant factors encoded in the
array type. This is fine when the array type is converted to the element
type for the purposes of the allocation, but if it's converted to an
array type, then we might be allocating too much space. For example, for
%2 = fir.alloca !fir.array<8x16x32xf32>, %0, %1 we would allocate
%0 * %1 * 8 * 16 * 32 x llvm.array<32 x array<16 * array<8 x f32>>>. We
really only need to allocate %0 * %1 such arrays.

This patch fixes the issue by taking note of the array type that we're
trying to allocate. It tries to match the behaviour of
LLVMTypeConverter::convertPointerLike, which returns a pointer to the
element type when the array type doesn't have a constant interior.
We consequently only multiply with the constant factors in the array
type if the array type doesn't have a constant interior.

This has the nice side effect that it gets rid of some redundant
multiplications with the constant 1 in some cases.

Diff Detail

Event Timeline

rovka created this revision.Jan 10 2022, 3:50 AM
rovka requested review of this revision.Jan 10 2022, 3:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 10 2022, 3:50 AM
schweitz added inline comments.Jan 10 2022, 12:18 PM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
376–384

This does not appear to be the correct.

alloca !fir.array<*:i32>, %1, %2

has a constant sized interior (the element type i32 has a constant size), but it is clearly not valid to drop the array shape arguments.

This applies to a case like

alloca !fir.array<5x6x?x?xf64>, %3, %4, %5

as well. In this second case, space of %3 * %4 * %5 * 5 * 6 * sizeof(f64) bytes are required to be allocated even though the type has a constant interior of 5 * 6 * sizeof(f64).

rovka added inline comments.Jan 11 2022, 1:51 AM
flang/lib/Optimizer/CodeGen/CodeGen.cpp
376–384

If I understand correctly, the first example is not valid, since you can't call alloca on a sequence type with unknown rank. You'd get an error from here.

As for the second example, it's very similar to the test I'm adding at line 1119, alloca_const_interior_array. That has

fir.alloca !fir.array<8x9x?x?xf32>, %0, %1

and it allocates %0 * %1 * array<9 x array<8 x f32>>, so I believe it's allocating the right amount. Am I missing something?

schweitz accepted this revision.Jan 11 2022, 1:57 PM
schweitz added inline comments.
flang/lib/Optimizer/CodeGen/CodeGen.cpp
376–384

Ok, got it now. Thanks.

This revision is now accepted and ready to land.Jan 11 2022, 1:57 PM
awarzynski accepted this revision.Jan 12 2022, 1:11 AM

Makes sense, thank you @rovka !

This revision was automatically updated to reflect the committed changes.