This is an archive of the discontinued LLVM Phabricator instance.

[GlobalOpt] Disable heap SROA when GEP of the only storer is used
AbandonedPublic

Authored by MaskRay on Apr 24 2021, 11:31 PM.

Details

Summary

heap-sra-2.ll and heap-sra-phi.ll test that we can perform SROA on a global
variable with only one storer (malloc).

The code does not consider that if a GEP of the storer is used, an arbitrary
element in the allocated array can be modified and we cannot correctly represent
it with a synthesized load of the global variable (some trivial usage e.g.
hasAllZeroIndices can be represented by ad-hoc code but it is clear the
optimization is worth the code complexity). For PR50027 we will get an assertion error
when creating GEP, but we could have correctness issues as well. Simply
disable the unsafe optimization.

Fix PR50027

Diff Detail

Unit TestsFailed

Event Timeline

MaskRay created this revision.Apr 24 2021, 11:31 PM
MaskRay requested review of this revision.Apr 24 2021, 11:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 24 2021, 11:31 PM

I'm not familiar with the terminology of "storer", so the description is confusing to me.
Trying to understand the test case and description, is the issue that Heap SROA isn't handling a store of the entire struct (as opposed to initializing the struct element by element)?

MaskRay added a comment.EditedApr 27 2021, 9:13 AM

I'm not familiar with the terminology of "storer", so the description is confusing to me.

I mean the instruction which stores to the malloc return value. I don't know an appropriate term.

Trying to understand the test case and description, is the issue that Heap SROA isn't handling a store of the entire struct (as opposed to initializing the struct element by element)?

If the malloc allocates an array, say {a, b}{a, b}{a, b}{a, b}. The heap SROA makes it [a,a,a,a] [b,b,b,b] (i.e. elements are shuffled).
With GEP poking different indexes, this can either cause misoptimizations or trigger assertion failures (indexes are not properly converted). PR50027 is an assertion failure.

I'm not familiar with the terminology of "storer", so the description is confusing to me.

I mean the instruction which stores to the malloc return value. I don't know an appropriate term.

"the store into the malloc return value/pointer" seems clearer to me

llvm/lib/Transforms/IPO/GlobalOpt.cpp
1016

seems like this doesn't take into account when the global variable is an array.
normally, if we just have a normal struct, the GEP's first operand is the GV, second operand is something like 0, and the third operand indexes into the struct itself, which is fine for SROA. having fewer than 3 operands can directly reference the struct
with an array, we need to account for an extra index to index into the array

MaskRay added inline comments.Apr 27 2021, 11:59 AM
llvm/lib/Transforms/IPO/GlobalOpt.cpp
1016

Do you have specific suggestion to the diff? :)

I am thinking the value of array heap SROA is low but I don't want to just remove the implementation, so drop GEP - see the tests like heap-sra-phi.ll, heap SROA still works if the function allocates and forgets the array.

aeubanks added inline comments.Apr 27 2021, 2:28 PM
llvm/lib/Transforms/IPO/GlobalOpt.cpp
1016

maybe check the GV type to see if it's an array, and in that case use 4 instead of 3?
supporting GEPs seems natural, I'm not sure why we wouldn't want to, seems like code would probably use GEPs into global values to do things

MaskRay marked an inline comment as done.Apr 27 2021, 2:40 PM
MaskRay added inline comments.
llvm/lib/Transforms/IPO/GlobalOpt.cpp
1016

For heap-sra-negative.ll, GV is @X = internal global %struct.foo* null (decayed array). Inst->getNumOperands() >= 4 cannot catch the case.

This function is only used to check CI - the only store instruction. Load instructions are not subject to the constraint, so dropping the support here will not punish the load site GEP.

MaskRay marked an inline comment as done.May 10 2021, 12:16 PM

🙄

aeubanks added inline comments.May 10 2021, 9:41 PM
llvm/lib/Transforms/IPO/GlobalOpt.cpp
1016

Inst->getNumOperands() >= 4 does catch the case, heap SROA doesn't trigger in your test case when we change 3 to 4.
I'm pretty sure "// Must index into the array and into the struct." is what I explained in the first comment of this thread.

This function is only used to check CI - the only store instruction.

CI is the malloc call, not the store (unless I'm misunderstanding your comment).

Load instructions are not subject to the constraint, so dropping the support here will not punish the load site GEP.

I don't understand. With this patch, if there's a GEP of the malloc anywhere, even if it's used for a load, it gives up.

But it does look like you're right and this is probably never actually hit in practice. Running your example through -O2, the store gets split by instcombine and the global isn't considered "once stored" when GlobalOpt runs. I bootstrapped LLVM with a crash in PerformHeapAllocSRoA and it was never hit. We should either go with the 3 -> 4, or completely remove heap SROA.

Thanks for testing that this does not trigger during bootstrapping. Now I am more confident that we should drop the current implementation: D102257.
I have kept some tests in case someone wants to create a better implementation in the future.

MaskRay abandoned this revision.May 11 2021, 11:35 AM

Abandon in favor of D102257 (removing heap SROA).