This is an archive of the discontinued LLVM Phabricator instance.

Fix PR#24142
ClosedPublic

Authored by Carrot on Jul 15 2015, 4:44 PM.

Details

Reviewers
mkuper
rnk
Summary

Function getSPAdjust doesn't consider the stack alignment, its usage in PEI::replaceFrameIndices causes wrong offset is used for stack objects.

This patch adds a function to align the SP adjustment to the correct alignment, and call it from getSPAdjust and ARM backend. It can/should also be used by other backends if SP adjustment is need to be aligned.

Diff Detail

Event Timeline

Carrot updated this revision to Diff 29850.Jul 15 2015, 4:44 PM
Carrot retitled this revision from to Fix PR#24142.
Carrot updated this object.
Carrot added a subscriber: llvm-commits.
chandlerc added a subscriber: chandlerc.

In addition to needing a test, Reid should probably have a look as he's thought a lot about stack alignment and adjustment.

Carrot updated this revision to Diff 29944.Jul 16 2015, 2:39 PM

A testcase is added, before this patch, wrong address [sp, #2116] is generated.

Can you use RoundUpToAlignment?

Carrot updated this revision to Diff 30003.Jul 17 2015, 8:46 AM

Use RoundUpToAlignment to calculate the aligned SP adjustment.

hfinkel added inline comments.
include/llvm/Target/TargetFrameLowering.h
79

Can't you just write this as:

SPAdj = -RoundUpToAlignment(-SPAdj, StackAlignment);
test/CodeGen/ARM/align-sp-adjustment.ll
3 ↗(On Diff #30003)

How stable is this number?

Carrot updated this revision to Diff 30872.Jul 28 2015, 4:26 PM
Carrot marked an inline comment as done.
Carrot marked an inline comment as done.Jul 28 2015, 4:40 PM
Carrot added inline comments.
test/CodeGen/ARM/align-sp-adjustment.ll
4 ↗(On Diff #30872)

This number is the sum of stack frame and the space of outward parameter(8). The stack frame is:

push.w  {r4, r5, r6, r7, r8, r9, r10, r11, lr}

.Ltmp0:

.cfi_def_cfa_offset 36

.Ltmp1:

.cfi_offset lr, -4

.Ltmp2:

.cfi_offset r11, -8

.Ltmp3:

.cfi_offset r10, -12

.Ltmp4:

.cfi_offset r9, -16

.Ltmp5:

.cfi_offset r8, -20

.Ltmp6:

.cfi_offset r7, -24

.Ltmp7:

.cfi_offset r6, -28

.Ltmp8:

.cfi_offset r5, -32

.Ltmp9:

.cfi_offset r4, -36 
.pad    #2064
sub.w   sp, sp, #2064
.pad    #12
sub     sp, #12

12 is padding, 2064 is the size of local variable, the rest are for callee saved registers. The only possible changing size is the space for callee saved registers, and the only possible optimization that can impact it is register allocation. There are several function call in the loop, most of the callee saved registers are used to keep variables cross function call, so this looks unlikely to change.

Carrot marked an inline comment as done.Aug 5 2015, 11:15 AM

ping

rnk added a reviewer: mkuper.Aug 10 2015, 9:26 AM
rnk edited edge metadata.

Michael added getSPAdjust recently, so he can comment on what it should be doing.

lib/CodeGen/TargetInstrInfo.cpp
664

I don't think this is the right place to fix this. getSPAdjust is also used on X86 to return SP adjustment values for PUSH, POP, CALL, and others. These typically produce values that are not a multiple of stack alignment (4 / 8 vs 16).

rnk added inline comments.Aug 10 2015, 9:31 AM
test/CodeGen/ARM/align-sp-adjustment.ll
4 ↗(On Diff #30872)

I think you can reduce this test case further to make it more robust and easier to understand.

The string parameters are probably not needed, and can be i32's or i8* null values to make it shorter.

The array can instead be a simple array of i32, like %params = alloca i32, 516, which gives an i32*, which requires less GEP manipulation.

14 ↗(On Diff #30872)

I'm surprised this assembles, I don't see a definition of #3 or any other attribute sets.

mkuper added inline comments.Aug 11 2015, 4:02 AM
lib/CodeGen/TargetInstrInfo.cpp
664

I think this actually is the right place for it.
This looks like a long-standing bug from before I factored this code out of PEI and into TFI.

From my commit message on the refactoring, emphasis is new:
"PEI tries to keep track of how much starting or ending a call sequence adjusts the stack pointer by, so that it can resolve frame-index references. Currently, it takes a very simplistic view of how SP adjustments are done - both FrameStartOpcode and FrameDestroyOpcode adjust it exactly by the amount written in its first argument.

This view is in fact incorrect for some targets (*e.g. due to stack re-alignment*, or because it may want to adjust the stack pointer in multiple steps). However, that doesn't cause breakage, because most targets (the only in-tree exception appears to be 32-bit ARM) rely on being able to simplify the call frame pseudo-instructions earlier, so this code is never hit. "

The newly-created (in a later commit) X86 version of getSPAdjust actually does take alignment into account, but, unfortunately, I never went back and fixed the ARM version - I didn't know it's broken in practice.

rnk added inline comments.Aug 11 2015, 8:56 AM
lib/CodeGen/TargetInstrInfo.cpp
664

Do you think the new behavior of aligning the outgoing call argument space should live in the ARM backend or the generic Target layer?

I'm surprised that the backends aren't responsible for building aligned call stack adjustments. I guess this ties into the whole crazy phase ordering of stack realignment, where we try to support late stack realignment if the register allocator wants it.

mkuper added inline comments.Aug 11 2015, 9:20 AM
lib/CodeGen/TargetInstrInfo.cpp
664

The generic target layer makes sense to me. The fact the call stack adjustment instructions don't take alignment into account isn't backend-specific.
I should probably also refactor the X86 code to call into the generic layer for the framesetup/framedestroy case. Will try to do that once this patch goes in.

And yes, I was surprised at that either.

Carrot marked an inline comment as done.Aug 11 2015, 4:39 PM
Carrot added inline comments.
test/CodeGen/ARM/align-sp-adjustment.ll
4 ↗(On Diff #30872)

I tried to change the string to integer 0, unfortunately it failed to reproduce the bug. The reason is the compiler need to generate some instructions to compute the address of string, store it in some register, and moved the code out of the loop since it is loop invariant. If it is changed to some constant, then it need not to be stored in register, the extra register can be used store the value of the fifth parameter, the load is moved out of the loop, so avoid the problem code sequence.

Similarly if I change the data type of %params, the address computation is also changed, and the register usage is changed, causes the fifth parameter loaded before the loop, and failed to reproduce the bug.

rnk accepted this revision.Aug 11 2015, 5:50 PM
rnk edited edge metadata.

OK, seems reasonable.

This revision is now accepted and ready to land.Aug 11 2015, 5:50 PM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in rL245253.