Page MenuHomePhabricator

Fix AArch64 AAPCS frame record chain
ClosedPublic

Authored by logan on Nov 27 2019, 11:49 PM.

Details

Summary

After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating point registers. This will break the stack unwinders that simply walk through the frame records (based on the guarantee from AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.

Diff Detail

Event Timeline

logan created this revision.Nov 27 2019, 11:49 PM

Thanks for creating a fix for this @logan!

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459

There is a condition here that is not yet tested. If the frame-record is saved that is both LR and FP, not just FP, so is this case needed?

logan marked an inline comment as done.Nov 28 2019, 10:22 PM
logan added inline comments.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459

I think we don't have to check LR here because AAPCS64 guarantees FP and LR will be spilled to consecutive words. Besides, we only care about the address of the spilled FP.

sdesmalen added inline comments.Nov 29 2019, 2:08 AM
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459

If LR is also spilled, shouldn't isPaired() be true? This is testing two layouts (LR, FP) and (FP, <something else>), where the latter is not a frame-record.

This makes me think that this code also needs a condition for hasFP(MF), because that guarantees the existence of a frame-record (as opposed to ordinary spills of FP/LR that don't necessarily constitute the framerecord)

logan marked an inline comment as done.Nov 29 2019, 11:07 PM
logan added inline comments.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459

I am concerned about the case that FP is the second register in the pair (next one). Since the EmitMI function only gets a RegisterPairInfo. I cannot see the next RegisterPairInfo. Let me think about how to revise this tomorrow.

hasFP(MF) makes sense to me. I'll update the code in the next revision.

logan updated this revision to Diff 231642.Dec 1 2019, 11:20 PM
logan marked 3 inline comments as done.
logan added inline comments.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2459

Revised. Please take a look. Thanks.

sdesmalen added inline comments.Dec 2 2019, 2:56 AM
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
1184

Having calculated the offset to the FrameRecord explicitly, can we now replace this with:

int FPOffset = AFI->getFrameRecordOffset();

?

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134

Is this description correct? The current meaning of FrameRecordOffset seems to be the offset from SP _after_ allocating the callee-save area.

logan updated this revision to Diff 231820.Dec 2 2019, 9:52 PM
logan marked 4 inline comments as done.
logan added inline comments.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
1184

Yes. Replaced in the latest Diff.

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134

Thanks. Reworded.

logan updated this revision to Diff 231828.Dec 2 2019, 11:02 PM
logan marked 2 inline comments as done.

Functionally the patch is looking good, nearly there!

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134

Thanks! Is it worth renaming the variable to something like OffsetToFrameRecordFromCalleeSaveBase to make the meaning of the variable easier to understand in the places it is used? (I couldn't think of something shorter :))

llvm/test/CodeGen/AArch64/framelayout-frame-record.ll
12 ↗(On Diff #231828)

This has some implicit knowledge that d9 and d8 will be used to keep d0 and d1.
This is probably better tested with a MIR test like:

#RUN: llc -mtriple=aarch64-- -start-before prologepilog %s -o - | FileCheck %s
---
name: TestFrameRecordLocation
tracksRegLiveness: true
frameInfo:
  isFrameAddressTaken: true
body: |
  bb.0:
    $d8 = IMPLICIT_DEF
    $d9 = IMPLICIT_DEF
    RET_ReallyLR
...

which explicitly marks d8 and d9 as callee-saved. The isFrameAddressTaken: true will trigger the use of the frame-pointer, and thus storing of the frame-record. You only need to add a few check lines.

13 ↗(On Diff #231828)

nit: instead of checking [[FP_OFFSET:16]] as a variable, you can just check 16 directly.

logan updated this revision to Diff 232037.Dec 3 2019, 11:27 PM
logan marked 5 inline comments as done.Dec 3 2019, 11:31 PM
logan added inline comments.
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2234

Just noticed that Windows have different ordering.

llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
134

The naming convention in the file seems to be {get,set,}XXXOffset, thus I renamed this to CalleeSaveBaseToFrameRecordOffset.

logan marked an inline comment as done.Dec 3 2019, 11:32 PM
sdesmalen accepted this revision.Dec 4 2019, 2:10 PM

Thanks @logan , LGTM!

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
2234

Yes, good spot!

This revision is now accepted and ready to land.Dec 4 2019, 2:10 PM
srhines added a subscriber: srhines.
logan updated this revision to Diff 233260.Dec 10 2019, 11:16 PM

Run git clang-format and resolve rebase conflicts.

I am currently converting my SVN commit access to GitHub access. I'll land this as soon as possible.

This revision was automatically updated to reflect the committed changes.
logan reopened this revision.Dec 16 2019, 10:17 PM

I reverted the CL because I encountered the following assertion:

assert((DestReg != AArch64::SP || Bytes % 16 == 0) &&
       "SP increment/decrement not 16-byte aligned");

I will upload a revision soon.

This revision is now accepted and ready to land.Dec 16 2019, 10:17 PM
logan updated this revision to Diff 257595.Apr 14 2020, 9:36 PM

Rebase to latest LLVM master branch

Rebase to latest LLVM master branch

Did you make any other changes other than the rebase?

Rebase to latest LLVM master branch

Did you make any other changes other than the rebase?

Nope. I am still debugging the assertion failure. One of the emitFrameOffset in emitEpilogue get non-zero NumBytes that is not multiple of 16. But from the limited backtrace, I have some difficulty to spot the problem.

BTW, I cannot locally reproduce it with the same command. On my side, the command can run without problems.

This revision was landed with ongoing or failed builds.Aug 27 2020, 10:30 AM
This revision was automatically updated to reflect the committed changes.

I've had to revert this patch because it caused runtime failures when building spec2k/eon with -march=armv8-a+sve -mllvm -aarch64-sve-vector-bits-min=256.

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
3445–3446

I don't believe this is a safe fix for the issue mentioned when the patch was previously reverted by https://reviews.llvm.org/rG04879086b44348cad600a0a1ccbe1f7776cc3cf9.

The stack always being 16-byte align is a requirement for the AAPCS.

But doesn't that mean the assert is now protecting less than it was? Without the knowledge of FP's displacement, you cannot know if Bytes % 8 == 0 is safe or not. Is the knowledge of FP's displacement recorded anyway so the original intent of the assert is not compromised? From an SVE point of view, FP displacement will either need to be prevented or undone if it's used to access SVE stack slots. This is because SVE offsets are implicitly scaled and thus any byte based displacement cannot be encoded into its instructions.

It's my believe that at a minimum the assert should be tightened up to pre-D70800 levels before reapplying the patch.

Paul!!!

On 01/09/2020, 16:28, "Owen Anderson" <resistor@icloud.com> wrote:

Sure, but that means there need to be a way to detect when FP is displaced (hence my original question). If such data is available (and it needs to be for SVE code generation to function correctly) then I don't see why the assert cannot use it to assert Bytes is correctly aligned and displaced.

On 01/09/2020, 18:29, "Owen Anderson" <resistor@icloud.com> wrote:

The assertion is protecting less than it was, because the original was overly strictly.  Please refer to the test case I mentioned, which provides an example where AAPCS-compliant code generation (frame record layout is part of AAPCS!) is impossible under the original assertion.  As such, I don’t see a path to returning the assertion to its original form.  Again, correct AAPCS-compliant code generation for this test is impossible with the assertion as it was.

I suspect you will find that if you use that test case as a starting point and add an SVE stack object to it, you will find that the SVE frame layout code needs to handle the case of an unaligned frame pointer.  That’s most likely the underlying cause of the failure you were seeing.

Fair enough, thanks for the information Owen.

Clearly my preference would be to not fix one AArch64 bug by introducing another. You sound more knowledgeable in this area than myself so perhaps could create your suggested SVE test cases to ensure this isn't the case. That said, if you prefer to just reapply the patch as is, at least we've recorded the issue for others working on SVE.

Paul!!!

On 01/09/2020, 19:12, "Owen Anderson" <resistor@mac.com> wrote: