After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register) may be placed in the middle of a stack frame if a function has both callee-saved general-purpose registers and floating point registers. This will break the stack unwinders that simply walk through the frame records (based on the guarantee from AAPCS64 "The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
2459 | I think we don't have to check LR here because AAPCS64 guarantees FP and LR will be spilled to consecutive words. Besides, we only care about the address of the spilled FP. |
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
2459 | If LR is also spilled, shouldn't isPaired() be true? This is testing two layouts (LR, FP) and (FP, <something else>), where the latter is not a frame-record. This makes me think that this code also needs a condition for hasFP(MF), because that guarantees the existence of a frame-record (as opposed to ordinary spills of FP/LR that don't necessarily constitute the framerecord) |
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
2459 | I am concerned about the case that FP is the second register in the pair (next one). Since the EmitMI function only gets a RegisterPairInfo. I cannot see the next RegisterPairInfo. Let me think about how to revise this tomorrow. hasFP(MF) makes sense to me. I'll update the code in the next revision. |
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
2459 | Revised. Please take a look. Thanks. |
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp | ||
---|---|---|
1184 | Having calculated the offset to the FrameRecord explicitly, can we now replace this with: int FPOffset = AFI->getFrameRecordOffset(); ? | |
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h | ||
134 | Is this description correct? The current meaning of FrameRecordOffset seems to be the offset from SP _after_ allocating the callee-save area. |
Functionally the patch is looking good, nearly there!
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h | ||
---|---|---|
134 | Thanks! Is it worth renaming the variable to something like OffsetToFrameRecordFromCalleeSaveBase to make the meaning of the variable easier to understand in the places it is used? (I couldn't think of something shorter :)) | |
llvm/test/CodeGen/AArch64/framelayout-frame-record.ll | ||
12 ↗ | (On Diff #231828) | This has some implicit knowledge that d9 and d8 will be used to keep d0 and d1. #RUN: llc -mtriple=aarch64-- -start-before prologepilog %s -o - | FileCheck %s --- name: TestFrameRecordLocation tracksRegLiveness: true frameInfo: isFrameAddressTaken: true body: | bb.0: $d8 = IMPLICIT_DEF $d9 = IMPLICIT_DEF RET_ReallyLR ... which explicitly marks d8 and d9 as callee-saved. The isFrameAddressTaken: true will trigger the use of the frame-pointer, and thus storing of the frame-record. You only need to add a few check lines. |
13 ↗ | (On Diff #231828) | nit: instead of checking [[FP_OFFSET:16]] as a variable, you can just check 16 directly. |
Run git clang-format and resolve rebase conflicts.
I am currently converting my SVN commit access to GitHub access. I'll land this as soon as possible.
I reverted the CL because I encountered the following assertion:
assert((DestReg != AArch64::SP || Bytes % 16 == 0) && "SP increment/decrement not 16-byte aligned");
I will upload a revision soon.
Nope. I am still debugging the assertion failure. One of the emitFrameOffset in emitEpilogue get non-zero NumBytes that is not multiple of 16. But from the limited backtrace, I have some difficulty to spot the problem.
BTW, I cannot locally reproduce it with the same command. On my side, the command can run without problems.
I've had to revert this patch because it caused runtime failures when building spec2k/eon with -march=armv8-a+sve -mllvm -aarch64-sve-vector-bits-min=256.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
3445–3446 | I don't believe this is a safe fix for the issue mentioned when the patch was previously reverted by https://reviews.llvm.org/rG04879086b44348cad600a0a1ccbe1f7776cc3cf9. The stack always being 16-byte align is a requirement for the AAPCS. |
But doesn't that mean the assert is now protecting less than it was? Without the knowledge of FP's displacement, you cannot know if Bytes % 8 == 0 is safe or not. Is the knowledge of FP's displacement recorded anyway so the original intent of the assert is not compromised? From an SVE point of view, FP displacement will either need to be prevented or undone if it's used to access SVE stack slots. This is because SVE offsets are implicitly scaled and thus any byte based displacement cannot be encoded into its instructions.
It's my believe that at a minimum the assert should be tightened up to pre-D70800 levels before reapplying the patch.
Paul!!!
On 01/09/2020, 16:28, "Owen Anderson" <resistor@icloud.com> wrote:
Sure, but that means there need to be a way to detect when FP is displaced (hence my original question). If such data is available (and it needs to be for SVE code generation to function correctly) then I don't see why the assert cannot use it to assert Bytes is correctly aligned and displaced.
On 01/09/2020, 18:29, "Owen Anderson" <resistor@icloud.com> wrote:
The assertion is protecting less than it was, because the original was overly strictly. Please refer to the test case I mentioned, which provides an example where AAPCS-compliant code generation (frame record layout is part of AAPCS!) is impossible under the original assertion. As such, I don’t see a path to returning the assertion to its original form. Again, correct AAPCS-compliant code generation for this test is impossible with the assertion as it was. I suspect you will find that if you use that test case as a starting point and add an SVE stack object to it, you will find that the SVE frame layout code needs to handle the case of an unaligned frame pointer. That’s most likely the underlying cause of the failure you were seeing.
Fair enough, thanks for the information Owen.
Clearly my preference would be to not fix one AArch64 bug by introducing another. You sound more knowledgeable in this area than myself so perhaps could create your suggested SVE test cases to ensure this isn't the case. That said, if you prefer to just reapply the patch as is, at least we've recorded the issue for others working on SVE.
Paul!!!
On 01/09/2020, 19:12, "Owen Anderson" <resistor@mac.com> wrote:
Having calculated the offset to the FrameRecord explicitly, can we now replace this with:
?