Page MenuHomePhabricator

Fix bug in UnwindAssemblyInstEmulation with fp-using codegen and mid-function epilogues
ClosedPublic

Authored by jasonmolenda on Apr 13 2020, 8:59 PM.

Details

Summary

I found this on an AAarch64 target, where we use a frame pointer register on Darwin. UnwindAssemblyInstEmulation iterates over the instructions building up an UnwindPlan row by row. When UnwindAssemblyInstEmulation sees a branch instruction, it "forwards" the unwind state to the target offset; if we have a mid-function epilogue that restores the spilled registers and returns, when we hit the branch target, the saved unwind state is reinstated.

The bug comes in that UnwindAssemblyInstEmulation maintains both the "current row" object and it tracks the register that the Canonical Frame Address is defined in terms of, and whether this register is the frame pointer or not. After the prologue we have unwind state defining the CFA as $fp+16; then during a mid-function epilogue we restore the registers and switch to the CFA as $sp based. At this point the UnwindAssemblyInstEmulation ivars (CFA register, is-it-fp-or-not) are updated to recognize the switch to $sp. After the epilogue, we restore the current-row but we leave the other ivars to their epilogue setting.

The failure then comes in if the instructions after the mid-function epilogue modify $sp, UnwindAssemblyInstEmulation changes the current row's CFA offset value to track that change, because it thinks the CFA is defined in terms of the stack pointer. But the current row is still defining it in terms of $fp. So everything fails.

It's a sneaky bug because it's easy to miss if lldb silently "falls back" to the architectural default unwind plan when it can't find the caller. So we're missing spilled registers from this frame. It takes some specific circumstances to hit it -- modifying $sp after that mid-function epilogue.

I know not a lot of people have touched this code in ages (including me!), but let's put it up for review in case anyone has questions.

Diff Detail

Unit TestsFailed

TimeTest
40 msMLIR.Target::Unknown Unit Message ("")
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/mlir-translate -mlir-to-llvmir /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/mlir/test/Target/llvmir.mlir | /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/FileCheck /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/mlir/test/Target/llvmir.mlir

Event Timeline

jasonmolenda created this revision.Apr 13 2020, 8:59 PM
davide added a subscriber: davide.Apr 13 2020, 10:35 PM
clayborg requested changes to this revision.Apr 14 2020, 2:39 AM

If you test worked, then there is something wrong with this test? See inline comment for copy and paste error

lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
146

Is there a copy and paste error here?:

s/ra_reg_info/sp_reg_info/
This revision now requires changes to proceed.Apr 14 2020, 2:39 AM

If you test worked, then there is something wrong with this test? See inline comment for copy and paste error

Thanks for catching that. The test definitely fails without the patch, and works with. lldb is using that sp_reg_info to re-set the m_fp_is_cfa ivar, which my test clearly does not exercise if it is wrong.

Update to address mistake Greg identified; also remove two unused variables that were in this method before my changes.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 14 2020, 5:26 PM
This revision was automatically updated to reflect the committed changes.