Page MenuHomePhabricator

[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2)
ClosedPublic

Authored by sdesmalen on Oct 10 2019, 4:41 AM.

Details

Summary

Commit message from D66935:

This patch fixes a bug exposed by D65653 where a subsequent invocation
of determineCalleeSaves ends up with a different size for the callee
save area, leading to different frame-offsets in debug information.

In the invocation by PEI, determineCalleeSaves tries to determine
whether it needs to spill an extra callee-saved register to get an
emergency spill slot. To do this, it calls 'estimateStackSize' and
manually adds the size of the callee-saves to this. PEI then allocates
the spill objects for the callee saves and the remaining frame layout
is calculated accordingly.

A second invocation in LiveDebugValues causes estimateStackSize to return
the size of the stack frame including the callee-saves. Given that the
size of the callee-saves is added to this, these callee-saves are counted
twice, which leads determineCalleeSaves to believe the stack has
become big enough to require spilling an extra callee-save as emergency
spillslot. It then updates CalleeSavedStackSize with a larger value.

Since CalleeSavedStackSize is used in the calculation of the frame
offset in getFrameIndexReference, this leads to incorrect offsets for
variables/locals when this information is recalculated after PEI.

This patch fixes the lldb unit tests in functionalities/thread/concurrent_events/*

Changes after D66935:

Ensures AArch64FunctionInfo::getCalleeSavedStackSize does not return
the uninitialized CalleeSavedStackSize when running llc on a specific
pass where the MIR code has already been expected to have gone through PEI.

Instead, getCalleeSavedStackSize (when passed the MachineFrameInfo) will try
to recalculate the CalleeSavedStackSize from the CalleeSavedInfo. In debug
mode, the compiler will assert the recalculated size equals the cached
size as calculated through a call to determineCalleeSaves.

This fixes two tests:

test/DebugInfo/AArch64/asan-stack-vars.mir
test/DebugInfo/AArch64/compiler-gen-bbs-livedebugvalues.mir

that otherwise fail when compiled using msan.

Diff Detail

Event Timeline

sdesmalen created this revision.Oct 10 2019, 4:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 10 2019, 4:41 AM
sdesmalen marked an inline comment as done.Oct 10 2019, 4:43 AM
sdesmalen added inline comments.
llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
184

I've tested LNT and the llvm unit-tests with this check enabled.

omjavaid requested changes to this revision.Oct 15 2019, 12:32 AM

This revision does not fix LLDB regression on AArch64.

http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu

All tests in functionalities/thread/concurrent_events/* category still failing.

This revision now requires changes to proceed.Oct 15 2019, 12:32 AM

This revision does not fix LLDB regression on AArch64.

http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu

All tests in functionalities/thread/concurrent_events/* category still failing.

Correct, those tests are fixed by D66935. That fix was reverted because of a failure with msan on test/DebugInfo/AArch64/asan-stack-vars.mir and test/DebugInfo/AArch64/compiler-gen-bbs-livedebugvalues.mir, which this patch intends to fix. I kept the fixes separate to make them easier to review. Do you want me to merge the two patches?

This revision does not fix LLDB regression on AArch64.

http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu

All tests in functionalities/thread/concurrent_events/* category still failing.

Correct, those tests are fixed by D66935. That fix was reverted because of a failure with msan on test/DebugInfo/AArch64/asan-stack-vars.mir and test/DebugInfo/AArch64/compiler-gen-bbs-livedebugvalues.mir, which this patch intends to fix. I kept the fixes separate to make them easier to review. Do you want me to merge the two patches?

Yes, I guess it will be the right idea to merge both patches otherwise you might see failing buildbots when both patches get merged at different times.

sdesmalen updated this revision to Diff 225380.Oct 17 2019, 2:22 AM
sdesmalen retitled this revision from [AArch64] Fix access to uninitialized CalleeSavedStackSize to [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2).
sdesmalen edited the summary of this revision. (Show Details)

Squashed the commit with the changes in D66935.

efriedma accepted this revision.Oct 18 2019, 6:08 PM

LGTM

I was considering whether, instead of making getCalleeSavedStackSize approximate the stack size when running an individual pass, we should serialize it in the MIR. But this is probably okay as-is.

omjavaid accepted this revision.Oct 21 2019, 1:16 AM

This corrects LLDB regressions.
LGTM.

This revision is now accepted and ready to land.Oct 21 2019, 1:16 AM
This revision was automatically updated to reflect the committed changes.

This breaks most bots on http://lab.llvm.org:8011/console . Unless this has a very quick fix, please revert and investigate asynchonously.

This breaks most bots on http://lab.llvm.org:8011/console . Unless this has a very quick fix, please revert and investigate asynchonously.

Are you sure these are caused by this patch though? (Most bots that fail test more than this patch and I also can't quickly see how it causes all these failures)

This breaks most bots on http://lab.llvm.org:8011/console . Unless this has a very quick fix, please revert and investigate asynchonously.

Are you sure these are caused by this patch though? (Most bots that fail test more than this patch and I also can't quickly see how it causes all these failures)

Yes, http://45.33.8.238/linux/2376/summary.html for example has just your change.

Maybe try reverting and if it doesn't help you know it's not your change. (It very likely is, though.)

This breaks most bots on http://lab.llvm.org:8011/console . Unless this has a very quick fix, please revert and investigate asynchonously.

Are you sure these are caused by this patch though? (Most bots that fail test more than this patch and I also can't quickly see how it causes all these failures)

Yes, http://45.33.8.238/linux/2376/summary.html for example has just your change.

Maybe try reverting and if it doesn't help you know it's not your change. (It very likely is, though.)

Got it! Thanks, I didn't quickly spot that one test from the console output. I'll revert the test and fix this separately. Thanks for pointing out!