This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Don't reserve R12 on Thumb1 as an emergency spill slot.
ClosedPublic

Authored by efriedma on Jun 21 2019, 3:12 PM.

Details

Summary

The current implementation of ThumbRegisterInfo::saveScavengerRegister is bad for two reasons: one, it's buggy, and two, it blocks using R12 for other optimizations. So this patch gets rid of it, and adds the necessary support for using an ordinary emergency spill slot on Thumb1.

(Specifically, I think saveScavengerRegister was broken by r305625, and nobody noticed for two years because the codepath is almost never used. The new code will also probably not be used much, but it now has better tests, and if we fail to emit a necessary emergency spill slot we get a reasonable error message instead of a miscompile.)

A rough outline of the changes in the patch:

  1. Gets rid of ThumbRegisterInfo::saveScavengerRegister.
  2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an emergency spill slot for Thumb1.
  3. Implements useFPForScavengingIndex, so the emergency spill slot isn't placed at a negative offset from FP on Thumb1.
  4. Modifies the heuristics for allocating an emergency spill slot to support Thumb1. This includes fixing ExtraCSSpill so we don't try to use "lr" as a substitute for allocating an emergency spill slot.
  5. Allocates a base pointer in more cases, so the emergency spill slot is always accessible.
  6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the right offset in the new cases where we're forcing a base pointer.
  7. Ensures we never generate a load or store with an offset outside of its frame object. This makes the heuristics more straightforward.

Some of the changes to the emergency spill slot heuristics in estimateRSStackSizeLimit and determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow the compiler to avoid allocating an emergency spill slot in cases where it isn't necessary. The rest of the changes should only affect Thumb1.

As far as I can tell, there isn't any good way to split this patch. If we don't get rid of saveScavengerRegister, the other changes are essentially useless. And if we get rid of saveScavengerRegister without the other fixes, we'll introduce bugs. I'm open to suggestions about this, though.

Diff Detail

Repository
rL LLVM

Event Timeline

efriedma created this revision.Jun 21 2019, 3:12 PM

@chill, Dave suggested you could perhaps have a look at this? You've looked at frame lowering recently?

Looks like a nice idea. Like Sjoerd said, I don't know a lot about Frame Lowering though. (I can try and learn if no-one else jumps in).

This looks a lot like a bug I hit a while ago while writing a fuzzer for ABI
compatibility between clang and gcc. I've currently got Thumb1-only targets
disabled because of this. Applying this patch doesn't get the fuzzer fully
passing, but it does drop the failure rate significantly, and I expect there
are more bugs to find once this one is fixed. I should have time later this
week to reduce some of the remaining failures, but since this test suite has
never passed for Thumb1 I don't think it should block this patch.

lib/Target/ARM/ARMFrameLowering.cpp
1982

Could you expand the comment at the definition of ExtraCSSpill to explain the conditions in which it will be set?

test/CodeGen/Thumb/emergency-spill-slot.ll
335

Could we just not have any CHECK lines for this test, if we don't care about the output, and are just making sure we don't crash?

efriedma marked an inline comment as done.Jun 24 2019, 11:33 AM

This looks a lot like a bug I hit a while ago while writing a fuzzer for ABI
compatibility between clang and gcc. I've currently got Thumb1-only targets
disabled because of this.

My team ran into this when we added some new configurations of SPEC; it's good to hear we aren't the only ones doing any Thumb1 testing. :)

test/CodeGen/Thumb/emergency-spill-slot.ll
335

I prefer not to do that, generally; yes, we primarily care that it doesn't crash, but a change to the generated code could indicate the test is no longer testing what it's supposed to test.

efriedma updated this revision to Diff 206272.Jun 24 2019, 11:37 AM

Improved comment on ExtraCSSpill.

I've reduced one of the failures from my fuzzer (with this patch applied on top of SVN r364172) to P8154, and it looks like we're now using the base pointer before it has been set up:

$ /work/llvm/build/bin/clang --target=arm--none-eabi -march=armv6-m -c file2097315.c -O0 -o - -S
...
F1214777307:
      .fnstart
      .save   {r4, r5, r6, r7, lr}
      push    {r4, r5, r6, r7, lr}
      .setfp  r7, sp, #12
      add     r7, sp, #12
      str     r2, [r6, #28]
      .pad    #1548
      ldr     r2, .LCPI0_1
      add     sp, r2
      mov     r4, sp
      lsrs    r4, r4, #5
      lsls    r4, r4, #5
      mov     sp, r4
      mov     r6, sp
...

The third instruction in the function looks like it is using r6 as the base pointer to save one of the argument registers on the stack, but r6 isn't set until further down.

The third instruction in the function looks like it is using r6 as the base pointer to save one of the argument registers on the stack, but r6 isn't set until further down.

Yes, it looks like we're trying to scavenge a register inside the function's prologue. Probably straightforward to fix: when we're generating the prologue/epilogue, we should be able to directly pick a free register, since we know we're emitting the code immediately after callee-save registers are spilled.

efriedma updated this revision to Diff 206562.Jun 25 2019, 5:26 PM

Fix Thumb1FrameLowering::emitPrologue and Thumb1FrameLowering::emitEpilogue so they don't scavenge a register inside of frame setup/teardown. Instead, just pick a register we know is available.

ostannard accepted this revision.Jun 26 2019, 6:21 AM

LGTM.

I've given this another few hours of fuzzing, and the only thing it's found is that we emit bad unwind tables when saving high registers, but that's always been the case, I'll raise a bug for that.

This revision is now accepted and ready to land.Jun 26 2019, 6:21 AM
This revision was automatically updated to reflect the committed changes.
test/CodeGen/ARM/ldrex-frame-size.ll