This is an archive of the discontinued LLVM Phabricator instance.

[Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'
ClosedPublic

Authored by tyomitch on Dec 21 2015, 5:50 PM.

Details

Summary

Hi Artyom and Quentin,

It looks like we are still generating .s files with invalid pop operand with Thumb1.

This is a problem when not using integrated assembler, because it
produces errors like 'cannot honor width suffix -- `pop {r7,lr}''.

See the attached test case to reproduce the issue.

The function emitPopSpecialFixUp replaces instructions
like pop {r7, lr} with:
pop {r7}
pop {r0}
mov lr, r0
since lt cannot be encoded with Thumb1.

But it only deletes lr operand from pop instruction when
the pop is the last instruction in the BB.
We should handle cases where the pop instruction
might be followed by other BB terminator instructions.

I tried to fix the issue, let me know what else you suggest.

Diff Detail

Repository
rL LLVM

Event Timeline

apazos updated this revision to Diff 43427.Dec 21 2015, 5:50 PM
apazos retitled this revision from to [Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'.
apazos updated this object.
apazos added reviewers: tyomitch, jroelofs, qcolombet.
apazos set the repository for this revision to rL LLVM.

Bother, I considered that but thought I'd investigated all the possible terminators when reviewing that patch and convinced myself they couldn't occur after a tPOP. Which one was it?

The patch looks fine to me, though.

Tim.

t.p.northover accepted this revision.Dec 21 2015, 6:24 PM
t.p.northover added a reviewer: t.p.northover.
This revision is now accepted and ready to land.Dec 21 2015, 6:24 PM
tyomitch edited edge metadata.Dec 22 2015, 3:13 AM

Bother, I considered that but thought I'd investigated all the possible terminators when reviewing that patch and convinced myself they couldn't occur after a tPOP. Which one was it?

It's followed by an unconditional branch to a tBX_RET, and my patch only handled falling-through to a tBX_RET :-/

(Note that the same code compiles correctly with -enable-shrink-wrap=false, so it's all @qcolombet's oversight in the first place! :-P)

tyomitch requested changes to this revision.Dec 22 2015, 4:17 AM
tyomitch edited edge metadata.

test/CodeGen/Thumb/thumb-shrink-wrapping.ll includes a few more cases when pop {lr} can still be generated; all of them are wrong, and need addressing.

This revision now requires changes to proceed.Dec 22 2015, 4:17 AM
tyomitch commandeered this revision.Dec 22 2015, 1:54 PM
tyomitch edited reviewers, added: apazos; removed: tyomitch.
This revision is now accepted and ready to land.Dec 22 2015, 1:54 PM
tyomitch updated this revision to Diff 43481.Dec 22 2015, 1:56 PM
tyomitch edited edge metadata.
tyomitch removed rL LLVM as the repository for this revision.
tyomitch added a subscriber: llvm-commits.

Updating the patch to handle the aforementioned additional cases of pop {lr} in test/CodeGen/Thumb/thumb-shrink-wrapping.ll

apazos edited edge metadata.Dec 22 2015, 4:17 PM

Thanks Tim, the test case looks fine. I will rerun the updated patch with my other workloads and let you know if I find any other corner case.

tyomitch updated this revision to Diff 43514.Dec 23 2015, 1:44 AM
tyomitch edited edge metadata.

Updating the patch to simplify ARMLoadStoreOpt::CombineMovBx

Artyom,

Several tests are still failing due to failure to remove lr operand from the pop instruction.

It will take me some time to reduce each test cases. But I looked at some of the failures and it seems we need to handle conditional branches tBcc to the return as well.

See example IR:

tPOP pred:14, pred:%noreg, %R4<def>, %LR<def>, %SP<imp-def>, %SP<imp-use>
tPOP pred:14, pred:%noreg, %R3<def>, %SP<imp-def>, %SP<imp-use>
%LR<def> = tMOVr %R3<kill>, pred:14, pred:%noreg
tBcc <BB#4>, pred:0, pred:%CPSR<kill>
  Successors according to CFG: BB#4(0x40000000 / 0x80000000 = 50.00%) BB#3(0x40000000 / 0x80000000 = 50.00%)

BB#3: derived from LLVM BB %if.end6

  Live Ins: %R1 %R2 %LR %R4
  Predecessors according to CFG: BB#2
%R0<def>, %CPSR<def,dead> = tMOVi8 0, pred:14, pred:%noreg
tSTRi %R0, %R1, 12, pred:14, pred:%noreg; mem:ST4[%done](tbaa=!14)
tSTRi %R1<kill>, %R2<kill>, 8, pred:14, pred:%noreg; mem:ST4[%head7](tbaa=!11)
  Successors according to CFG: BB#4(?%)

BB#4: derived from LLVM BB %cleanup

  Live Ins: %R0 %LR %R4
  Predecessors according to CFG: BB#2 BB#3 BB#0 BB#1
tBX_RET pred:14, pred:%noreg, %R0<imp-use>

Here is a simplified test case for the tBcc case, we can add it to thumb-pop.ll or move them all to thumb-shrinkwrapping.ll.

; RUN: llc < %s -mtriple=thumbv5e-none-linux-gnueabi-eabi -verify-machineinstrs -o - | FileCheck %s
%struct.a = type { i8*, i32, i32, i8*, i32, i32, i8*, %struct.b*, i8* (i8*, i32, i32)*, void (i8*, i8*)*,
i8*, i32, i32, i32 }
%struct.b = type opaque
%struct.c = type { i32, i32, i32, i32, i8*, i32, i32, i8*, i32, i8*, i32, i32, i32 }
%struct.e = type { i32, i32, i32, i32, i32, i32, i32, i32, %struct.c*, i32, i32, i32, i32, i8*, i32, i32,
i32, i32, i32, %struct.f*, %struct.f*, i32, i32, i32, i32, i32, i32, %struct.f*, [320 x i16], [288 x i16
], [1444 x %struct.f], i32, i32, i32 }
%struct.f = type { i8, i8, i16 }

define i32 @test2(%struct.a* %v, %struct.c* %head) {
; CHECK-LABEL: test2:
; CHECK-NOT: pop {[[POP_REG:r[0-7]]], lr}
entry:

%cmp = icmp eq %struct.a* %v, null
br i1 %cmp, label %cleanup, label %if.end

if.end: ; preds = %entry

%state1 = getelementptr inbounds %struct.a, %struct.a* %v, i32 0, i32 7
%0 = bitcast %struct.b** %state1 to %struct.e**
%1 = load %struct.e*, %struct.e** %0, align 4
%wrap = getelementptr inbounds %struct.e, %struct.e* %1, i32 0, i32 2
%2 = load i32, i32* %wrap, align 4
%and = and i32 %2, 2
%cmp2 = icmp eq i32 %and, 0
br i1 %cmp2, label %cleanup, label %if.end4

if.end4: ; preds = %if.end

%head5 = getelementptr inbounds %struct.e, %struct.e* %1, i32 0, i32 8
store %struct.c* %head, %struct.c** %head5, align 4
br label %cleanup

cleanup: ; preds = %if.end, %entry, %if.end4

%retval.0 = phi i32 [ 0, %if.end4 ], [ -2, %entry ], [ -2, %if.end ]
ret i32 %retval.0

}

Artyom,

These additional changes solve my problem. See if you agree with them.

  • a/lib/Target/ARM/Thumb1FrameLowering.cpp

+++ b/lib/Target/ARM/Thumb1FrameLowering.cpp
@@ -433,18 +433,23 @@ bool Thumb1FrameLowering::emitPopSpecialFixUp(MachineBasicBlock &MBB,

auto MBBI = MBB.getFirstTerminator();
bool CanRestoreDirectly = STI.hasV5TOps() && !ArgRegsSaveSize;
if (CanRestoreDirectly) {
  • if (MBBI != MBB.end() && MBBI->getOpcode() != ARM::tB)

+ if (MBBI != MBB.end() && MBBI->getOpcode() != ARM::tB &&
+ MBBI->getOpcode() != ARM::tBcc)

  CanRestoreDirectly = (MBBI->getOpcode() == ARM::tBX_RET ||
                        MBBI->getOpcode() == ARM::tPOP_RET);
else {
  auto MBBI_prev = MBBI;
  MBBI_prev--;
  assert(MBBI_prev->getOpcode() == ARM::tPOP);
  • assert(MBB.succ_size() == 1);
  • if ((*MBB.succ_begin())->begin()->getOpcode() == ARM::tBX_RET)
  • MBBI = MBBI_prev; // Replace the final tPOP with a tPOP_RET.
  • else
  • CanRestoreDirectly = false;

+ if (MBB.succ_size() > 1)
+ CanRestoreDirectly = false;
+ else {
+ assert(MBB.succ_size() == 1);
+ if ((*MBB.succ_begin())->begin()->getOpcode() == ARM::tBX_RET)
+ MBBI = MBBI_prev; // Replace the final tPOP with a tPOP_RET.
+ else
+ CanRestoreDirectly = false;
+ }

  }
}

@@ -531,7 +536,8 @@ bool Thumb1FrameLowering::emitPopSpecialFixUp(MachineBasicBlock &MBB,

                     .addReg(PopReg, RegState::Kill));
}
  • if (MBBI == MBB.end() || MBBI->getOpcode() == ARM::tB) {

+ if (MBBI == MBB.end() || MBBI->getOpcode() == ARM::tB ||
+ MBBI->getOpcode() == ARM::tBcc) {

auto Pop = MBBI;
Pop--;
assert(Pop->getOpcode() == ARM::tPOP);

Still some remaining issues... it does not seem you can assume that the instruction preceding the branch is the POP instruction. Looks like we just need to search back in the block for a POP that has LR as operand and then remove the operand (like I had in my original patch).

See the piece of code that is triggering a new assertion in Thumb1FrameLowering.cpp:543:

tPOP pred:14, pred:%noreg, %R4<def>, %R5<def>, %R7<def>, %LR<def>, %SP<imp-def>, %SP<imp-use>
 %R12<def> = tMOVr %R0<kill>, pred:14, pred:%noreg
 tBcc <BB#4>, pred:1, pred:%CPSR<kill>
tyomitch updated this revision to Diff 43606.Dec 24 2015, 7:19 AM

Ana, thank you for these new test cases.

I think it's a better idea to avoid generating pop {lr} in Thumb1 epilogues than to go hunting around for them later; so I'm updating the patch to do just that.

Hi Artyom,

So far my tests are passing. I think you can proceed with merging the latest patch.

Thanks for the help!

This revision was automatically updated to reflect the committed changes.