This is an archive of the discontinued LLVM Phabricator instance.

Fix Thumb1 epilogue generation
ClosedPublic

Authored by tyomitch on Nov 25 2015, 5:46 AM.

Details

Summary

This had been broken for a very long time, but nobody noticed until
D14357 enabled shrink-wrapping by default.

Diff Detail

Event Timeline

tyomitch updated this revision to Diff 41136.Nov 25 2015, 5:46 AM
tyomitch retitled this revision from to Fix Thumb1 epilogue generation.
tyomitch updated this object.
tyomitch added reviewers: qcolombet, jroelofs.
tyomitch added a subscriber: llvm-commits.
tyomitch added inline comments.Nov 25 2015, 5:58 AM
lib/Target/ARM/Thumb1FrameLowering.cpp
413

This must have been a leftover from some ARM lowering code: hasV4TOps() and hasV5TOps() have no relevance to Thumb1 epilogue generation, but this code used to prevent the "pop special fix-up" when it was necessary.

434

tPOP_RET is a return instruction, too; but it doesn't need a conversion to tPOP_RET, so we can return early.

462

If MBBI is the first instruction in the BB, then the last iteration of the while decrements InstUpToMBBI past the first instruction, crashing with an assertion failure.

test/CodeGen/Thumb/thumb-shrink-wrapping.ll
42

pop lr is illegal in Thumb1; in fact, llc -filetype=obj emitted pop r7 instead of this instruction, obviously unbalancing the stack.

jroelofs added inline comments.Nov 25 2015, 7:22 AM
lib/Target/ARM/Thumb1FrameLowering.cpp
413

IIRC, this check was to make sure that we didn't emit a pop {r[0-9], pc} on v4t, and expect it to change the thumb bit accordingly (because it can't).

534

Should assert that findRegisterDefOperandIdx didn't return -1.

537

Would this be better written as:

Pop = AddDefaultPred(BuildMI(MBB, MBBI, dl, TII.get(ARM::tPOP)))
          .addReg(PopReg, RegState::Define);
test/CodeGen/Thumb/thumb-shrink-wrapping.ll
43

Is the next instruction a bx lr?

If so, it'd be better to emit:

blx r1

on v5t+, and:

mov lr, r1
bx lr

on v4t.

tyomitch added inline comments.Nov 25 2015, 10:41 AM
lib/Target/ARM/Thumb1FrameLowering.cpp
413

If so, it seems it didn't have any regression tests -- so I'm not sure if my patch may lead to generation of the pop pc on v4T, or if it may not.

534

RemoveOperand asserts this the first thing it does.

537

No, because Pop may correspond either to the MachineInstrBuilder from line 519, or to the pre-existing tPOP from line 515.

test/CodeGen/Thumb/thumb-shrink-wrapping.ll
43

Is the next instruction a bx lr?

Yes it is; but it's also accessible via a by-pass.

Perhaps it would be beneficial to transform a

  pop {r7, r1}
  mov lr, r1
label:
  bx lr

into

  pop {r7, pc}
label:
  bx lr

--but that requires a more sophisticated analysis, reaching outside the BB with the epilogue, which emitPopSpecialFixUp() doesn't (and didn't) attempt to do.

jroelofs added inline comments.Nov 25 2015, 11:25 AM
lib/Target/ARM/Thumb1FrameLowering.cpp
413

test/CodeGen/ARM/thumb1_return_sequence.ll should have been testing it...

Looking at the svn blame, I think r242714 broke it. I've replied on that commit's email.

537

Oh, I see.

test/CodeGen/Thumb/thumb-shrink-wrapping.ll
43

Would you mind leaving a FIXME in emitPopSpecialFixUp explaining that we have to emit it that way for v4t, but for v5t+ it could be optimized to the shorter sequence?

tyomitch updated this revision to Diff 41198.Nov 25 2015, 4:41 PM

Would you mind leaving a FIXME in emitPopSpecialFixUp

I can do better, and actually implement that missing optimization there :-)

This ^ does not look correct for v4t, which doesn't support changing the thumb
bit via a pop into pc.

That's right, it isn't -- same as thumb1_return_sequence.ll is incorrect for v4T.

I have a fix for v4T locally, but perhaps we can commit this fix (for v5T, broken since last week) now, and the fix for v4T (broken since July) in a separate, follow-up commit?
The two fixes are more or less independent.

jroelofs accepted this revision.Nov 30 2015, 8:56 AM
jroelofs edited edge metadata.

This ^ does not look correct for v4t, which doesn't support changing the thumb
bit via a pop into pc.

That's right, it isn't -- same as thumb1_return_sequence.ll is incorrect for v4T.

I have a fix for v4T locally, but perhaps we can commit this fix (for v5T, broken since last week) now, and the fix for v4T (broken since July) in a separate, follow-up commit?
The two fixes are more or less independent.

Sounds reasonable to me.

This revision is now accepted and ready to land.Nov 30 2015, 8:56 AM
qcolombet added inline comments.Nov 30 2015, 2:34 PM
lib/Target/ARM/Thumb1FrameLowering.cpp
411

Add a comment saying that LR cannot be encoded with Thumb1, i.e., it requires a special fix-up.

test/CodeGen/Thumb/pop-special-fixup.ll
3

Is there a chance this test could be merged with the shrink-wrapping one since it explicitly checks with shrink-wrapping enabled?

If that is the case, change the name of the function from test to popSpecialFixup.

8

Please use opt -instnamer to get rid of the %num variables.

qcolombet added inline comments.Nov 30 2015, 2:35 PM
lib/Target/ARM/Thumb1FrameLowering.cpp
456

Do you actually need that loop instead of the copyImplicitOps call?
If so, add a comment explaining why.

458

Period.

tyomitch added inline comments.Dec 1 2015, 6:23 AM
test/CodeGen/Thumb/pop-special-fixup.ll
3

I will merge the two test files in the following patch, which will address v4T specifically.

Currently, they are separate because thumb-shrink-wrapping.ll only tests v4T, and this one only tests v6M.

tyomitch updated this revision to Diff 41499.Dec 1 2015, 6:25 AM
tyomitch edited edge metadata.

Quentin, thank you for the helpful suggestions!

I've incorporated them into the new revision of my patch.

qcolombet accepted this revision.Dec 1 2015, 9:29 AM
qcolombet edited edge metadata.

LGTM too modulo the use of instnamer on the test case to get rid of the %[0-9]+ variables.

Cheers,
Q.

tyomitch closed this revision.Dec 1 2015, 11:28 AM