This had been broken for a very long time, but nobody noticed until
D14357 enabled shrink-wrapping by default.
Details
Diff Detail
Event Timeline
lib/Target/ARM/Thumb1FrameLowering.cpp | ||
---|---|---|
413 | This must have been a leftover from some ARM lowering code: hasV4TOps() and hasV5TOps() have no relevance to Thumb1 epilogue generation, but this code used to prevent the "pop special fix-up" when it was necessary. | |
434 | tPOP_RET is a return instruction, too; but it doesn't need a conversion to tPOP_RET, so we can return early. | |
462 | If MBBI is the first instruction in the BB, then the last iteration of the while decrements InstUpToMBBI past the first instruction, crashing with an assertion failure. | |
test/CodeGen/Thumb/thumb-shrink-wrapping.ll | ||
42 | pop lr is illegal in Thumb1; in fact, llc -filetype=obj emitted pop r7 instead of this instruction, obviously unbalancing the stack. |
lib/Target/ARM/Thumb1FrameLowering.cpp | ||
---|---|---|
413 | IIRC, this check was to make sure that we didn't emit a pop {r[0-9], pc} on v4t, and expect it to change the thumb bit accordingly (because it can't). | |
517 | Should assert that findRegisterDefOperandIdx didn't return -1. | |
520 | Would this be better written as: Pop = AddDefaultPred(BuildMI(MBB, MBBI, dl, TII.get(ARM::tPOP))) .addReg(PopReg, RegState::Define); | |
test/CodeGen/Thumb/thumb-shrink-wrapping.ll | ||
43 | Is the next instruction a bx lr? If so, it'd be better to emit: blx r1 on v5t+, and: mov lr, r1 bx lr on v4t. |
lib/Target/ARM/Thumb1FrameLowering.cpp | ||
---|---|---|
413 | If so, it seems it didn't have any regression tests -- so I'm not sure if my patch may lead to generation of the pop pc on v4T, or if it may not. | |
517 | RemoveOperand asserts this the first thing it does. | |
520 | No, because Pop may correspond either to the MachineInstrBuilder from line 519, or to the pre-existing tPOP from line 515. | |
test/CodeGen/Thumb/thumb-shrink-wrapping.ll | ||
43 |
Yes it is; but it's also accessible via a by-pass. Perhaps it would be beneficial to transform a pop {r7, r1} mov lr, r1 label: bx lr into pop {r7, pc} label: bx lr --but that requires a more sophisticated analysis, reaching outside the BB with the epilogue, which emitPopSpecialFixUp() doesn't (and didn't) attempt to do. |
lib/Target/ARM/Thumb1FrameLowering.cpp | ||
---|---|---|
413 | test/CodeGen/ARM/thumb1_return_sequence.ll should have been testing it... Looking at the svn blame, I think r242714 broke it. I've replied on that commit's email. | |
520 | Oh, I see. | |
test/CodeGen/Thumb/thumb-shrink-wrapping.ll | ||
43 | Would you mind leaving a FIXME in emitPopSpecialFixUp explaining that we have to emit it that way for v4t, but for v5t+ it could be optimized to the shorter sequence? |
Would you mind leaving a FIXME in emitPopSpecialFixUp
I can do better, and actually implement that missing optimization there :-)
This ^ does not look correct for v4t, which doesn't support changing the thumb
bit via a pop into pc.
That's right, it isn't -- same as thumb1_return_sequence.ll is incorrect for v4T.
I have a fix for v4T locally, but perhaps we can commit this fix (for v5T, broken since last week) now, and the fix for v4T (broken since July) in a separate, follow-up commit?
The two fixes are more or less independent.
lib/Target/ARM/Thumb1FrameLowering.cpp | ||
---|---|---|
411–413 | Add a comment saying that LR cannot be encoded with Thumb1, i.e., it requires a special fix-up. | |
test/CodeGen/Thumb/pop-special-fixup.ll | ||
3 ↗ | (On Diff #41198) | Is there a chance this test could be merged with the shrink-wrapping one since it explicitly checks with shrink-wrapping enabled? If that is the case, change the name of the function from test to popSpecialFixup. |
8 ↗ | (On Diff #41198) | Please use opt -instnamer to get rid of the %num variables. |
test/CodeGen/Thumb/pop-special-fixup.ll | ||
---|---|---|
3 ↗ | (On Diff #41198) | I will merge the two test files in the following patch, which will address v4T specifically. Currently, they are separate because thumb-shrink-wrapping.ll only tests v4T, and this one only tests v6M. |
Quentin, thank you for the helpful suggestions!
I've incorporated them into the new revision of my patch.
LGTM too modulo the use of instnamer on the test case to get rid of the %[0-9]+ variables.
Cheers,
Q.
This must have been a leftover from some ARM lowering code: hasV4TOps() and hasV5TOps() have no relevance to Thumb1 epilogue generation, but this code used to prevent the "pop special fix-up" when it was necessary.