This patch fixes pr23772  [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
- Buildable 41621 - Build 41877: arc lint + arc unit 
Event Timeline
| llvm/test/MC/ARM/invalid-addsub.s | ||
|---|---|---|
| 17–20 | Ups, missing ones. Will fix. | |
Not that trivial, the adr bug only appears after splitting t2(sub|add)ri. And the hardcoded emission of t2 register +/- immediate needs to be fixed once we split the instructions for not emitting invalid instructions. As well the disassembly part fails if done apart.
| llvm/lib/Target/ARM/ARMAsmPrinter.cpp | ||
|---|---|---|
| 1173 | I'm a little surprised the ri12 variants were missing here. I guess the frame setup code doesn't use them. Still fine to add, though. | |
| llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp | ||
| 3285 | Do we need to care about the source register here? t2ADDrr doesn't restrict it. (I guess that might also be a bug?) | |
| llvm/lib/Target/ARM/ARMFrameLowering.cpp | ||
| 1531 | We can't generate t2ADDspImm with a frame index; that wouldn't make any sense. | |
| llvm/lib/Target/ARM/ARMInstrThumb2.td | ||
| 2876 | This pattern is dead code. Same for the other new patterns. | |
| 4859 | Need testcases for all these aliases. | |
| llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp | ||
| 711 | Not your patch, but the way Thumb1 is handling SP here doesn't make any sense. Maybe add a FIXME? | |
| llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
| 7728 | Is this Error actually reachable? If it is, please add a testcase. | |
| 9785 | Please don't copy-paste code. | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 6630 | Please don't copy-paste code. | |
| llvm/test/CodeGen/Thumb2/mve-stacksplot.mir | ||
| 119 | This test should probably be using CHECK-next to make it clear what's happening. (Please commit separately.) Is it necessary to change the instruction sequence in this patch? I'd prefer to split the optimization into a separate patch. | |
| llvm/test/MC/Disassembler/ARM/thumb-tests.txt | ||
| 285 | Why does this need to change? | |
| llvm/test/MC/Disassembler/ARM/thumb2.txt | ||
| 94 | What part of the patch changes the preferred disassembly here? Can it be split into a separate patch? | |
| llvm/lib/Target/ARM/ARMInstrThumb2.td | ||
|---|---|---|
| 925–938 | This duplicates a lot of what's in T2sTwoRegImm, and also incorrectly forces bit 20 (which encodes the 's' bit) to 0 causing adds to be encoded as add. I think what should be here is the same as the ri variant, but with "let Rn = 13; let Rd = 13;" at the top. | |
| 987 | Should comment that this is the S bit, for consistency with ri12 variant above. | |
| llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp | ||
|---|---|---|
| 3285 | From my tests, rr does block writing to SP and not reading from it: ./llvm-mc --assemble  -triple=thumbv7-apple-darwin9 -mcpu=cortex-a9 --show-encoding <<< "add.w sp, r0, r1"
        .section        __TEXT,__text,regular,pure_instructions
<stdin>:1:7: error: source register must be sp if destination is sp
add.w sp, r0, r1In that case, just checking that the destination is SP is enough. | |
| llvm/lib/Target/ARM/ARMInstrThumb2.td | ||
| 925–938 | Is there a special way to do those let? From the ones I managed to compile, using let Rn = 13 in let Rd = 13 in def spImm... I kept getting error: Duplicate predicate in FastISel table!. So I reduced as much as possible, fixing not setting the S bit (20). | |
| llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
| 9785 | I'm guessing you are speaking about the ADD and SUB joining in a single case statement, right? | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 6630 | Again, not quite sure, but guessing I can reduce the if/else common parts. | |
| llvm/test/MC/Disassembler/ARM/thumb-tests.txt | ||
| 285 | That was just a mistake in submitting this patch, wanted to investigate why this test keeps warning, as this (and printing a instruction of different encoding): llvm-mc --disassemble --show-encoding -triple=thumbv7-apple-darwin9 -mcpu=cortex-a9 <<< "0x1 0xea 0xfa 0x95"
<stdin>:1:1: warning: potentially undefined instruction encoding
0x1 0xea 0xfa 0x95
^
        and.w   r5, r1, r10, ror #7     @ encoding: [0x01,0xea,0xfa,0x15] | |
| llvm/test/MC/Disassembler/ARM/thumb2.txt | ||
| 94 | These broke as soon as I've changed the table-gen, will try to pin-point what change it and see if can be done into other patch. Most unlikely, as the adr disassembly part was never used. | |
| llvm/lib/Target/ARM/ARMInstrThumb2.td | ||
|---|---|---|
| 925–938 | Never-mind, I just did set them inside the definition and it works. | |
| llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp | ||
|---|---|---|
| 3285 | We might need to split t2ADDrr eventually, but I guess this is fine for now. | |
| llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
|---|---|---|
| 9785 | I was more thinking that the t2ADDri12 and t2ADDspImm12 handling are basically identical. But I guess add/sub are also almost identical. | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 6630 | Nevermind; I assumed you copy-pasted this without really checking. Why do we need a C++ DecoderMethod for t2ADDspImm, when we don't need one for t2ADDri? | |
| llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp | ||
|---|---|---|
| 711 | What do you mean with the SP handling of thumb1? There is a section just below handling if (BaseOpc == ARM::tADDrSPi). | |
| llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
| 7728 | Indeed it does not make any sense. Getting rid of it. | |
| 9785 | Fair enough. Will join all 4 in a single case. | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 6630 | If I don't use a custom decoder, the disassembly of the instruction 0x0d 0xf1 0x00 0x0d (should disassemble as add.w sp, sp, #0) is matched as a ADR, t2ADR in the generated build/lib/Target/ARM/ARMGenAsmWriter.inc. I'm not fully aware of why yet, probably the same reason why ADR was being decoded as SUB? /work/bf/LLVM/build/bin/llvm-mc -triple=thumbv7-apple-darwin -mcpu=cortex-a8 -disassemble < /tmp/a
        .section        __TEXT,__text,regular,pure_instructions
        adds.w  r1, r2, #496
        addllvm-mc: /work/bf/LLVM/src/llvm/include/llvm/ADT/SmallVector.h:153: const T& llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::operator[](llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::size_type) const [with T = llvm::MCOperand; <template-parameter-1-2> = void; llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::const_reference = const llvm::MCOperand&; llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::size_type = long unsigned int]: Assertion `idx < size()' failed.
Stack dump:
0.      Program arguments: /work/bf/LLVM/build/bin/llvm-mc -triple=thumbv7-apple-darwin -mcpu=cortex-a8 -disassemble 
 #0 0x00007f342f10611d llvm::sys::PrintStackTrace(llvm::raw_ostream&) /work/bf/LLVM/src/llvm/lib/Support/Unix/Signals.inc:548:22
 #1 0x00007f342f1061b0 PrintStackTraceSignalHandler(void*) /work/bf/LLVM/src/llvm/lib/Support/Unix/Signals.inc:609:1
 #2 0x00007f342f103fe0 llvm::sys::RunSignalHandlers() /work/bf/LLVM/src/llvm/lib/Support/Signals.cpp:68:20
 #3 0x00007f342f105a9c SignalHandler(int) /work/bf/LLVM/src/llvm/lib/Support/Unix/Signals.inc:390:1
 #4 0x00007f342e5b14b0 (/lib/x86_64-linux-gnu/libc.so.6+0x354b0)
 #5 0x00007f342e5b1428 raise /build/glibc-LK5gWL/glibc-2.23/signal/../sysdeps/unix/sysv/linux/raise.c:54:0
 #6 0x00007f342e5b302a abort /build/glibc-LK5gWL/glibc-2.23/stdlib/abort.c:91:0
 #7 0x00007f342e5a9bd7 __assert_fail_base /build/glibc-LK5gWL/glibc-2.23/assert/assert.c:92:0
 #8 0x00007f342e5a9c82 (/lib/x86_64-linux-gnu/libc.so.6+0x2dc82)
 #9 0x00007f343641da0f llvm::SmallVectorTemplateCommon<llvm::MCOperand, void>::operator[](unsigned long) const /work/bf/LLVM/src/llvm/include/llvm/ADT/SmallVector.h:154:19
#10 0x00007f343641d851 llvm::MCInst::getOperand(unsigned int) const /work/bf/LLVM/src/llvm/include/llvm/MC/MCInst.h:180:71
#11 0x00007f34364199e0 llvm::ARMInstPrinter::printSBitModifierOperand(llvm::MCInst const*, unsigned int, llvm::MCSubtargetInfo const&, llvm::raw_ostream&) /work/bf/LLVM/src/llvm/lib/Target/ARM/MCTargetDesc/ARMInstPrinter.cpp:997:35
#12 0x00007f3436408771 llvm::ARMInstPrinter::printInstruction(llvm::MCInst const*, llvm::MCSubtargetInfo const&, llvm::raw_ostream&) /work/bf/LLVM/build/lib/Target/ARM/ARMGenAsmWriter.inc:9164:26
#13 0x00007f34364164ba llvm::ARMInstPrinter::printInst(llvm::MCInst const*, llvm::raw_ostream&, llvm::StringRef, llvm::MCSubtargetInfo const&) /work/bf/LLVM/src/llvm/lib/Target/ARM/MCTargetDesc/ARMInstPrinter.cpp:307:18
#14 0x00007f342f913b4c llvm::MCTargetStreamer::prettyPrintAsm(llvm::MCInstPrinter&, llvm::raw_ostream&, llvm::MCInst const&, llvm::MCSubtargetInfo const&) /work/bf/LLVM/src/llvm/lib/MC/MCStreamer.cpp:983:1
#15 0x00007f342f8933a9 (anonymous namespace)::MCAsmStreamer::EmitInstruction(llvm::MCInst const&, llvm::MCSubtargetInfo const&) /work/bf/LLVM/src/llvm/lib/MC/MCAsmStreamer.cpp:1947:40
#16 0x0000000000431159 PrintInsts(llvm::MCDisassembler const&, std::pair<std::vector<unsigned char, std::allocator<unsigned char> >, std::vector<char const*, std::allocator<char const*> > > const&, llvm::SourceMgr&, llvm::raw_ostream&, llvm::MCStreamer&, bool, llvm::MCSubtargetInfo const&) /work/bf/LLVM/src/llvm/tools/llvm-mc/Disassembler.cpp:73:7
#17 0x0000000000431a62 llvm::Disassembler::disassemble(llvm::Target const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, llvm::MCSubtargetInfo&, llvm::MCStreamer&, llvm::MemoryBuffer&, llvm::SourceMgr&, llvm::MCContext&, llvm::raw_ostream&, llvm::MCTargetOptions const&) /work/bf/LLVM/src/llvm/tools/llvm-mc/Disassembler.cpp:197:34
#18 0x00000000004190fd main /work/bf/LLVM/src/llvm/tools/llvm-mc/llvm-mc.cpp:521:36
#19 0x00007f342e59c830 __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:325:0
#20 0x0000000000416f29 _start (/work/bf/LLVM/build/bin/llvm-mc+0x416f29)
Aborted (core dumped) | |
| llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp | ||
|---|---|---|
| 711 | I mean that we try to use tSUBi8 on SP. But I guess that can't actually happen. If "Offset < 0" is true, we can't be in Thumb1 mode because there aren't any load/store instructions with negative offsets. So the "Base != ARM::SP" check here is unnecessary. | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 6630 | I don't follow why it's crashing; if it matched adr, it would be trying to print an adr, not an add. The immediate cause of the crash is that the printer is expecting an operand to represent the "s" bit, and isn't finding it. But ARM::t2ADDspImm should have an operand to represent the "s" bit. I guess there's a sort of weird overlap here; DecoderGPRRegisterClass returns SoftFail where it should actually be hard-failing. So the ri/ri12 variants actually match in cases where we don't want them to. I would have expected that to mean you need a decoder for the ri/ri12 variants, not the sp variants, though, and I don't think it would cause a crash. | |
| llvm/lib/Target/ARM/ARMLoadStoreOptimizer.cpp | ||
|---|---|---|
| 711 | Got it. Adding a FIXME here for it. | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 6630 | Changing DecoderGPRRegisterClass to return a MCDisassembler::Fail breaks some tests. Would it be ok if I keep the custom decoder for now, and add a FIXME to the DecoderGPRRegisterClass? | |
Added fixme to ARMLoadStoreOptimizer due negative offset Added test-cases for all added (and fixed existing) thumb2 add/sub alias
Removed dead/wrong tablegen t2InstSubst, defining a t2ADDri12 using t2_so_imm_neg instead of imm0_4095_neg
| llvm/lib/Target/ARM/ARMInstrThumb2.td | ||
|---|---|---|
| 2365 | These patterns are unreachable. | |
| llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
| 9848 | Is this new behavior? Or was this handled elsewhere before, somehow? | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 5612 | I assume this is supposed to reject adr sp, #label etc. Is this new behavior? If it is, can you split it into a separate patch? Not sure what the Inst.getNumOperands() check is supposed to be doing. | |
| 5621 | This could probably use a comment. It looks like it's handling sub r0, pc, #0? (That should actually be a valid instruction, as far as I know.) | |
| llvm/lib/Target/ARM/AsmParser/ARMAsmParser.cpp | ||
|---|---|---|
| 9848 | It is not a new behavior, it was handled elsewhere. The only reference I can find about such conversions is in Thumb2SizeReduce::ReduceSpecial running llvm-mc -triple=thumbv7 -show-encoding <<< "add sp, #508" llvm_currently add sp, #508 @ encoding: [0x7f,0xb0] this_patch_without_this_part add.w sp, sp, #508 @ encoding: [0x0d,0xf5,0xfe,0x7d] the_entire_patch add sp, #508 @ encoding: [0x7f,0xb0] Perhaps we might lose the optimization when we obtain a node that is a t2ADDspImm that could be converted to t1, but I prefer to leave that to another patch. | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 5612 | About the Inst.getNumOperands(), I was confused if Inst was empty here or not, as some case statements create new instructions instead of using this one. I replaced it by an assert that Inst should be empty. Yes indeed it is a change of behavior. It will fail to accept sp to thumbv7. Currently we have the same warning for both thumbv7 and thumbv8 when doing: <stdin>:1:1: warning: potentially undefined instruction encoding
0x0f,0xf2,0x08,0x0d
^
        addw    sp, pc, #8              @ encoding: [0x0f,0xf2,0x08,0x0d]After the patch, it will stop warning for thumbv8 and will hard-fail for thumbv7. (indeed, it should not accept the softfail given by DecoderGPRRegisterClass ). I can't move this changes to a distinct patch, as this code is not even executed currently. When I create the spImm variants in table-gen is when this decoder is actually used. Before, the instructions are either decoded as addw or subw, never as adr.w. | |
| 5621 | Indeed it is a perfectly valid instruction. Is the singular case where (following the ARMv7-M Architecture Reference Manual) that the ADR.w Encoding T2 with offset zero is decoded as a sub. Here we add the pc operand to the operation. Indeed, it should use the function DecodeGPRRegisterClass, not DecoderGPRRegisterClass. So we will preserve the current behavior when doing: The changes appear when the offset is not zero, such as: "0xaf 0xf2 0x01 0x00" currently subw r0, pc, #1 @ encoding: [0xaf,0xf2,0x01,0x00] will_be adr.w r0, #-1 @ encoding: [0xaf,0xf2,0x01,0x00] | |
Requested fixes:
- Removed unreachable patterns
- Added v8 and v7 tests showing different decoder behavior of adr.w writting to sp ( and fixed the code by rejecting softfail )
- Added comment that adr.w with negative zero offset is decoded as subw .. pc, 0
| llvm/lib/Target/ARM/ARMInstrThumb2.td | ||
|---|---|---|
| 929 | "let Inst{26} = imm{11};" This looks suspicious. | |
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
| 5609 | Extra parentheses. | |
| 5612 | Okay, makes sense. Not sure why you need to hard-fail instead of soft-fail here for thumbv7; does something else break? (Please add a brief comment explaining.) | |
| llvm/test/MC/Disassembler/ARM/thumb2-v8.txt | ||
| 43 ↗ | (On Diff #236784) | Whitespace. | 
| llvm/lib/Target/ARM/Disassembler/ARMDisassembler.cpp | ||
|---|---|---|
| 5612 | Indeed it won't break, is just that I didn't realize that the standard was to emit a warning, instead of an error. Fixing it. | |
Seems like a peephole optimizer inverting the two operands, without validating their register types.
- Fixed the destination register class when converting ADDrr to ADDri or ADDspImm. Added test
- Fixed the number of arguments when converting t2SUBspImm to tSUBspi by adding a register at the end. Else, it would break llvm-mca that expects the number of operands to follow the table-gen, even when not used by the instruction.
I'm a little surprised the ri12 variants were missing here. I guess the frame setup code doesn't use them. Still fine to add, though.