LLVM normally uses only low registers in Thumb1, and Thumb1InstrInfo::storeRegToStackSlot() and loadRegFromStackSlot() can currently store and restore only those. In rare cases, however, the register allocator may also need to spill a high register in the middle of a function.
Example:
  $ cat test.c
  void constraint_h(void) {
    int i;
    asm volatile("@ %0" : : "h" (i) : "r12");
  }
  $ clang -target arm-none-eabi -march=armv6-m -c test.c
  clang-7: [...]/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp:85: virtual void llvm::Thumb1InstrInfo::storeRegToStackSlot(llvm::MachineBasicBlock&, llvm::MachineBasicBlock::iterator, unsigned int, bool, int, const llvm::TargetRegisterClass*, const llvm::TargetRegisterInfo*) const: Assertion `(RC == &ARM::tGPRRegClass || (TargetRegisterInfo::isPhysicalRegister(SrcReg) && isARMLowRegister(SrcReg))) && "Unknown regclass!"' failed.
  [...]
The program is compiled at -O0, so the fast register allocator (RegAllocFast) is used. The following happens in this case:
- Prior to register allocation, MIR looks as follows:
    Frame Objects:
      fi#0: size=4, align=4, at location [SP]

    bb.0.entry:
      %1:tgpr = tLDRspi %stack.0.i, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i)
      %0:hgpr = COPY %1:tgpr
      INLINEASM &"@ $0" [sideeffect] [attdialect], $0:[reguse:hGPR], %0:hgpr, $1:[clobber], implicit-def early-clobber $r12, !3
      tBX_RET 14, $noreg
- Fast Register Allocator first satisfies %0:hgpr by selecting r12.
- When the scan reaches the INLINEASM instruction, the allocator notices that r12 is clobbered and must therefore be spilled.
- The allocator calls Thumb1InstrInfo::storeRegToStackSlot() to store the register in a stack slot but the method does not know how to do it and aborts. This can also result in a miscompilation if LLVM is built without assertions enabled.
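The precondition that trips in the last step can be modelled outside LLVM as a small predicate. This is only a sketch of the condition quoted in the assertion message: the enum and the function names canStoreDirectly and isThumbLowReg are hypothetical stand-ins and not LLVM's API, and registers are represented by their plain numbers (r0-r15).

```cpp
#include <cassert>

// Hypothetical stand-ins for the register classes named in the report.
enum class RegClass { tGPR, hGPR };

// Thumb1 low registers are r0-r7; r8-r15 are high registers.
bool isThumbLowReg(unsigned regNo) {
  assert(regNo <= 15 && "core registers are r0-r15");
  return regNo <= 7;
}

// Mirrors the asserted condition: a direct store is accepted only for the
// low register class, or for a physical register that is a low register.
bool canStoreDirectly(RegClass rc, bool isPhysical, unsigned regNo) {
  return rc == RegClass::tGPR || (isPhysical && isThumbLowReg(regNo));
}
```

With r12 allocated to %0:hgpr, neither half of the condition holds, which corresponds to the assertion failure shown in the example.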
Both the store and the load of a high register in Thumb1 need an additional low register. For instance, the store is implemented as:
  mov %lowReg, %spilledHighReg
  str %lowReg, ...
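As a rough illustration of the sequence above, the following standalone sketch prints the Thumb1 instructions for spilling and reloading a register through a low scratch register. It is not code from the patch: the function names spillToStack and reloadFromStack, the string-based output, and the assumption that the caller already holds a free low scratch register are all made up for the example. The reload order (ldr into the low register, then mov into the high one) is assumed to be the mirror image of the store.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: format a core register by number, e.g. 12 -> "r12".
static std::string regName(unsigned regNo) {
  return "r" + std::to_string(regNo);
}

// Emit the store of srcReg to [sp, #offset]. A high register (r8 and up)
// is first copied into the low scratch register, because the Thumb1
// SP-relative str only accepts a low register.
std::vector<std::string> spillToStack(unsigned srcReg, unsigned scratchLowReg,
                                      unsigned offset) {
  assert(scratchLowReg <= 7 && "scratch must be a low register");
  assert(offset % 4 == 0 && offset <= 1020 && "SP-relative word offset");
  std::vector<std::string> seq;
  unsigned storeReg = srcReg;
  if (srcReg > 7) {
    seq.push_back("mov " + regName(scratchLowReg) + ", " + regName(srcReg));
    storeReg = scratchLowReg;
  }
  seq.push_back("str " + regName(storeReg) + ", [sp, #" +
                std::to_string(offset) + "]");
  return seq;
}

// Emit the reload of dstReg from [sp, #offset], going through the low
// scratch register when dstReg is a high register.
std::vector<std::string> reloadFromStack(unsigned dstReg,
                                         unsigned scratchLowReg,
                                         unsigned offset) {
  assert(scratchLowReg <= 7 && "scratch must be a low register");
  assert(offset % 4 == 0 && offset <= 1020 && "SP-relative word offset");
  std::vector<std::string> seq;
  unsigned loadReg = dstReg > 7 ? scratchLowReg : dstReg;
  seq.push_back("ldr " + regName(loadReg) + ", [sp, #" +
                std::to_string(offset) + "]");
  if (dstReg > 7)
    seq.push_back("mov " + regName(dstReg) + ", " + regName(loadReg));
  return seq;
}
```

For example, spillToStack(12, 3, 4) yields "mov r3, r12" followed by "str r3, [sp, #4]", while a low register such as r2 needs only the single str. The hard part, which this sketch sidesteps, is guaranteeing that a low register is actually free at the spill point.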
An initial patch in this review extended Thumb1InstrInfo::storeRegToStackSlot() and loadRegFromStackSlot() to allow storing and restoring high registers by inserting a pseudo-instruction that was lowered after register allocation in ThumbRegisterInfo::eliminateFrameIndex(). That approach relied on the register scavenger to secure a low register for the sequence, which is potentially problematic under high register pressure because ThumbRegisterInfo::saveScavengerRegister() currently also tries to use the high register r12.
The current patch instead extends RegAllocFast and InlineSpiller to handle a spill that needs an intermediary register directly.
This comment isn't really right. It's still worth using the high registers, with appropriate cost constraints: there are a few important instructions which can take high registers as inputs (cmp, add, bx/blx), and even if we're just effectively using them as spill slots, it's cheaper than spilling to the stack.