When deciding if it is safe to optimize a conditional branch to a CBZ or CBNZ the offsets of the BasicBlocks from the start of the function are estimated. For inline assembly the generic getInlineAsmLength() function is used to get a worst case estimate of the inline assembly by multiplying the number of instructions by the max instruction size of 4 bytes. This unfortunately doesn't take into account the generation of Thumb implicit IT instructions. In edge cases such as when all the instructions in the block are 4-bytes in size and there is an implicit IT then the size is underestimated. This can cause an out of range CBZ or CBNZ to be generated.
The patch takes a conservative approach and assumes that every instruction in the inline assembly block may have an implicit IT.
Fixes pr31805 https://bugs.llvm.org/show_bug.cgi?id=31805 which seems to be intermittently causing the Linux Kernel to fail to build on Arm.
Possible alternatives:
- Be less conservative and a small constant, say 4-bytes to account for any overspill from implicit IT creation, this could still fail but would probably work well in practice.
- The Arm MaxInstLength could be increased to 6, but this may not be right for Arm as there is some code in the ConstantIslands pass that assumes 4-byte alignment of Arm instructions.
- Reimplement getInlineAsmLength() or some similar function to estimate how many implicit IT instructions are needed. Doing this accurately may require a lot of the functionality of an assembly parser though.