The low-overhead branch extension provides a loop-end 'LE' instruction that performs neither a decrement nor a compare; it just jumps backwards. This patch modifies the constant islands pass to try to insert an LE instruction in place of a Thumb2 conditional branch, instead of shrinking that branch. This only happens if a cmp can be converted to a cbz/cbnz and used to exit the loop.
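As a rough illustration of the rewrite described above, here is a self-contained C++ sketch on a toy instruction list. It is not the patch's code: `Op`, `Inst`, and `tryInsertLE` are hypothetical names, and the sketch assumes the block ends in a compare against zero followed by a conditional branch back to the loop header. The real pass works on machine instructions and also has to check branch ranges and register constraints, which are ignored here.

```cpp
#include <string>
#include <vector>

// Toy opcodes; hypothetical stand-ins for the real Thumb2 opcodes.
enum class Op { CmpZero, BccNE, BccEQ, CBZ, CBNZ, LE, Other };

struct Inst {
  Op Opcode;
  std::string Target; // branch target label, empty for non-branches
};

// If the block ends in "cmp rN, #0; b<cond> LoopHeader", rewrite the pair as
// "cb(n)z rN, Exit; le LoopHeader". Returns true if the rewrite was applied.
bool tryInsertLE(std::vector<Inst> &Block, const std::string &LoopHeader,
                 const std::string &Exit) {
  if (Block.size() < 2)
    return false;
  Inst &Cmp = Block[Block.size() - 2];
  Inst &Br = Block.back();
  if (Cmp.Opcode != Op::CmpZero || Br.Target != LoopHeader)
    return false;

  // The exit branch uses the opposite condition of the loop branch:
  // looping while the register is non-zero means exiting when it is zero.
  Op ExitOp;
  if (Br.Opcode == Op::BccNE)
    ExitOp = Op::CBZ;
  else if (Br.Opcode == Op::BccEQ)
    ExitOp = Op::CBNZ;
  else
    return false;

  Cmp = Inst{ExitOp, Exit};     // the compare becomes the loop exit
  Br = Inst{Op::LE, LoopHeader}; // the conditional branch becomes the LE
  return true;
}
```

Calling `tryInsertLE` on a block ending in a compare with zero and a "branch if not equal" to the header leaves a CBZ to the exit followed by an LE to the header; any other shape is left untouched.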
Repository: rL LLVM
Event Timeline
The piece that I know is missing is a check that we're not creating nested LE loops, which we want to avoid.
Nice. I was expecting this to happen earlier, maybe in the other branch optimisations. This seems like a good place for it, though.
What does it do for codesize?
lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
1935 | (On Diff #219555) | for (ImmBranch &Br : reverse(ImmBranches)) |
1986 | (On Diff #219555) | Can you explain this? Is it that we know that this is the LE case, and the other br at the end of the block is being deleted? Can you add a comment? |
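The suggestion on line 1935 is a range-based loop over the branches in reverse order. A minimal standard-C++ sketch of that pattern follows (C++14 or later, for the deduced return types). The `reverseRange` adaptor, the single-field `ImmBranch`, and `visitBackwards` are hypothetical stand-ins written for this example; they are not LLVM's own definitions.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the pass's branch record.
struct ImmBranch {
  std::string Name;
};

// Minimal adaptor that lets a range-based for loop walk a container backwards.
template <typename Container> struct ReverseRange {
  Container &C;
  auto begin() { return C.rbegin(); }
  auto end() { return C.rend(); }
};

template <typename Container>
ReverseRange<Container> reverseRange(Container &C) {
  return {C};
}

// Visits the branches from last to first and records the visiting order.
std::vector<std::string> visitBackwards(std::vector<ImmBranch> &Branches) {
  std::vector<std::string> Order;
  for (ImmBranch &Br : reverseRange(Branches))
    Order.push_back(Br.Name);
  return Order;
}
```

Compared with an explicit reverse-iterator loop, this keeps the loop header short and hands the body a plain reference to each element.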
lib/Target/ARM/ARMConstantIslandPass.cpp | ||
---|---|---|
1986 | (On Diff #219555) | That's it. But now looking at it again, I'm not sure if Br.MI could actually be LastMI too... |
As far as I can tell, code size will always be worse: instead of generating just a cbz/cbnz, we also need the T2 LE instruction. I'll add a check to prevent the optimisation at minsize.
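To make the size argument concrete, here is a small hedged C++ sketch. The byte counts follow the encodings (cbz/cbnz is a 16-bit instruction, LE is a 32-bit one). The baseline it compares against, a 16-bit compare with zero plus a 16-bit conditional branch, is an assumption chosen for illustration. `FunctionAttrs` and `shouldTryLE` are hypothetical names, not the pass's API.

```cpp
// Encoding sizes in bytes.
constexpr unsigned CmpSize = 2;       // 16-bit cmp rN, #0 on a low register
constexpr unsigned NarrowBccSize = 2; // 16-bit conditional branch
constexpr unsigned CBZSize = 2;       // cbz/cbnz only have a 16-bit encoding
constexpr unsigned LESize = 4;        // LE is a 32-bit instruction

// Hypothetical summary of what the gate needs to know about a function.
struct FunctionAttrs {
  bool MinSize; // function is being optimised for minimum size
  bool HasLOB;  // target has the low-overhead branch extension
};

// Only try the LE rewrite when the extension exists and size is not the
// priority.
bool shouldTryLE(const FunctionAttrs &F) { return F.HasLOB && !F.MinSize; }

// Loop-end sequence after the rewrite: cb(n)z to the exit, then LE.
constexpr unsigned bytesWithLE() { return CBZSize + LESize; }

// Assumed baseline: compare with zero, then a shrunk conditional branch.
constexpr unsigned bytesWithNarrowBranch() { return CmpSize + NarrowBccSize; }
```

Under that assumed baseline the LE form costs two more bytes per loop, which is why the gate turns the rewrite off at minsize.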
- Now allowing CBZ and CBNZ, had got myself a bit confused there...
- Added comment.
- Checking terminator before removing it.
- Added more tests.
I'd say that it's sub-optimal... I don't know how often the issue will arise and whether that justifies using LoopInfo here. I'm also unsure how that extra logic would fit in with the current optimisation as we'd want to visit inner loops first, instead of walking backwards.
Yeah, I agree. I don't think it should cause too much trouble to have nested LEs; it would just invalidate the loop info on each outer iteration.
LGTM.