Don't rewrite an add instruction with two SET_CC operands into a csel instruction: the resulting instruction sequence uses an extra instruction and an extra register. Preventing this rewrite allows us to match an (add, csel) pattern and rewrite it into a cinc; see the new test case in arm64-csel.ll.
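For readers unfamiliar with the pattern, the following is a minimal sketch of the kind of source that yields an add whose two operands are both comparison results. The function name `count_true` and the choice of an unsigned and a signed comparison are assumptions for illustration (the constants 3 and 33 are borrowed from the assembly discussed in this review); this is not the test case from arm64-csel.ll.

```cpp
#include <cstdint>

// Hypothetical example: each comparison produces 0 or 1, and the two
// results are added. In the DAG this shows up as an add with two
// SET_CC operands, which is the shape this patch is concerned with.
std::uint32_t count_true(std::uint32_t a, std::int32_t b) {
    const std::uint32_t first = (a > 3u) ? 1u : 0u;   // unsigned compare
    const std::uint32_t second = (b > 33) ? 1u : 0u;  // signed compare
    return first + second;
}
```

Since the second operand is only ever 0 or 1, the add can be expressed as a conditional increment of the first comparison result.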
Should we be generating:
cmp
cset
cmp
csinc
That lets the csinc do the add, if it's only going to be adding 0/1. Maybe that's possible with an extra isel pattern from add(cset(..))?
Thanks for the suggestion Dave. Just did the pen and paper exercise and agree that:
cmp w8, #3
cset w8, hi
cmp w9, #33
csinc w0, w8, w8, <=
is even better. :-) Also agreed that a match pattern would be nice, so looking into that now.
This now generates cinc, which is even better.
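As a sanity check on why the conditional increment can absorb the add, here is a small C++ model of the instruction semantics. It is a sketch under simplifying assumptions: a plain `bool` stands in for a condition code evaluated against the flags, and the helper names (`add_of_two_csets`, `cset_then_cinc`) are hypothetical, not LLVM APIs.

```cpp
#include <cstdint>

// Simplified models of the AArch64 conditional instructions.
// `cond` stands in for a condition code checked against NZCV.
constexpr std::uint32_t cset(bool cond) { return cond ? 1u : 0u; }
constexpr std::uint32_t csinc(bool cond, std::uint32_t n, std::uint32_t m) {
    return cond ? n : m + 1u;  // select n, otherwise m incremented
}
// cinc is an alias of csinc with both sources equal and the condition inverted.
constexpr std::uint32_t cinc(bool cond, std::uint32_t n) {
    return csinc(!cond, n, n);
}

// Naive form: materialise both comparison results and add them.
constexpr std::uint32_t add_of_two_csets(bool c1, bool c2) {
    return cset(c1) + cset(c2);
}
// Folded form: the second comparison drives a conditional increment.
constexpr std::uint32_t cset_then_cinc(bool c1, bool c2) {
    return cinc(c2, cset(c1));
}

// The two forms agree for every combination of conditions.
static_assert(add_of_two_csets(false, false) == cset_then_cinc(false, false), "");
static_assert(add_of_two_csets(false, true) == cset_then_cinc(false, true), "");
static_assert(add_of_two_csets(true, false) == cset_then_cinc(true, false), "");
static_assert(add_of_two_csets(true, true) == cset_then_cinc(true, true), "");
```

The folded form needs one cset and one cinc after the two compares, with no separate add.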
| File | Line | Comment |
|---|---|---|
| llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | 13150 | I had to keep this; otherwise this lowering happens first and the pattern (see below) doesn't match. |
| llvm/lib/Target/AArch64/AArch64InstrInfo.td | 2162 (on Diff #331288) | I had to add a zext to this pattern, because setcc is first legalised from i1 to i32 and then zext'ed to i64. |
Thanks. LGTM
The lowering/optimization of CSEL/CSINC/etc. feels a bit weak to me, so any improvements are good to see. TableGen improvements are especially welcome, as (as far as I understand) they can help in both SelectionDAG and GlobalISel.
Yeah, thanks Dave, I think I will be looking a bit more in this area, but this is a start...