When we have a G_ADD which is fed by a G_ICMP on one side, we can fold it into the cset for the G_ICMP.
e.g. Given
  %cmp = G_ICMP ... %x, %y
  %add = G_ADD %cmp, %z
We would normally emit a cmp, cset, and add.
However, %add is either %z (when the compare is false) or %z + 1 (when it is true). So we can use %z rather than wzr as the source of the cset, which makes the add unnecessary and saves an instruction.
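The equivalence behind the fold can be checked with a small standalone model. This is only a sketch of the arithmetic, not the selector code: it assumes the usual CSINC semantics (Wd = cond ? Wn : Wm + 1) and that cset is the CSINC alias with wzr sources and an inverted condition. The names csinc, unfolded, and folded are hypothetical helpers invented for illustration.

```cpp
#include <cstdint>

// Models AArch64 "CSINC Wd, Wn, Wm, cond": Wd = cond ? Wn : Wm + 1.
constexpr uint32_t csinc(bool cond, uint32_t n, uint32_t m) {
  return cond ? n : m + 1;
}

// Unfolded sequence: cset (CSINC with wzr sources and the inverted
// condition) produces 0 or 1, then a separate add combines it with z.
constexpr uint32_t unfolded(bool cmp, uint32_t z) {
  return csinc(!cmp, 0, 0) + z;
}

// Folded sequence: the same CSINC, but with z in place of wzr.
// Yields z when cmp is false and z + 1 when cmp is true.
constexpr uint32_t folded(bool cmp, uint32_t z) {
  return csinc(!cmp, z, z);
}

// Both forms agree for either compare result, including 32-bit wraparound.
static_assert(unfolded(true, 41) == folded(true, 41), "cmp true");
static_assert(unfolded(false, 41) == folded(false, 41), "cmp false");
static_assert(unfolded(true, UINT32_MAX) == folded(true, UINT32_MAX), "wrap");
```

The model uses 32-bit values only, matching the wzr case described above.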
This would probably be cleaner in AArch64PostLegalizerLowering, but we'd need to change the way we represent G_ICMP to do that, I think. For now, it's easiest to implement in selection.
This is a 0.1% code size improvement on CTMark/pairlocalalign at -Os.
Example: https://godbolt.org/z/7KdrP8
What if the type is s64? emitCSetForICMP seems to only work for s32.