This is a common pattern:
%icmp:_(s32) = G_ICMP intpred(eq), ... %ext:_(s64) = G_ANYEXT %icmp(s32) %and:_(s64) = G_AND %ext, 1
Here's an example: https://godbolt.org/z/T13f6o8zE
This pattern appears because of the following combine in the
LegalizationArtifactCombiner:
// zext(trunc x) - > and (aext/copy/trunc x), mask
Which kicks in when we widen the result of G_ICMP from 1 bit to 32 bits.
We know that, on AArch64, a scalar G_ICMP will produce 0 or 1. So the result of %ext will always be 0 or 1 as well.
We have some KnownBits combines which eliminate redundant G_ANDs with masks. These combines don't kick in with G_ANYEXT.
So, if we replace the G_ANYEXT with G_ZEXT in this situation, the KnownBits-based combines can remove the redundant G_AND.
I wasn't sure if it woud be more appropriate to
- Take this route
- Put this in the LegalizationArtifactCombiner.
- Allow 64 bit G_ICMP destinations
I decided on this route because
- It's simple
- I'm not sure if philosophically-speaking, we should be handling non-artifact instructions + target-specific details like TargetBooleanContents in the LegalizationArtifactCombiner
- There is a lot of existing code which assumes we only have 32 bit G_ICMP destinations. So, adding support for 64-bit destinations seems rather invasive right now. I think that adding support for 64-bit destinations, or modelling G_ICMP as ADDS/SUBS/etc is probably cleaner long term though.
This gives minor code size savings on all CTMark benchmarks.
G_FCMP too?