This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][GlobalISel] Change G_ANYEXT fed by scalar G_ICMP to G_ZEXT
ClosedPublic

Authored by paquette on Oct 1 2021, 12:18 PM.

Details

Summary

This is a common pattern:

%icmp:_(s32) = G_ICMP intpred(eq), ...
%ext:_(s64) = G_ANYEXT %icmp(s32)
%and:_(s64) = G_AND %ext, 1

Here's an example: https://godbolt.org/z/T13f6o8zE

This pattern appears because of the following combine in the
LegalizationArtifactCombiner:

// zext(trunc x) - > and (aext/copy/trunc x), mask

Which kicks in when we widen the result of G_ICMP from 1 bit to 32 bits.

We know that, on AArch64, a scalar G_ICMP will produce 0 or 1. So the result of %ext will always be 0 or 1 as well.

We have some KnownBits combines which eliminate redundant G_ANDs with masks. These combines don't kick in with G_ANYEXT.

So, if we replace the G_ANYEXT with G_ZEXT in this situation, the KnownBits-based combines can remove the redundant G_AND.

I wasn't sure if it woud be more appropriate to

  • Take this route
  • Put this in the LegalizationArtifactCombiner.
  • Allow 64 bit G_ICMP destinations

I decided on this route because

  1. It's simple
  1. I'm not sure if philosophically-speaking, we should be handling non-artifact instructions + target-specific details like TargetBooleanContents in the LegalizationArtifactCombiner
  1. There is a lot of existing code which assumes we only have 32 bit G_ICMP destinations. So, adding support for 64-bit destinations seems rather invasive right now. I think that adding support for 64-bit destinations, or modelling G_ICMP as ADDS/SUBS/etc is probably cleaner long term though.

This gives minor code size savings on all CTMark benchmarks.

Diff Detail

Event Timeline

paquette created this revision.Oct 1 2021, 12:18 PM
paquette requested review of this revision.Oct 1 2021, 12:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 1 2021, 12:18 PM
jroelofs added inline comments.Oct 1 2021, 1:02 PM
llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerCombiner.cpp
271

G_FCMP too?

paquette updated this revision to Diff 376624.Oct 1 2021, 1:40 PM

Add G_FCMP. This gives minor added improvements on CTMark/sqlite3 and CTMark/bullet.

Here's a similar IR example to the G_ICMP one for G_FCMP: https://godbolt.org/z/c4hW53Wrs

jroelofs accepted this revision.Oct 1 2021, 1:49 PM

LGTM

This revision is now accepted and ready to land.Oct 1 2021, 1:49 PM