This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Narrow binops feeding into G_AND with a mask
Closed · Public

Authored by paquette on Aug 11 2021, 1:33 PM.

Details

Summary

This is a fairly common pattern:

%mask = G_CONSTANT iN <mask val>
%add = G_ADD %lhs, %rhs
%and = G_AND %add, %mask

We have combines to eliminate G_AND with a mask that does nothing.

If we combined the above to this:

%mask = G_CONSTANT iN <mask val>
%narrow_lhs = G_TRUNC %lhs
%narrow_rhs = G_TRUNC %rhs
%narrow_add = G_ADD %narrow_lhs, %narrow_rhs
%ext = G_ZEXT %narrow_add
%and = G_AND %ext, %mask

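We'd be able to take advantage of those combines through the trunc + zext: the G_ZEXT makes the high bits known zero, so a mask that only covers the narrow width does nothing and the G_AND can be eliminated.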

For this to work (or be beneficial in the best case), the following must hold:

  • The operation we want to narrow then widen must only be used by the G_AND
  • The G_TRUNC + G_ZEXT must be free
  • Performing the operation at a narrower width must not produce a different value than performing it at the original width *after masking.*
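
The third condition can be checked with plain integer arithmetic. The following is a minimal sketch, not code from the patch: it assumes a 32-bit original width and an 8-bit narrow width, and the function names are hypothetical. It models the wide sequence (G_ADD, G_AND) and the narrowed sequence (G_TRUNC, G_ADD, G_ZEXT, G_AND) and compares them for a given mask.

```cpp
#include <cassert>
#include <cstdint>

// Original sequence: G_ADD at 32 bits, then G_AND with the mask.
inline uint32_t wideAddThenMask(uint32_t lhs, uint32_t rhs, uint32_t mask) {
  return (lhs + rhs) & mask;
}

// Narrowed sequence, assuming an 8-bit narrow type for illustration.
inline uint32_t narrowAddThenMask(uint32_t lhs, uint32_t rhs, uint32_t mask) {
  uint8_t narrowLhs = static_cast<uint8_t>(lhs);  // G_TRUNC
  uint8_t narrowRhs = static_cast<uint8_t>(rhs);  // G_TRUNC
  // G_ADD at 8 bits: the sum wraps modulo 256.
  uint8_t narrowAdd = static_cast<uint8_t>(narrowLhs + narrowRhs);
  uint32_t ext = narrowAdd;                       // G_ZEXT: high bits are zero
  return ext & mask;                              // G_AND
}

// Returns true if both sequences agree for every operand pair in a small
// sample range. This is a spot check, not a proof.
inline bool agreeOnRange(uint32_t mask) {
  for (uint32_t lhs = 0; lhs < 1024; ++lhs)
    for (uint32_t rhs = 0; rhs < 1024; ++rhs)
      if (wideAddThenMask(lhs, rhs, mask) != narrowAddThenMask(lhs, rhs, mask))
        return false;
  return true;
}
```

With a mask of 0xFF, which fits entirely in the narrow width, both sequences agree and the final G_AND no longer changes the value. With a mask of 0x1FF they disagree (for example lhs = 0xFF, rhs = 1), because the carry into bit 8 is lost when adding at 8 bits. Narrowing is therefore only valid when the mask does not reach above the narrow width.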

Example comparison between SDAG + GISel: https://godbolt.org/z/63jzb1Yvj

At -Os for AArch64, this is a 0.2% code size improvement on CTMark/pairlocalign.

Event Timeline

paquette created this revision. Aug 11 2021, 1:33 PM
paquette requested review of this revision. Aug 11 2021, 1:33 PM
Herald added a project: Restricted Project. Aug 11 2021, 1:33 PM
Herald added a subscriber: wdng.
aemerson added inline comments. Aug 11 2021, 1:59 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
4382

getDefIgnoringCopies()?

4407–4408

I don't think we should be making assumptions about combine ordering. Targets may choose to define their own combine configs by picking and choosing.

paquette updated this revision to Diff 365866. Aug 11 2021, 3:39 PM

Address comments.

aemerson accepted this revision. Aug 11 2021, 4:06 PM

LGTM.

This revision is now accepted and ready to land. Aug 11 2021, 4:06 PM
arsenm added inline comments. Aug 11 2021, 4:09 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
4417–4421

If we're going to rely on these TLI hooks, I would rather add LLT overloads and let the default implementation use the approximate function.

paquette updated this revision to Diff 365876. Aug 11 2021, 4:36 PM

Add LLT variants of isTruncateFree and isZExtFree which use the approximate EVT function by default.
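
The shape of that change can be modelled in isolation. The sketch below is not LLVM's code: EVT and LLT are reduced to hypothetical bit-width structs, and TargetHooks, approximateEVT and EVTOnlyTarget are illustrative names. It only shows the delegation pattern, in which the LLT overload defaults to converting its arguments and calling the EVT hook, so a target that overrides only the EVT version still gives a consistent answer for LLT queries.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real type classes; only a bit width is kept.
struct EVT { unsigned Bits; };
struct LLT { unsigned Bits; };

struct TargetHooks {
  virtual ~TargetHooks() = default;

  // Existing EVT-based hook: conservative default.
  virtual bool isTruncateFree(EVT /*From*/, EVT /*To*/) const { return false; }

  // New LLT overload: by default, approximate each LLT as an EVT and reuse
  // the EVT hook. Targets may still override this directly.
  virtual bool isTruncateFree(LLT From, LLT To) const {
    return isTruncateFree(approximateEVT(From), approximateEVT(To));
  }

  // Illustrative conversion; here it simply copies the bit width.
  static EVT approximateEVT(LLT Ty) { return EVT{Ty.Bits}; }
};

// A target that only customises the EVT hook.
struct EVTOnlyTarget : TargetHooks {
  using TargetHooks::isTruncateFree;  // keep the LLT overload visible
  bool isTruncateFree(EVT From, EVT To) const override {
    return From.Bits == 64 && To.Bits == 32;
  }
};
```

Querying an EVTOnlyTarget with LLT{64} and LLT{32} goes through the default LLT overload and reaches the target's EVT override, so the combine can ask in terms of LLT without every target having to implement the new overload.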