This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Basic demand elements for some intrinsics
Closed · Public

Authored by dmgreen on Jan 12 2022, 1:50 AM.

Details

Summary

A lot of NEON intrinsics work lane-wise, meaning that elements not demanded in the output are not demanded in the input either. This teaches that to AArch64TTIImpl::simplifyDemandedVectorEltsIntrinsic for some simple single-input truncate intrinsics, which can help remove unnecessary instructions in the final result.
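To illustrate why lane-wise behaviour makes this safe, here is a small standalone sketch. It is a toy model and not the LLVM implementation: `saturatingNarrow`, `agreeOnDemandedLanes`, the 4-lane width and the bitmask encoding are all assumptions made up for this example. It shows that when each output lane depends only on the matching input lane, changing non-demanded input lanes cannot change any demanded output lane.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy stand-in for a lane-wise, single-input narrowing operation (hypothetical,
// not a real NEON intrinsic): each 16-bit lane is saturated to 8 bits
// independently of every other lane.
inline std::array<std::uint8_t, 4>
saturatingNarrow(const std::array<std::uint16_t, 4> &in) {
  std::array<std::uint8_t, 4> out{};
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] = in[i] > 0xFF ? 0xFF : static_cast<std::uint8_t>(in[i]);
  return out;
}

// Bit i of demandedLanes set means lane i of the result is actually used.
// Returns true when a and b hold the same value in every demanded lane.
inline bool agreeOnDemandedLanes(const std::array<std::uint8_t, 4> &a,
                                 const std::array<std::uint8_t, 4> &b,
                                 std::uint32_t demandedLanes) {
  for (std::size_t i = 0; i < a.size(); ++i)
    if (((demandedLanes >> i) & 1u) && a[i] != b[i])
      return false;
  return true;
}
```

If only lanes 0 and 1 of the result are demanded (mask `0x3`), two inputs that differ only in lanes 2 and 3 give results that agree on the demanded lanes, so an optimizer is free to simplify or drop whatever computes those unused input lanes.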

Event Timeline

dmgreen created this revision. Jan 12 2022, 1:50 AM
dmgreen requested review of this revision. Jan 12 2022, 1:50 AM
Herald added a project: Restricted Project. Jan 12 2022, 1:50 AM

The change looks sensible to me, although I was a little confused by the commit message. It looks like what your patch is doing is just adding an AArch64 version of simplifyDemandedVectorEltsIntrinsic, which allows us to potentially simplify the input operand to the intrinsic based on the demanded elements. So, for example, if we know that we are only going to use the first N elements of the intrinsic result, we can use that information to simplify the intrinsic operand too.

dmgreen edited the summary of this revision. Jan 13 2022, 12:35 AM

> It looks like what your patch is doing is just adding an AArch64 version of simplifyDemandedVectorEltsIntrinsic, which allows us to potentially simplify the input operand to the intrinsic based on the demanded elements. So, for example, if we know that we are only going to use the first N elements of the intrinsic result, we can use that information to simplify the intrinsic operand too.

Yep. I had written "single element" where I meant "single source". I can see that being confusing, but you have the right idea.

We could do it for binops too, but I've not looked at those here, just truncates with a single input.
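As a rough sketch of what the binop case would look like in the same spirit (again a toy model and not LLVM code; `LaneMask`, `BinopDemand` and `demandedBinopInputLanes` are hypothetical names invented here): for a lane-wise operation with two inputs, output lane i reads lane i of both operands, so the demanded mask would simply be forwarded to each operand.

```cpp
#include <cstdint>

// Toy encoding (an assumption for this sketch): bit i set means lane i is demanded.
using LaneMask = std::uint32_t;

// Demanded lanes for the two operands of a lane-wise binary operation.
struct BinopDemand {
  LaneMask lhs;
  LaneMask rhs;
};

// Output lane i depends only on lhs lane i and rhs lane i, so each operand
// needs exactly the lanes that are demanded of the result.
constexpr BinopDemand demandedBinopInputLanes(LaneMask demandedOutputLanes) {
  return BinopDemand{demandedOutputLanes, demandedOutputLanes};
}
```

This covers only purely lane-wise operations; anything that mixes lanes (pairwise or across-vector operations, for example) would need a different mapping from output lanes to input lanes.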

This revision is now accepted and ready to land. Jan 13 2022, 12:46 AM
This revision was landed with ongoing or failed builds. Jan 13 2022, 3:53 AM
This revision was automatically updated to reflect the committed changes.