This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Generate ADDP from shuffled add
Closed · Public

Authored by dmgreen on May 31 2022, 12:54 AM.

Details

Summary

This adds a fold of add(x, shuffle(x, <1,0,3,2,5,4,...>)) into shuffle(addp(x), <0,0,1,1,2,2,...>). The ADDP instruction takes two vectors and returns one, adding adjacent pairs, so we match x in a custom combine as it is lowered from a v8i32. The original code would be 2 rev64 and 2 add; the new code is a single addp followed by a zip1;zip2 shuffle, which is smaller.
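The equivalence behind the fold can be checked with a small scalar model. This is only an illustrative sketch in Python, not the code from the patch: the helper names (shuffle, addp, original_form, folded_form) are hypothetical, vectors are modelled as plain lists, and it assumes the v8i32 input is split into a low and a high half that become the two ADDP operands. Python integers do not wrap at 32 bits, but wrapping would affect both forms identically.

```python
def shuffle(vec, mask):
    """Single-input shuffle: pick vec[i] for each index i in mask."""
    return [vec[i] for i in mask]


def addp(a, b):
    """Model of a pairwise add: concatenate a and b, then sum adjacent pairs."""
    joined = a + b
    return [joined[i] + joined[i + 1] for i in range(0, len(joined), 2)]


def original_form(x):
    """add(x, shuffle(x, <1,0,3,2,5,4,...>)): add each lane to its neighbour."""
    swap_mask = [i ^ 1 for i in range(len(x))]  # 1,0,3,2,5,4,...
    swapped = shuffle(x, swap_mask)
    return [a + b for a, b in zip(x, swapped)]


def folded_form(x):
    """shuffle(addp(lo, hi), <0,0,1,1,2,2,...>): pairwise add, then duplicate."""
    half = len(x) // 2
    pairs = addp(x[:half], x[half:])            # assumed split into two halves
    dup_mask = [i // 2 for i in range(len(x))]  # 0,0,1,1,2,2,...
    return shuffle(pairs, dup_mask)


x = [1, 2, 3, 4, 5, 6, 7, 8]
assert original_form(x) == folded_form(x)
```

In this model the duplicating shuffle covers both output halves, which corresponds to the zip1 and zip2 mentioned in the summary.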

Diff Detail

Event Timeline

dmgreen created this revision. May 31 2022, 12:54 AM
Herald added a project: Restricted Project. May 31 2022, 12:54 AM
dmgreen requested review of this revision. May 31 2022, 12:54 AM

Doesn't have to be part of this patch, but can this be extended to FADDP as well?

Yeah sure that sounds good, but in a separate patch. I was attempting to get sub to work in the same way too, using a mul <0,-1>, but it's much harder for it to be profitable without using demanded elements.

samtebbs accepted this revision. Jun 1 2022, 6:13 AM

LGTM! Thanks

This revision is now accepted and ready to land. Jun 1 2022, 6:13 AM
This revision was landed with ongoing or failed builds. Jun 6 2022, 3:39 AM
This revision was automatically updated to reflect the committed changes.