This is an archive of the discontinued LLVM Phabricator instance.

[SVE] Fix incorrect lowering of predicate permute builtins.
ClosedPublic

Authored by paulwalker-arm on Jan 18 2023, 4:36 PM.

Details

Summary

When lowering predicate permute builtins we incorrectly assume only
the typically "active" bits for the specified element type play a
role with all other bits zero'd. This is not the case because all
bits are significant, with the element type specifying how they
are grouped:

b8  - permute using a block size of 1 bit
b16 - permute using a block size of 2 bits
b32 - permute using a block size of 4 bits
b64 - permute using a block size of 8 bits

The affected builtins are svrev, svtrn1, svtrn2, svuzp1, svuzp2,
svzip1 and svzip2.

This patch adds new intrinsics to support these operations and
changes the builtin lowering code to emit them. The b8 case remains
unchanged because for that operation the existing intrinsics work
as required and their support for other predicate types has been
maintained as useful if only as a way to test the correctness of
their matching ISD nodes that code generation relies on.

Diff Detail

Event Timeline

paulwalker-arm created this revision.Jan 18 2023, 4:36 PM
Herald added a project: Restricted Project. · View Herald Transcript
paulwalker-arm requested review of this revision.Jan 18 2023, 4:36 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJan 18 2023, 4:36 PM

This is bug fix based on something spotted when reviewing D141469.

dewen added a subscriber: dewen.Jan 19 2023, 7:31 PM
Matt added a subscriber: Matt.Jan 19 2023, 10:09 PM
llvm/lib/Target/AArch64/SVEInstrFormats.td
6540

Out of interest, is there a good reason to handle the nxv16 pattern case differently in the I multiclass args? Written this way at a glance it looks like it is missing.

paulwalker-arm added inline comments.Jan 24 2023, 4:19 PM
llvm/lib/Target/AArch64/SVEInstrFormats.td
6540

My reasoning was the pattern within the instruction class is mandatory for the correct clang builtin support so I figured that should take priority. That means extra patterns are only required for the unpacked cases, which are optional based on them having value during code generation.

peterwaller-arm accepted this revision.Jan 25 2023, 6:30 AM
This revision is now accepted and ready to land.Jan 25 2023, 6:30 AM
This revision was landed with ongoing or failed builds.Jan 26 2023, 4:22 AM
This revision was automatically updated to reflect the committed changes.