This is an archive of the discontinued LLVM Phabricator instance.

[ARM,MVE] Add the vmovnbq,vmovntq intrinsic family.
Closed, Public

Authored by simon_tatham on Feb 10 2020, 8:54 AM.

Details

Summary

These are in some sense the inverse of vmovl[bt]q: they take a vector
of n wide elements and truncate each to half its width. So they only
write half a vector's worth of output data, and therefore they also
take an 'inactive' parameter to provide the other half of the data in
the output vector. So vmovnb overwrites the even lanes of 'inactive'
with the narrowed values from the main input, and vmovnt overwrites
the odd lanes.
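
The lane behaviour described above can be sketched as a scalar reference model in plain C. This is only an illustration, not code from the patch: the function names are hypothetical, and it assumes the unsigned 16-bit to 8-bit case (eight wide lanes narrowed into a sixteen-lane vector).

```c
#include <assert.h>
#include <stdint.h>

enum { WIDE_LANES = 8, NARROW_LANES = 16 };

/* Hypothetical scalar model of vmovnb: each wide lane is truncated to its
   low byte and written to an even lane; odd lanes come from 'inactive'. */
static void model_vmovnb_u16(uint8_t out[NARROW_LANES],
                             const uint8_t inactive[NARROW_LANES],
                             const uint16_t wide[WIDE_LANES])
{
    for (int i = 0; i < WIDE_LANES; i++) {
        out[2 * i]     = (uint8_t)(wide[i] & 0xFF);
        out[2 * i + 1] = inactive[2 * i + 1];
    }
}

/* Hypothetical scalar model of vmovnt: the truncated values go to the odd
   lanes instead, and even lanes come from 'inactive'. */
static void model_vmovnt_u16(uint8_t out[NARROW_LANES],
                             const uint8_t inactive[NARROW_LANES],
                             const uint16_t wide[WIDE_LANES])
{
    for (int i = 0; i < WIDE_LANES; i++) {
        out[2 * i]     = inactive[2 * i];
        out[2 * i + 1] = (uint8_t)(wide[i] & 0xFF);
    }
}

/* Small self-check on made-up data: inactive lanes are 0xA0..0xAF and
   wide lane i is 0x1100 + i, so its truncated value is i. */
static void demo_vmovn_model(void)
{
    uint8_t inactive[NARROW_LANES], out[NARROW_LANES];
    uint16_t wide[WIDE_LANES];

    for (int i = 0; i < NARROW_LANES; i++)
        inactive[i] = (uint8_t)(0xA0 + i);
    for (int i = 0; i < WIDE_LANES; i++)
        wide[i] = (uint16_t)(0x1100 + i);

    model_vmovnb_u16(out, inactive, wide);
    assert(out[0] == 0x00 && out[1] == 0xA1);

    model_vmovnt_u16(out, inactive, wide);
    assert(out[0] == 0xA0 && out[1] == 0x00);
}
```

Calling demo_vmovn_model() exercises both variants on the same inputs, which makes the even/odd split visible.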

LLVM had existing codegen which generates these MVE instructions in
response to IR that takes two vectors of wide elements, or two vectors
of narrow ones. But in this case, we have one vector of each. So my
clang codegen strategy is to narrow the input vector of wide elements
by simply reinterpreting it as the output type, and then we have two
narrow vectors and can represent the operation as a vector shuffle
that interleaves lanes from both of them.
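
A scalar sketch of that strategy, again in plain C, may help make it concrete. The helper names and shuffle indices here are assumptions chosen for illustration and are not taken from the patch. The reinterpretation step models little-endian lane order only, where the low byte of wide lane i lands in narrow lane 2i.

```c
#include <assert.h>
#include <stdint.h>

enum { WIDE = 8, NARROW = 16 };

/* View 8 x u16 as 16 x u8, assuming little-endian lane order:
   narrow lane 2i is the low byte of wide lane i, lane 2i+1 the high byte. */
static void reinterpret_u16_as_u8(uint8_t out[NARROW], const uint16_t wide[WIDE])
{
    for (int i = 0; i < WIDE; i++) {
        out[2 * i]     = (uint8_t)(wide[i] & 0xFF);
        out[2 * i + 1] = (uint8_t)(wide[i] >> 8);
    }
}

/* Two-input shuffle in the style of an IR shufflevector: index k < NARROW
   picks lane k of 'first', otherwise lane k - NARROW of 'second'.
   'out' must not alias either input. */
static void shuffle2(uint8_t out[NARROW], const uint8_t first[NARROW],
                     const uint8_t second[NARROW], const int idx[NARROW])
{
    for (int lane = 0; lane < NARROW; lane++) {
        assert(idx[lane] >= 0 && idx[lane] < 2 * NARROW);
        out[lane] = idx[lane] < NARROW ? first[idx[lane]]
                                       : second[idx[lane] - NARROW];
    }
}

/* vmovnb as a shuffle: even result lanes take the low bytes of the
   reinterpreted wide vector, odd result lanes keep 'inactive'. */
static void vmovnb_via_shuffle(uint8_t out[NARROW], const uint8_t inactive[NARROW],
                               const uint16_t wide[WIDE])
{
    uint8_t narrow_view[NARROW];
    int idx[NARROW];

    reinterpret_u16_as_u8(narrow_view, wide);
    for (int i = 0; i < WIDE; i++) {
        idx[2 * i]     = NARROW + 2 * i;  /* low byte of wide lane i */
        idx[2 * i + 1] = 2 * i + 1;       /* odd lane of 'inactive'  */
    }
    shuffle2(out, inactive, narrow_view, idx);
}

/* vmovnt as a shuffle: odd result lanes take the low bytes of the
   reinterpreted wide vector, even result lanes keep 'inactive'. */
static void vmovnt_via_shuffle(uint8_t out[NARROW], const uint8_t inactive[NARROW],
                               const uint16_t wide[WIDE])
{
    uint8_t narrow_view[NARROW];
    int idx[NARROW];

    reinterpret_u16_as_u8(narrow_view, wide);
    for (int i = 0; i < WIDE; i++) {
        idx[2 * i]     = 2 * i;           /* even lane of 'inactive' */
        idx[2 * i + 1] = NARROW + 2 * i;  /* low byte of wide lane i */
    }
    shuffle2(out, inactive, narrow_view, idx);
}
```

In both cases only the even lanes of the reinterpreted vector are ever selected, so the high bytes of the wide elements are discarded, which is the truncation.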

Even so, not all the cases I needed ended up being selected as a
single MVE instruction, so I've added a couple more patterns that spot
combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be
generated as a VMOVN instruction with operands swapped.

This commit adds the unpredicated forms only.

Event Timeline

simon_tatham created this revision. Feb 10 2020, 8:54 AM

Rebased on top of fixes to previous patches in the stack.

Cosmetic tweak: don't put the new arm_mve.td section in between the definition and the uses of an unrelated multiclass.

Added tests to make sure the isel rules work in both big- and little-endian.

dmgreen added inline comments. Feb 17 2020, 7:48 AM
clang/test/CodeGen/arm-mve-intrinsics/vmovn.c, line 11

These would be vreinterprets in big-endian?

simon_tatham marked an inline comment as done. Feb 17 2020, 7:54 AM
simon_tatham added inline comments.
clang/test/CodeGen/arm-mve-intrinsics/vmovn.c, line 11

Yes – they're constructed by the vreinterpret record in the clang-side tablegen. Good point; perhaps I should expand the clang test to include a check of the BE output too.

Test on the clang side in both endiannesses.

dmgreen accepted this revision. Feb 17 2020, 8:49 AM

LGTM.

This revision is now accepted and ready to land. Feb 17 2020, 8:49 AM
This revision was automatically updated to reflect the committed changes.