Page MenuHomePhabricator

[ARM] Mve vector shuffles

Authored by dmgreen on Jun 19 2019, 12:11 PM.



This patch tries to add the necessary shuffle vector and buildvector support for ARM MVE. It essentially adds support for VDUP, VREVs and some VMOVs, which are often required by other code (like the upcoming DIV patch).

This mostly uses the same code as for Neon that already generated NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and moved to ARMInstrInfo as they are common to both architectures. Most of the selection code seems to be applicable to both, but NEON does have some more instructions making some parts specific.

Most code originally by David Sherwood. I've put the patterns near the instructions they effect, but not in them.

Diff Detail

Event Timeline

dmgreen created this revision.Jun 19 2019, 12:11 PM
SjoerdMeijer added inline comments.Jun 24 2019, 2:05 AM

a proper nit: perhaps Mve -> MVE for consistency


There's a lot of prior art here of magic constants and shifts etc., but this doesn't mean much to me.
It probably makes sense when you're familiar with querying the shuffle table.
Not sure if it is worth a comment, but from a quick look at code here, that might not be even necessary.


Do we need tests when we don't have HasMVEInt? Or is that not really useful (or done elsewhere already)?

dmgreen added inline comments.Jun 24 2019, 4:34 AM

Yeah, This certainly sounds like something that should work. We may need to make some adjustments first to the calling convention and what is a legal type.

dmgreen updated this revision to Diff 206189.Jun 24 2019, 4:39 AM
dmgreen added a reviewer: olista01.

Rebase and change capitalisation. I have not yet added the +mve run line.

We may need to make some adjustments first to the calling convention and what is a legal type.

Okay, if that turns out to be the case, I guess that's probably best addressed in a separate patch.

I hope that all the stuff related to passing/returning vectors is okay.
The only place I might expect some subtleties is here:

@vdup_f16(half* %src1, half* %src2)

which just passes a half, because I think so far we expect the IR to be prepared slightly differently (for inspiration, see e.g. test/CodeGen/ARM/fp16-instructions.ll, although I expect it will mostly work).

dmgreen updated this revision to Diff 206666.Jun 26 2019, 7:44 AM
dmgreen edited the summary of this revision. (Show Details)

OK. Now that D60708 is in this is a rebase on what that became. Some of this patch has changed a little as a result and in getting +mve to work. It appears that +mve without a fp is currently broken, as it is add registers without making the operations illegal.

The changes to addMVEVectorTypes have also been moved here, because of how much they have changed. I also needed to fix an fp16 bug in one of the buildvector routines and added a couple of extra patterns.

The tests now test +mve,+fullfp16 and +mve.fp, but not yet +mve which can be added later once that works sensibly.

SjoerdMeijer accepted this revision.Jun 26 2019, 7:49 AM

Ok, sounds reasonable, one step at a time.

This revision is now accepted and ready to land.Jun 26 2019, 7:49 AM

Oh, and the patterns have been moved to use HasMVEInt, even if they are float operations, as we still have the instructions required for them.

miyuki added a subscriber: miyuki.Jun 27 2019, 5:15 AM
This revision was automatically updated to reflect the committed changes.