Page MenuHomePhabricator

[ARM][MVE] Add intrinsics for more immediate shifts.

Authored by simon_tatham on Dec 13 2019, 3:08 AM.



This fills in the remaining shift operations that take a single vector
input and an immediate shift count: the vqshl, vqshlu, vrshr and
vshll[bt] families.

vshll[bt] (which shifts each input lane left into a double-width
output lane) is the most interesting one. There are separate MC
instruction ids for shifting by exactly the input lane width and
shifting by less than that, because the instruction encoding is so
completely different for the lane-width special case. So I had to
write two sets of patterns to match based on the immediate shift
count, which involved adding a ComplexPattern matcher to avoid the
general-case pattern accidentally matching the special case too. For
that family I've made sure to add an llc codegen test for both
versions of each instruction.

I'm experimenting with a new strategy for parametrising the isel
patterns for all these instructions: adding extra fields to the
relevant Instruction subclass itself, which are ignored by the
Tablegen backends that generate the MC data, but can be retrieved from
each instance of that instruction subclass when it's passed as a
template parameter to the multiclass that generates its isel patterns.
A nice effect of that is that I can fill in those informational fields
using let blocks, rather than having to type them out once per
instruction at defm time.

(As a result, quite a lot of existing instruction defs are
reindented by this patch, so it's clearer to read with whitespace
changes ignored.)

Diff Detail

Event Timeline

simon_tatham created this revision.Dec 13 2019, 3:08 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 13 2019, 3:08 AM
MarkMurrayARM accepted this revision.Dec 13 2019, 3:34 AM



*OUCH* 6 deep :-)

This revision is now accepted and ready to land.Dec 13 2019, 3:34 AM
simon_tatham marked an inline comment as done.Dec 13 2019, 3:40 AM
simon_tatham added inline comments.

Yes, but each of those loops is over a length-1 list, so it's not actually taking any time!

This is just the kind of thing I'd have preferred to do using an actual 'define local variable' command. But Tablegen doesn't have one. If D74107 is approved then I'll be able to convert all of these into defvar, along with many other uses of this singleton-foreach idiom.

simon_tatham marked an inline comment as done.Dec 13 2019, 3:41 AM
simon_tatham added inline comments.

Ahem. D71407, I mean.

This revision was automatically updated to reflect the committed changes.