This is an archive of the discontinued LLVM Phabricator instance.

[ARM] MVE reverse shuffles.
ClosedPublic

Authored by dmgreen on Oct 28 2019, 6:06 AM.

Details

Summary

The vectorizer can sometimes make reverse shuffles from indices that count down. In MVE, we don't have a 128-bit rev instruction, but we can select this to a VREV64 with some lane moves to swap the two halves.

Ideally this would use VMOVDs, but it only gets as far as VMOVSs at the moment.
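As an illustration of why this selection gives a full reverse, here is a minimal scalar sketch of the lane movement. It is not the LLVM lowering code: `rev64`, `swapHalves` and `reverse128` are hypothetical helper names, and the 128-bit register is modelled as a `std::array`.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Models VREV64: reverse the element order inside each 64-bit half of a
// 128-bit vector holding N elements of type T.
template <typename T, std::size_t N>
std::array<T, N> rev64(std::array<T, N> V) {
  static_assert(sizeof(T) * N == 16, "expects a 128-bit vector");
  std::reverse(V.begin(), V.begin() + N / 2);
  std::reverse(V.begin() + N / 2, V.end());
  return V;
}

// Models the lane moves that exchange the low and high 64-bit halves.
template <typename T, std::size_t N>
std::array<T, N> swapHalves(std::array<T, N> V) {
  std::swap_ranges(V.begin(), V.begin() + N / 2, V.begin() + N / 2);
  return V;
}

// Full 128-bit reverse: per-half reverse followed by a half swap.
template <typename T, std::size_t N>
std::array<T, N> reverse128(const std::array<T, N> &V) {
  return swapHalves(rev64(V));
}
```

For a v4i32 holding lanes {0, 1, 2, 3}, `rev64` gives {1, 0, 3, 2} and the half swap then gives {3, 2, 1, 0}. The same composition works for v8i16 and v16i8.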

Event Timeline

dmgreen created this revision. Oct 28 2019, 6:06 AM
Herald added a project: Restricted Project. Oct 28 2019, 6:06 AM
dmgreen updated this revision to Diff 226693. Oct 28 2019, 10:08 AM

Updated the cost model too.

samparker added inline comments. Oct 30 2019, 5:49 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
7659–7662

I think keeping some kind of assert is a good idea here.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
642

Should we be considering v4i16 and v8i8 too?

dmgreen marked 2 inline comments as done. Oct 30 2019, 11:49 AM
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMISelLowering.cpp
7659–7662

Yeah, OK. Probably best. I had this going with v4i32 too, which is where this was lost. But until double moves are selected better, that doesn't improve things.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
642

Those 2 will be type legalised to a v4i32 and v8i16, and the final cost will use the legal type (multiplied by the type legalisation cost, which looks like it will be 1 from the tests).

Because we don't use vmovd's yet, I guess the cost of a v4i32 should be 4, not 5. I'll update that, but they will hopefully drop down to 3 in the future, like the rest.
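The cost calculation described in this reply can be sketched roughly as below. This is a simplified stand-in, not the code in ARMTargetTransformInfo.cpp: `legalize`, `reverseShuffleCost` and the string-keyed table are hypothetical, the legalisation factor of 1 is the value reported from the tests, and the only table entry used is the v4i32 cost of 4 mentioned above.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for type legalisation: returns the legalisation
// cost factor and the name of the legal type the input is promoted to.
using LegalizedType = std::pair<int, std::string>;

LegalizedType legalize(const std::string &Ty) {
  if (Ty == "v4i16")
    return {1, "v4i32"}; // promoted to v4i32, factor 1 (assumed from the thread)
  if (Ty == "v8i8")
    return {1, "v8i16"}; // promoted to v8i16, factor 1 (assumed from the thread)
  return {1, Ty};        // already legal
}

// Cost = legalisation factor * table cost of the legal type.
// Returns -1 when the table has no entry for the legal type.
int reverseShuffleCost(const std::string &Ty,
                       const std::map<std::string, int> &Table) {
  const LegalizedType LT = legalize(Ty);
  const auto It = Table.find(LT.second);
  if (It == Table.end())
    return -1;
  return LT.first * It->second;
}
```

With a table containing only `{"v4i32", 4}`, a v4i16 reverse is costed as 1 * 4 = 4, the same as a v4i32 reverse, which is why the narrower types need no entries of their own.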

dmgreen updated this revision to Diff 227153. Oct 30 2019, 12:21 PM
samparker accepted this revision. Nov 13 2019, 3:01 AM

Ok, LGTM

This revision is now accepted and ready to land. Nov 13 2019, 3:01 AM
This revision was automatically updated to reflect the committed changes.