This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Add VREV MVE shuffle costs
ClosedPublic

Authored by dmgreen on Mar 8 2021, 12:05 PM.

Details

Summary

This uses the shuffle mask cost from D98206 to give a better cost of MVE VREV instructions. This helps especially in VectorCombine where the cost of shuffles is used to reorder bitcasts, which this helps improve/refix the phase ordering test for fp16 reductions from intrinsics. The isVREVMask has been moved to a header file to allow it to be used across target transform and isel lowering.

Diff Detail

Event Timeline

dmgreen created this revision.Mar 8 2021, 12:05 PM
dmgreen requested review of this revision.Mar 8 2021, 12:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2021, 12:05 PM
RKSimon added inline comments.Mar 9 2021, 12:51 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1223

Why aren't you just using TTI::SK_Reverse?

dmgreen added inline comments.Mar 9 2021, 4:29 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1223

Thanks for taking a look. A VREV is a little different from a SK_Reverse, as far as I understand, although for some of the tests it happens to match both. A vrev mask will reverse lanes in a block, not the whole register. So they have masks like <3, 2, 1, 0, 7, 6, 5, 4> or <1, 0, 3, 2, 5, 4, 7, 6>.

In MVE there are vrev16.8, vrev32.8, vrev32.16, vrev64.8, vrev64.16 and vrev64.32. The way I think of them is that taking a 128 bit legal vector, they reverse the bits in the first number, then reverse back in the second.

dmgreen updated this revision to Diff 329292.Mar 9 2021, 4:30 AM

Update formatting noticed when looking at this.

SjoerdMeijer accepted this revision.Mar 16 2021, 6:42 AM

Looks reasonable to me. Perhaps wait a day in case @RKSimon has more comments.

This revision is now accepted and ready to land.Mar 16 2021, 6:42 AM
RKSimon accepted this revision.Mar 16 2021, 6:56 AM

Just one minor

llvm/lib/Target/ARM/ARMTargetTransformInfo.h
304
for (unsigned i = 0, e = M.size(); i < e; ++i) {
RKSimon added inline comments.Mar 17 2021, 8:29 AM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1224

Add an assert that Mask is the correct width?

dmgreen added inline comments.Mar 17 2021, 12:19 PM
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
1224

I'm not sure if there's a specific width this needs to be. It was designed to work with any width, under the hope they would be legalized to full vector widths and so long as each of the original masks were a vrev, the result would be a vrev (or simpler, possibly with some undef lanes).

It's only really tested with single source, non-shape changing shuffles at the moment though. And it looks like testing more than that doesn't work at the moment because they all get a unknown (-1) cost. I'll add a check that the size is an OK width and see about improving the testing to some more cases.

This revision was landed with ongoing or failed builds.Mar 17 2021, 2:22 PM
This revision was automatically updated to reflect the committed changes.