Page MenuHomePhabricator

[AArch64][SVE] Lower shuffles to permute instructions: rev/revb/revh/revw

Authored by wwei on Dec 2 2021, 7:15 AM.



Attempt to lower a shuffle as a permute instruction(rev/revb/revh/revw) for fixed length SVE.

Diff Detail

Event Timeline

wwei created this revision.Dec 2 2021, 7:15 AM
wwei requested review of this revision.Dec 2 2021, 7:15 AM
Matt added a subscriber: Matt.Dec 2 2021, 9:34 AM
wwei added a comment.Dec 7 2021, 12:30 AM

@paulwalker-arm, could you help to review this patch and D113376?

@paulwalker-arm, could you help to review this patch and D113376?

Sorry @wwei, I was hoping to review both patches today but that hasn't really worked out. I'll take a proper look tomorrow. One observation: The patch could do with some floating point tests to verify the BITCAST logic for those types.


Calling LowerToPredicatedOp feels like overkill here compared to DAG.getNode(RevOp, DL, NewVT, ..., DAG.getUNDEF(NewVT)

wwei updated this revision to Diff 392701.Dec 8 2021, 3:27 AM

Add some float test cases

wwei added inline comments.Dec 8 2021, 4:05 AM

Since there's no unpredicated revb/revh/revw for SVE, LowerToPredicatedOp can help to construct a predicate operand, and can handle merge passthru opcode also. If using DAG.getNode(RevOp, DL, NewVT, ..., DAG.getUNDEF(NewVT), we need some extra code to get a predicate operand and pass correct merge passthru operands.

paulwalker-arm added inline comments.Dec 8 2021, 10:46 AM

That's a good point, fair enough.


Is it valid to use class Instruction methods this far into code generation? That said, ShuffleVectorInst::isReverseMask looks potentially unsafe in this context anyway because given a 4 element mask it'll return true for both 3, 2, 1, 0 and 7, 6, 5, 4 but as mentioned below ISD::VECTOR_REVERSE only supports a single operand and so you need to know specifically which of these scenarios apply.

Do you care about the 7, 6, 5, 4 case? I can see test_revv8i32v8i32 is a test for this but I think that is being simplified to a 3, 2, 1, 0 based shuffle before you get here.

I'll note that ARMISelLowering.cpp has the isReverseMask helper function that could be useful.


This looks wrong because ISD::VECTOR_REVERSE only takes a single operand? I presume Op2 it just been ignored and there's nothing in getNode to ensure only a single operand.


I originally wrote a comment asking why the i32 case emitted worse code than the f32 one but then spotted the attribute difference, so this seems more like a negative test. It's worth adding a function comment to make this clear to the user, which also goes for the other "negative" tests. Perhaps it's also worth adding _vl256 to functions where vscale_range is set just to drive the point home?


Is there a need for fadd here? I'm kind of assuming that without it the float shuffles are being converted to integer ones and thus you lost test coverage? but then I can see test_revv8f32 doesn't need it so figured I'd ask.

wwei updated this revision to Diff 393171.Dec 9 2021, 8:16 AM
wwei added inline comments.Dec 9 2021, 8:38 AM

Maybe we use ShuffleVectorInst::isReverseMask here will be better, because it will call isSingleSourceMask to ensure the shuffle mask chooses elements from exactly one source vector(isReverseMask in ARMISelLowering.cpp will not check this).
I tried various test case forms, but I couldn’t construct this kind(7, 6, 5, 4) of case. It would always be converted to 3, 2, 1, 0 form. To be on the safe side, I still added a check-- Op2.isUndef(), and Op1 will be the only valid operand for VECTOR_REVERSE


yeah, you're right. I removed Op2 since it will always be Undef


I rearranged the order of the test cases, sorted according to attributes, and added some comments for the rev instructions.


fadd is not needed, I removed it

wwei added inline comments.Dec 10 2021, 4:25 AM

And for REVB/REVH/REVW cases, I added Op2.isUndef() check too.

paulwalker-arm accepted this revision.Dec 14 2021, 7:20 AM

Functionally this looks good but I've added a few stylistic comments below to consider before committing. I found the tests harder to follow than necessary as it wasn't always clear what the intent of a specific test was, especially those when rev/b/h/w instructions are not emitted. More comments would help and I also think having two separate test files, one for rev (permute-rev) and another for revb/revh/revw (permute-rev-elts) would be advantageous.


In this instance I don't believe this is required because isREVMask does what you need. It's only the uses of ShuffleVectorInst::isReverseMask that need extra protection, which I see you've covered below.


Personally I think the following looks better:

Op = DAG.getNode(ISD::BITCAST, DL, NewVT, Op1);
Op = LowerToPredicatedOp(Op, DAG, RevOp);
Op = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Op);

but then I'm also not a fan of how the function's input parameter Op is overwritten.


This test either looks misplaced (belongs with the other rev tests towards the bottom of this file?) or is perhaps redundant?


Can this be moved to the bottom with the other attributes?


This test looks like it belongs in the top half after test_revhv32i16? (i.e. so it part of the other element shuffle tests as apposed to this blocks of tests which is testing whole vector reversals).

This revision is now accepted and ready to land.Dec 14 2021, 7:20 AM
wwei added a comment.Dec 15 2021, 6:00 AM

@paulwalker-arm Thanks for your comments, the code has been modified based on your review comments, and the test file has been modified too, adding some necessary comments