Attempt to lower a shuffle as a permute instruction(rev/revb/revh/revw) for fixed length SVE.
Sorry @wwei, I was hoping to review both patches today but that hasn't really worked out. I'll take a proper look tomorrow. One observation: The patch could do with some floating point tests to verify the BITCAST logic for those types.
Calling LowerToPredicatedOp feels like overkill here compared to DAG.getNode(RevOp, DL, NewVT, ..., DAG.getUNDEF(NewVT)
Since there's no unpredicated revb/revh/revw for SVE, LowerToPredicatedOp can help to construct a predicate operand, and can handle merge passthru opcode also. If using DAG.getNode(RevOp, DL, NewVT, ..., DAG.getUNDEF(NewVT), we need some extra code to get a predicate operand and pass correct merge passthru operands.
That's a good point, fair enough.
Is it valid to use class Instruction methods this far into code generation? That said, ShuffleVectorInst::isReverseMask looks potentially unsafe in this context anyway because given a 4 element mask it'll return true for both 3, 2, 1, 0 and 7, 6, 5, 4 but as mentioned below ISD::VECTOR_REVERSE only supports a single operand and so you need to know specifically which of these scenarios apply.
Do you care about the 7, 6, 5, 4 case? I can see test_revv8i32v8i32 is a test for this but I think that is being simplified to a 3, 2, 1, 0 based shuffle before you get here.
I'll note that ARMISelLowering.cpp has the isReverseMask helper function that could be useful.
This looks wrong because ISD::VECTOR_REVERSE only takes a single operand? I presume Op2 it just been ignored and there's nothing in getNode to ensure only a single operand.
I originally wrote a comment asking why the i32 case emitted worse code than the f32 one but then spotted the attribute difference, so this seems more like a negative test. It's worth adding a function comment to make this clear to the user, which also goes for the other "negative" tests. Perhaps it's also worth adding _vl256 to functions where vscale_range is set just to drive the point home?
Is there a need for fadd here? I'm kind of assuming that without it the float shuffles are being converted to integer ones and thus you lost test coverage? but then I can see test_revv8f32 doesn't need it so figured I'd ask.
Maybe we use ShuffleVectorInst::isReverseMask here will be better, because it will call isSingleSourceMask to ensure the shuffle mask chooses elements from exactly one source vector(isReverseMask in ARMISelLowering.cpp will not check this).
yeah, you're right. I removed Op2 since it will always be Undef
I rearranged the order of the test cases, sorted according to attributes, and added some comments for the rev instructions.
fadd is not needed, I removed it
Functionally this looks good but I've added a few stylistic comments below to consider before committing. I found the tests harder to follow than necessary as it wasn't always clear what the intent of a specific test was, especially those when rev/b/h/w instructions are not emitted. More comments would help and I also think having two separate test files, one for rev (permute-rev) and another for revb/revh/revw (permute-rev-elts) would be advantageous.
In this instance I don't believe this is required because isREVMask does what you need. It's only the uses of ShuffleVectorInst::isReverseMask that need extra protection, which I see you've covered below.
Personally I think the following looks better:
Op = DAG.getNode(ISD::BITCAST, DL, NewVT, Op1); Op = LowerToPredicatedOp(Op, DAG, RevOp); Op = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Op);
but then I'm also not a fan of how the function's input parameter Op is overwritten.
This test either looks misplaced (belongs with the other rev tests towards the bottom of this file?) or is perhaps redundant?
Can this be moved to the bottom with the other attributes?
This test looks like it belongs in the top half after test_revhv32i16? (i.e. so it part of the other element shuffle tests as apposed to this blocks of tests which is testing whole vector reversals).