This is an archive of the discontinued LLVM Phabricator instance.

[DAG] Fold concat_vectors(concat_vectors(x,y),concat_vectors(a,b)) -> concat_vectors(x,y,a,b)
ClosedPublic

Authored by RKSimon on Aug 5 2021, 1:12 PM.

Details

Summary

Followup to D107068, attempt to fold nested concat_vectors/undefs, as long as both the vector and inner subvector types are legal.

This exposed the same issue in ARM's MVE LowerCONCAT_VECTORS_i1 (raised as PR51365) and AArch64's performConcatVectorsCombine which both assumed concat_vectors only took 2 subvector operands.

Diff Detail

Event Timeline

RKSimon created this revision.Aug 5 2021, 1:12 PM
RKSimon requested review of this revision.Aug 5 2021, 1:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 5 2021, 1:12 PM
Herald added a subscriber: MaskRay. · View Herald Transcript
Matt added a subscriber: Matt.Aug 5 2021, 4:51 PM

Thanks. Both Arm parts look OK to me (the MVE code for concating predicate registers just needs to work, it does not need to be very efficient :) )

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19899

Can this use ConcatOps.append?

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
13587

Can you add a N->getNumOperands() == 2 check here, to be safe.

Seems good to me.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19899

You might actually want append_range from STLExtras here

RKSimon added inline comments.Aug 6 2021, 1:06 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
13587

OK - by the looks of it none of the folds in this function are safe for N->getNumOperands() != 2 - but I'll leave the tests on the individual fold so this can be more easily improved in the future.

The RISC-V changes look good to me.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19869

What would it take to add scalable vector support? I don't think I see anything here that relies on fixed-vector knowledge. If it's lack of testing we may be able to rustle up some test cases with extra wide scalable-vectors which are split during legalization?

RKSimon added inline comments.Aug 6 2021, 4:42 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19869

I was getting crashes in the legalizer which looked tricky to deal with - but turns out it was just aarch64 sve concat_vectors lowering not handling numops != 2 - trying a fix now.

RKSimon updated this revision to Diff 364754.Aug 6 2021, 5:01 AM

Enable scalable vectors support with a fix for AArch64 sve concat_vectors with numops != 2

frasercrmck added inline comments.Aug 9 2021, 2:46 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19869

Thanks, seems to have improved one of our tests at least!

ping - any more comments?

lebedev.ri added inline comments.Aug 16 2021, 4:25 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19863–19864

Does concat_vectors pad with undef?
Or should this be

//  --> concat_vectors(x,y,z,w,u,u,a,b,c,d)
RKSimon added inline comments.Aug 16 2021, 5:48 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19863–19864

All subvector operands of concat_vectors are the same type, so undefs are 'split' to the new type - look at "SubVT" in the code below.

dmgreen accepted this revision.Aug 16 2021, 6:01 AM

All the Arm code (and the DAG combine, as far as I can tell) look good to me.

This revision is now accepted and ready to land.Aug 16 2021, 6:01 AM
lebedev.ri accepted this revision.Aug 16 2021, 6:04 AM

LGTM

I think we may also want some variation of concat(bitcast(vty0 to vty1), bitcast(vty0 to vty1))?
combineConcatVectorOfScalars() does that, but only for non-vector source type.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19863–19864

Err, right, i read that wrong, sorry.

This revision was landed with ongoing or failed builds.Aug 16 2021, 8:07 AM
This revision was automatically updated to reflect the committed changes.