This is an archive of the discontinued LLVM Phabricator instance.

[VectorCombine] Enable scalarizeBinopOrCmp for scalable vectors
ClosedPublic

Authored by MattDevereau on Nov 23 2022, 12:04 AM.

Details

Summary

[VectorCombine] Enable scalarizeBinopOrCmp for scalable vectors

This reverts a change to exclude scalarizeBinopOrCmp in VectorCombine for
scalable vectors which caused poor scalable Binop codegen.

Diff Detail

Event Timeline

MattDevereau created this revision.Nov 23 2022, 12:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 12:04 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
MattDevereau requested review of this revision.Nov 23 2022, 12:04 AM
spatel accepted this revision.Nov 23 2022, 4:46 AM

LGTM - see inline for a couple of additions.

llvm/lib/Transforms/Vectorize/VectorCombine.cpp
1724

Add a comment here to make the logic difference explicit:

// This transform works with scalable and fixed vectors.

Should there be a TODO comment about allowing more scalable transforms?

llvm/test/Transforms/VectorCombine/AArch64/scalarize-scalable.ll
5

This shows a series of folds, but it would be good to have a couple of minimum tests too. Something like this should give us coverage for integer types and compares?

define <vscale x 4 x i32> @scalarize_scalable_udiv(i32 %x, i32 %y) {
  %splatx = insertelement <vscale x 4 x i32> poison, i32 %x, i64 0
  %splaty = insertelement <vscale x 4 x i32> poison, i32 %y, i64 0
  %r = udiv <vscale x 4 x i32> %splatx, %splaty
  ret <vscale x 4 x i32> %r
}

define <vscale x 4 x i1> @scalarize_scalable_icmp(i32 %x, i32 %y) {
  %splatx = insertelement <vscale x 4 x i32> poison, i32 %x, i64 0
  %splaty = insertelement <vscale x 4 x i32> poison, i32 %y, i64 0
  %r = icmp sgt <vscale x 4 x i32> %splatx, %splaty
  ret <vscale x 4 x i1> %r
}
This revision is now accepted and ready to land.Nov 23 2022, 4:46 AM
MattDevereau added inline comments.Nov 23 2022, 5:00 AM
llvm/lib/Transforms/Vectorize/VectorCombine.cpp
1724

I'll add the comment on push.
Regarding the TODO: I think it's ok to put in to signal that there's more scalable optimization work possible here, but to be explicit on why this patch exists, we currently just want to fix this single regression case we've identified and are not actively seeking more scalable optimizations in this area just yet.

llvm/test/Transforms/VectorCombine/AArch64/scalarize-scalable.ll
5

Sure, I'm perfectly happy to throw in some more tests