This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Combine setcc (iN (bitcast (vNi1 X))) with vecreduce_or
ClosedPublic

Authored by c-rhodes on Jul 20 2022, 5:55 AM.

Diff Detail

Event Timeline

c-rhodes created this revision.Jul 20 2022, 5:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2022, 5:55 AM
c-rhodes requested review of this revision.Jul 20 2022, 5:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2022, 5:55 AM
c-rhodes updated this revision to Diff 446159.Jul 20 2022, 7:58 AM
c-rhodes retitled this revision from [AArch64] Combine setcc (i16 (bitcast (v16i1 X))) with vecreduce_or to [AArch64] Combine setcc (iN (bitcast (vNi1 X))) with vecreduce_or.

Also combine v8i1.

paulwalker-arm added inline comments.Jul 20 2022, 8:04 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
18244–18245

Do you need to be restrictive here? Would the result for larger vectors (e.g. v64i1) still be better than what we get today?

You might need to be careful about creating illegal operations. Given the context of this combine, can it be restricted to before type legalisation?

18249

The result type of SETCC doesn't have to be i1 so you either need to extend the result based on getBooleanContents() or restrict the combine to cases where VT==i1. My preference is the former.

c-rhodes updated this revision to Diff 446777.Jul 22 2022, 3:56 AM
  • Remove v8i1/v16i1 restriction.
  • Limit combine to pre-legalization.
  • Extend SETCC result with getBooleanContents.
c-rhodes marked an inline comment as done.Jul 22 2022, 4:00 AM
c-rhodes added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
18249

The result type of SETCC doesn't have to be i1 so you either need to extend the result based on getBooleanContents() or restrict the combine to cases where VT==i1. My preference is the former.

Updated to use getBooleanContents, I think this is fixed now?

c-rhodes updated this revision to Diff 446848.Jul 22 2022, 8:29 AM

Return i1 types in tests.

paulwalker-arm added inline comments.Jul 24 2022, 5:13 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
18245

This can be just VT.isScalar() and made part of the first if statement since there's no point querying the setcc operands until after we know they're scalar.

18255

You've misunderstood my previous getBooleanContents comment but that doesn't matter because I think the transformation is not correct anyway. The input (i.e. setcc (iN (bitcast (vNi1 X))), 0, eq) is saying "return true if and only if all bits in X are false", where as vecreduce_or (setcc X, 0, eq) will return true if any bit in X is false".

I think the transformation you want is: setcc (iN (bitcast (vNi1 X))), 0, eq => setcc (iN zext(i1 vecreduce_or (vNi1 X))), 0, eq? which is to say you just need to change LHS to replace the vector->scalar bitcast with a reduction.

c-rhodes updated this revision to Diff 447258.Jul 25 2022, 3:05 AM

vecreduce_or (setcc X, 0, eq) -> setcc (iN (zext (i1 (vecreduce_or (vNi1 X))))), 0, eq

c-rhodes marked 2 inline comments as done.Jul 25 2022, 3:09 AM
c-rhodes added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
18255

You've misunderstood my previous getBooleanContents comment but that doesn't matter because I think the transformation is not correct anyway. The input (i.e. setcc (iN (bitcast (vNi1 X))), 0, eq) is saying "return true if and only if all bits in X are false", where as vecreduce_or (setcc X, 0, eq) will return true if any bit in X is false".

I think the transformation you want is: setcc (iN (bitcast (vNi1 X))), 0, eq => setcc (iN zext(i1 vecreduce_or (vNi1 X))), 0, eq? which is to say you just need to change LHS to replace the vector->scalar bitcast with a reduction.

Ah, well spotted. Replacing the inner bitcast with reduction was the original idea but I incorrectly figured reducing vector setcc was semantically equivalent and had slightly better codegen.

paulwalker-arm accepted this revision.Jul 25 2022, 3:24 AM

One minor precommit change plus a suggested improvement which I think is best done as part of this patch but am accepting anyway in case you prefer to create a separate patch. If you decide to add the improvement to this patch then I'll re-review.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
18249

Can you use RHS here? It should already be the zero you need.

18250

By making this Cond and updating the entry condition, the combine should just work for the SETNE case as well? If so, I think this is worth doing just in case the IR gets inverted.

This revision is now accepted and ready to land.Jul 25 2022, 3:24 AM
c-rhodes updated this revision to Diff 447269.Jul 25 2022, 3:43 AM
c-rhodes marked an inline comment as done.

Handle inverted (not-equal) case.

c-rhodes marked 2 inline comments as done.Jul 25 2022, 3:44 AM
paulwalker-arm accepted this revision.Jul 25 2022, 3:57 AM
This revision was landed with ongoing or failed builds.Jul 25 2022, 5:15 AM
This revision was automatically updated to reflect the committed changes.