Details
Diff Detail
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
18199–18200 | Do you need to be restrictive here? Would the result for larger vectors (e.g. v64i1) still be better than what we get today? You might need to be careful about creating illegal operations. Given the context of this combine, can it be restricted to before type legalisation? | |
18204 | The result type of SETCC doesn't have to be i1 so you either need to extend the result based on getBooleanContents() or restrict the combine to cases where VT==i1. My preference is the former. |
- Remove v8i1/v16i1 restriction.
- Limit combine to pre-legalization.
- Extend SETCC result with getBooleanContents.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
18204 |
Updated to use getBooleanContents, I think this is fixed now? |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
18226 | This can be just VT.isScalar() and made part of the first if statement since there's no point querying the setcc operands until after we know they're scalar. | |
18236 | You've misunderstood my previous getBooleanContents comment but that doesn't matter because I think the transformation is not correct anyway. The input (i.e. setcc (iN (bitcast (vNi1 X))), 0, eq) is saying "return true if and only if all bits in X are false", where as vecreduce_or (setcc X, 0, eq) will return true if any bit in X is false". I think the transformation you want is: setcc (iN (bitcast (vNi1 X))), 0, eq => setcc (iN zext(i1 vecreduce_or (vNi1 X))), 0, eq? which is to say you just need to change LHS to replace the vector->scalar bitcast with a reduction. |
vecreduce_or (setcc X, 0, eq) -> setcc (iN (zext (i1 (vecreduce_or (vNi1 X))))), 0, eq
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
18236 |
Ah, well spotted. Replacing the inner bitcast with reduction was the original idea but I incorrectly figured reducing vector setcc was semantically equivalent and had slightly better codegen. |
One minor precommit change plus a suggested improvement which I think is best done as part of this patch but am accepting anyway in case you prefer to create a separate patch. If you decide to add the improvement to this patch then I'll re-review.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
18249 | Can you use RHS here? It should already be the zero you need. | |
18250 | By making this Cond and updating the entry condition, the combine should just work for the SETNE case as well? If so, I think this is worth doing just in case the IR gets inverted. |
The result type of SETCC doesn't have to be i1 so you either need to extend the result based on getBooleanContents() or restrict the combine to cases where VT==i1. My preference is the former.