This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Add patterns for VSELECT of immediates.
ClosedPublic

Authored by efriedma on May 7 2020, 2:08 PM.

Details

Summary

This covers forms involving "CPY (immediate, zeroing)".

This doesn't handle the case where the operands are reversed, and the condition is freely invertible. Not sure how to handle that. Maybe a DAGCombine.

Diff Detail

Event Timeline

efriedma created this revision.May 7 2020, 2:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 7 2020, 2:08 PM

Hi @efriedma ,

thanks you for working on this.

This doesn't handle the case where the operands are reversed, and the condition is freely invertible.

Do you mean cases like select <vscale x 2 x i1> %p, <vscale x 2 x i64> zeroinitializer, <vscale x 2 x i64> %three? If that the case, the "merging" version of CPY , where %three is allocated to the dest register, should do it? CPY <Zd>.<T>, <Pg>/M, <R><n|SP>.

Kind regards,

Francesco

llvm/test/CodeGen/AArch64/sve-vselect-imm.ll
10

The immediate is a signed immediate - could you please test negative values too?

Nit: the name %three doesn't seem to be related to anything other than the value 3 of some of the tests... could you change it to something more generic, like %vec, or even leave it as %0?

75–112

Can you do the same "invalid values" tests for when the scaling shift is 0? For example -130 and 130.

Nit: adding the token _invalid_imm_values in the names of the functions would help a lot identifying what you are testing here.

Do you mean cases like select <vscale x 2 x i1> %p, <vscale x 2 x i64> zeroinitializer, <vscale x 2 x i64> %three? If that the case, the "merging" version of CPY , where %three is allocated to the dest register, should do it? CPY <Zd>.<T>, <Pg>/M, <R><n|SP>.

I was more specifically thinking of the case where %p is something like an icmp, where we can potentially rewrite the operation to produce an inverted result for free. Otherwise, yes, we can use merging CPY.

efriedma updated this revision to Diff 262955.May 8 2020, 2:07 PM

Update test

the case where %p is something like an icmp, where we can potentially rewrite the operation to produce an inverted result for free.

I am not sure I understand. Can you give an explicit example?

Regards,

Francesco

llvm/test/CodeGen/AArch64/sve-vselect-imm.ll
114

I am probably missing something trivial, but why would this be the case only if the odd bits if the predicate are zero?

Moreover, why not add the patterns needed for this directly in this patch? IIUC a higher AddedComplexity is what you need to prioritize the current pattern over the sel one.

efriedma marked an inline comment as done.May 8 2020, 3:23 PM

Say you have something like x < 0 ? 0 : 1. That doesn't match. But if you rewrite it to x >= 0 ? 1 : 0, it does.

llvm/test/CodeGen/AArch64/sve-vselect-imm.ll
114

Using ACLE intrinsics, the suggestion is essentially doing something like svreinterpret_s16(svsel(pred, svdup_s8(-128), svdup_s8(0))). Note that this is using the predicate as a <vscale x 16 x i1>, not a <vscale x 8 x i1>.

The pattern required for that is probably non-trivial

sdesmalen accepted this revision.May 11 2020, 3:03 PM

LGTM

This doesn't handle the case where the operands are reversed, and the condition is freely invertible. Not sure how to handle that. Maybe a DAGCombine.

A combine is probably easiest as it avoids having to specify the sequence to negate the predicate.

llvm/lib/Target/AArch64/SVEInstrFormats.td
179

Can I assume this was already handled by the PredicateMethod of SVECpyImmOperand8?

This revision is now accepted and ready to land.May 11 2020, 3:03 PM
efriedma marked an inline comment as done.May 11 2020, 3:20 PM
efriedma added inline comments.
llvm/lib/Target/AArch64/SVEInstrFormats.td
179

ImmLeaf is only used for SelectionDAG lowering. We can't do SelectionDAG lowering for shifted immediates using ImmLeaf because it needs to produce two MachineOperands. Therefore, an ImmLeaf predicate is completely useless here.

For asm parsing, the necessary predicate is handled by the PredicateMethod, yes.

This revision was automatically updated to reflect the committed changes.