This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add patterns for SVE predicated add/sub and mov combine
ClosedPublic

Authored by NicolaLancellotti on Nov 23 2022, 5:58 AM.

Details

Summary

This patch creates predicated add and sub SVE instructions when the second operand of the add or sub LLVM instruction is a zext or a sext instruction.

I have already attempted to do that with two other patches by canonicalising zext and sext to a vselect instruction.

Patches:

This first patch is currently on the main branch but I noted that the tests for sub are wrong (the zext is the first operand instead of the second one, for add it doesn't matter because of the commutative property)

This patch is currently in review. Also, this patch needed more changes than the previous one, and it can be difficult in future to maintain it.

However, there is a simpler way to generate these predicated instructions by using the TableGen patterns proposed in this patch.
If you agree, I propose to revert the first patch, pre-commit the tests, and then commit these patterns and the updated tests.

Diff Detail

Event Timeline

NicolaLancellotti requested review of this revision.Nov 23 2022, 5:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 5:58 AM

I agree this sounds nicer than the sext patch - that one looked like there was much more involved than we were hoping for.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3160

The indenting can be a little lower here.

3175

I think these constants should be in the range 0-255. Only the bottom 8 bits are likely used, so -1 will mean the same thing, but it's best if we can generate instructions more precisely if we can.

Something like this link can be used to check the constants we need for different values: https://godbolt.org/z/9K7ozhYo7

NicolaLancellotti edited the summary of this revision. (Show Details)

Address comments

Thanks, @dmgreen for the review. I fixed the indent and replaced -1 with 255 and 65535 for i8 and i16.

Matt added a subscriber: Matt.Nov 23 2022, 4:08 PM

For SVE we've tried to minimise the number of patterns that emit multiple instructions so I'm wondering if instead this can be a post legalisation DAG combine that emits int_aarch64_sve_add/sub intrinsics?

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3159

This likely wants to be HasSVEorSME.

3160

Given the previous work that did land, do these patterns ever match? I mean we shouldn't have any zext of predicate vectors by this point should we? Or do you plan to revert that patch once this lands.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3160

Given the previous work that did land, do these patterns ever match? I mean we shouldn't have any zext of predicate vectors by this point should we? Or do you plan to revert that patch once this lands.

These zext patterns are added in case we want to revert the "Canonicalize ZERO_EXTEND to VSELECT" patch, otherwise, we don't need them.

I think the H/S/D should be 255 too, according to https://godbolt.org/z/zcYcdGjfh. It looks like it's a 8bit encoding, that gets treated as a signed value so 255 becomes -1.

For SVE we've tried to minimise the number of patterns that emit multiple instructions so I'm wondering if instead this can be a post legalisation DAG combine that emits int_aarch64_sve_add/sub intrinsics?

Do you think that is as important in this case? It is a good rule if the instructions could be combined into others (we try to use that rule in general for all of Arm/AArch64), but in this case the DUP_ZI_B are leaf nodes. They should all be CSE'd and remat trivially by the rest of the backend, and won't be combined into anything else.

Do you think that is as important in this case?

It's not super important, it's just something that we made a conscious choice to avoid after hitting issues with our original downstream implementation. I'm not too bothered about the CSE side of things, it's more to keep the patterns maintainable. For example, if SVE gains a predicated add/sub imm instruction then you start needing to duplicate these patterns. Of course this can be refactored when such value can be gained so it's not a demand just a suggestion. I'd have a stronger view if AArch64add_m1 existed as an ISD node but it doesn't so I'm happy either way.

paulwalker-arm added inline comments.Nov 24 2022, 5:18 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3160

There's never a good reason to add unused patterns so you should pick one way or the other. For what it's worth I think consistency is best so if the isel patterns work best for sext and there's no downside to using them for zext then I'd prefer to revert the original work once this patch lands.

Address comments

NicolaLancellotti marked an inline comment as done.Nov 24 2022, 6:56 AM

I think the H/S/D should be 255 too, according to https://godbolt.org/z/zcYcdGjfh. It looks like it's a 8bit encoding, that gets treated as a signed value so 255 becomes -1.

You are right. Done.

I addressed the comments, if you agree, I'll proceed to revert the "Canonicalize ZERO_EXTEND to VSELECT" patch and land this patch.

I addressed the comments, if you agree, I'll proceed to revert the "Canonicalize ZERO_EXTEND to VSELECT" patch and land this patch.

Yes, I'm happy with that.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3157–3159

It looks like you can merge these together.

3174–3175

Presumably you'll need zext variants of these patterns before reverting you're previous patch?

Perhaps it's worth canonicalising the constant just in case there are multiple instances of these operations. By which I mean:

add ZPR:$op, (sext ($pred)) -> SUB $op, $pred, DUP(1)
add ZPR:$op, (zext ($pred)) -> ADD $op, $pred, DUP(1)
sub ZPR:$op, (sext ($pred)) -> ADD $op, $pred, DUP(1)
sub ZPR:$op, (zext ($pred)) -> SUB $op, $pred, DUP(1)
llvm/test/CodeGen/AArch64/predicated-add-sub.ll
11–12

I know these are existing tests but is there a reason we don't just pass <vscale x 8 x i1> %v into the function?

Address comments

NicolaLancellotti marked 2 inline comments as done.Nov 28 2022, 2:55 AM
NicolaLancellotti added inline comments.
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3174–3175

As you can see from the tests, instead of subtracting 1, LLVM adds -1, which is the same.
So, we don't need them, do we?

llvm/test/CodeGen/AArch64/predicated-add-sub.ll
11–12

The tests were in a pre-commit test, I updated them.

paulwalker-arm accepted this revision.Nov 28 2022, 8:32 AM

Thanks @NicolaLancellotti. I still think it would be better to canonicalise to two variants (i.e. add/sub op, #1) instead of the current three, but that can be changed later is it turns out be advantageous.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3174–3175

I see what you mean.

This revision is now accepted and ready to land.Nov 28 2022, 8:32 AM
This revision was landed with ongoing or failed builds.Nov 28 2022, 8:38 AM
This revision was automatically updated to reflect the committed changes.