Page MenuHomePhabricator

[SVE] Add SVE2 patterns for unpredicated multiply instructions
ClosedPublic

Authored by dancgr on Jan 15 2020, 12:11 PM.

Details

Summary

Add patterns for SVE2 unpredicated multiply instructions:

  • mul, smulh, umulh, pmul, sqdmulh, sqrdmulh

Also removed negative tests for immediate multiply that are set up to fail, since the case now can succeed and also is handled in the new test.

Diff Detail

Event Timeline

dancgr created this revision.Jan 15 2020, 12:11 PM
Herald added a project: Restricted Project. · View Herald Transcript
dancgr edited the summary of this revision. (Show Details)Jan 15 2020, 12:14 PM

Can we also also add mul patterns for targets that have SVE, but not SVE2?

llvm/include/llvm/IR/IntrinsicsAArch64.td
1207

This name seems strange. Why "z"?

dancgr marked 2 inline comments as done.Jan 15 2020, 12:32 PM

Can we also also add mul patterns for targets that have SVE, but not SVE2?

That instruction is restricted to SVE2. Do we have unpredicated vector mul instructions for SVE targets as well?

The way the instruction is implemented did not allow me to match with SVE targets.

llvm/include/llvm/IR/IntrinsicsAArch64.td
1207

I just followed the name similar to the one for logical predicated instructions. I'm not sure if that is the correct way to name predicated intrinsics. I marked the place with a comment.

1562

@efriedma Here!

dancgr marked 2 inline comments as not done.Jan 15 2020, 12:32 PM

Can we also also add mul patterns for targets that have SVE, but not SVE2?

That instruction is restricted to SVE2. Do we have unpredicated vector mul instructions for SVE targets as well?

We can synthesize a predicate using ptrue. We do this in a few places currently; for example, to implement bswap.

llvm/include/llvm/IR/IntrinsicsAArch64.td
1562

I think here the "_z" is meant to indicate that the predicate operand is "<Pg>/Z".

dancgr marked an inline comment as done.EditedJan 15 2020, 2:06 PM

Can we also also add mul patterns for targets that have SVE, but not SVE2?

That instruction is restricted to SVE2. Do we have unpredicated vector mul instructions for SVE targets as well?

We can synthesize a predicate using ptrue. We do this in a few places currently; for example, to implement bswap.

I hadn't thought of that. I will implement SVE fallbacks for the SVE2 instructions using ptrue.

llvm/include/llvm/IR/IntrinsicsAArch64.td
1562

Ok, I will update it to _m to follow the same pattern.

sdesmalen requested changes to this revision.Jan 16 2020, 9:41 AM

Thanks for efforts on this patch @dancgr!

As you can see, I have requested some changes to the patch to avoid diverging too much while we try to upstream the ACLE.
We also need to make sure we can represent the C/C++ intrinsics with the LLVM IR intrinsics, which is a reason to have predicated smulh/umulh.

llvm/include/llvm/IR/IntrinsicsAArch64.td
1201

smulh needs to be predicated AdvSIMD_Pred2VectorArg_Intrinsic, so that we can implement the predicated ACLE intrinsic with this LLVM IR intrinsic.
You can choose to optimize this predicated form (with ptrue predicate mask) into an unpredicated form using a ISel pattern.

1202

Same for umulh.

1207

Adding _z here implies that the false lanes will be zeroed, which isn't the case, as this is used with merging predication.
These intrinsics are to implement the ACLE, which does the zeroing explicitly with a select. These intrinsics are merging, unless explicitly specified.

1562

The reason that these intrinsics have _z in the name is because they implement an operation with zeroing predication (which maps directly to predicate AND that zeroes the false lanes).

Please don't change these suffixes, as this will give us trouble while we're in the process of implementing the ACLE upstream. We would need to update our Clang implementation to target intrinsics with slightly different names.

llvm/lib/Target/AArch64/SVEInstrFormats.td
2636

nit: sve2_int_mul2 -> sve2_int_mul_single ?

This revision now requires changes to proceed.Jan 16 2020, 9:41 AM
dancgr marked 10 inline comments as done.Jan 16 2020, 11:36 AM

I'm not sure on some parts, but I have prepared a major update for this patch that I hope will fix most of @sdesmalen concerns.

llvm/include/llvm/IR/IntrinsicsAArch64.td
1201

Ok, but we have already a predicated smulh intrinsic, which is the int_aarch64_sve_smulh_m. Do we need the predicated version to be named exactly int_aarch64_sve_smulh?

1207

Ok, makes sense now.

1562

I'm not planning on changing these, I just wanted to understand the naming convention. Now it is clear to me.

dancgr updated this revision to Diff 238556.Jan 16 2020, 11:45 AM
dancgr marked 2 inline comments as done.

Tentative changes to address some comments from the reviewers.

I'm not sure on some parts, but I have prepared a major update for this patch that I hope will fix most of @sdesmalen concerns.

I'm not sure on some parts, but I have prepared a major update for this patch that I hope will fix most of @sdesmalen concerns.

Great, thanks for updating your patch! The patch is nearly good to go I think, just one more little change needed to remove _m.

llvm/include/llvm/IR/IntrinsicsAArch64.td
1207

Given that merging is the default, we don't explicitly specify the _m in the intrinsic name.

dancgr updated this revision to Diff 239648.Jan 22 2020, 10:34 AM
dancgr marked an inline comment as done.

Remove _m from intrinsics, as it is the default behaviour. Also, add unpredicated patterns for SVE2 smulh, umulh.

dancgr marked an inline comment as done.Jan 22 2020, 10:34 AM
dancgr added inline comments.
llvm/include/llvm/IR/IntrinsicsAArch64.td
1207

Ok, I will update that too.

dancgr marked an inline comment as done.Jan 22 2020, 10:35 AM
sdesmalen accepted this revision.Jan 23 2020, 5:37 AM

Thanks for the updates to the patch @dancgr , LGTM!

This revision is now accepted and ready to land.Jan 23 2020, 5:37 AM
This revision was automatically updated to reflect the committed changes.