This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Add patterns for logical immediate operations.
ClosedPublic

Authored by dancgr on Dec 13 2019, 10:58 AM.

Details

Summary

Add pattern matching for the following SVE logical vector and immediate instructions:

  • and/bic, orr/orn, eor/eon.

Event Timeline

dancgr created this revision.Dec 13 2019, 10:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 13 2019, 10:58 AM
efriedma added inline comments.Dec 13 2019, 11:08 AM
llvm/include/llvm/IR/IntrinsicsAArch64.td
1064

I don't really like anyint here... can we restrict the type somehow?

Actually, I'm not sure how this even works with your testcases; how is LLVM computing the type of llvm.aarch64.sve.orr.imm.nxv16i8? You aren't specifying the type of the integer.

dancgr marked an inline comment as done.Dec 13 2019, 11:17 AM
dancgr added inline comments.
llvm/include/llvm/IR/IntrinsicsAArch64.td
1064

We can split this into two intrinsics, one i32 and one i64, both will be almost exactly the same with only the immediate type difference. I don't see a problem in that, I was just trying to reuse some code to avoid unnecessary complexity. But I'm not opposed to having two separate ones either.

In the following test case I'm explicitly setting the input as i64 on the declare statement. If I were to put i32 there, It would fail to find a matching pattern.

define <vscale x 16 x i8> @orr_i8(<vscale x 16 x i8> %a) {
; CHECK-LABEL: orr_i8:
; CHECK: orr z0.b, z0.b, #0xf
; CHECK-NEXT: ret
  %res = call <vscale x 16 x i8> @llvm.aarch64.sve.orr.imm.nxv16i8(<vscale x 16 x i8> %a, 
                                                                   i64 15)
  ret <vscale x 16 x i8> %res
}

declare <vscale x 16 x i8> @llvm.aarch64.sve.orr.imm.nxv16i8(<vscale x 16 x i8>, i64)
efriedma added inline comments.Dec 13 2019, 11:27 AM
llvm/include/llvm/IR/IntrinsicsAArch64.td
1064

Why do we need both an i32 and an i64 variant? The i64 variant seems to cover all the relevant cases (and you haven't added any tests for the i32 variant).

I figured out the source of my confusion about the names of the intrinsics; apparently IR autoupgrade is "fixing" the name of llvm.aarch64.sve.orr.imm.nxv16i8 to refer to the actual name, llvm.aarch64.sve.orr.imm.nxv16i8.i64.

dancgr marked an inline comment as done.Dec 13 2019, 12:05 PM
dancgr added inline comments.
llvm/include/llvm/IR/IntrinsicsAArch64.td
1064

Oh, I get it now. The AdvSIMD_1VectorArg_Imm_Intrinsic was introduced previously for the add/sub/sqadd imm. instrinsics (which use i32). And then I changed it to anyint to be able to use the same class for and/orr/eor imm., because those use i64.

dancgr marked an inline comment as done.Dec 13 2019, 12:06 PM
dancgr added inline comments.
llvm/include/llvm/IR/IntrinsicsAArch64.td
1064

But I don't mind changing that. I am just trying to understand the reasoning so I can apply it to my future patches.

efriedma added inline comments.Dec 13 2019, 12:21 PM
llvm/include/llvm/IR/IntrinsicsAArch64.td
1064

A new "class" doesn't have any associated cost; it's just a pattern for defining an intrinsic. if you need a new one, just add it.

dancgr marked 3 inline comments as done.Dec 13 2019, 12:33 PM
dancgr updated this revision to Diff 233854.Dec 13 2019, 12:42 PM

Add specific intrinsic for i64.

This revision is now accepted and ready to land.Dec 13 2019, 1:03 PM
sdesmalen accepted this revision.Dec 16 2019, 7:04 AM

Thanks @dancgr, LGTM!

There is no harm in adding these intrinsics, but it is worth pointing out that our downstream compiler does not have explicit intrinsics for the immediate forms because:

  • There are no specific C/C++ intrinsics that require the operand to be an immediate. That means using the immediate form is purely an optimisation of the general ((unpredicated) vector, vector) case.
  • We can use a pattern to match e.g. the AArch64dup (SVELogicalImm32 i64:$imm) and emit the immediate-form of the instruction.

I will be merging this patch then, and I will submit a short patch for the AArch64dup (SVELogicalImm32 i64:$imm) patterns and the equivalent ones for the add/sub instructions.

This revision was automatically updated to reflect the committed changes.