This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] WIP for review D88742
AbandonedPublic

Authored by mivnay on Oct 6 2020, 6:39 AM.

Details

Reviewers
dmgreen

Diff Detail

Unit TestsFailed

Event Timeline

mivnay created this revision.Oct 6 2020, 6:39 AM
mivnay requested review of this revision.Oct 6 2020, 6:39 AM
mivnay added a comment.Oct 6 2020, 7:02 AM

This change seems to be generating a different pattern (UABDLv4i32_v2i64 vs UABDLv2i32_v2i64) for one of the test cases in arm64-vabs.ll. I am not sure whether the new pattern generated is correct and better.

original:

t0: ch = EntryToken
      t2: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
        t4: i32,ch = CopyFromReg t0, Register:i32 %1
      t25: v4i32 = DUPv4i32gpr t4
    t28: v2i64 = UABDLv4i32_v2i64 t2, t25
  t19: ch,glue = CopyToReg t0, Register:v2i64 $q0, t28
  t20: ch = RET_ReallyLR Register:v2i64 $q0, t19, t19:1

modified:

t0: ch = EntryToken
  t2: v4i32,ch = CopyFromReg t0, Register:v4i32 %0
        t27: f128 = EXTv16i8 t2, t2, TargetConstant:i32<8>
      t14: v2i32 = EXTRACT_SUBREG t27, TargetConstant:i32<2>                                   
        t4: i32,ch = CopyFromReg t0, Register:i32 %1                                           
      t25: v2i32 = DUPv2i32gpr t4 
    t17: v2i64 = UABDLv2i32_v2i64 t14, t25
  t19: ch,glue = CopyToReg t0, Register:v2i64 $q0, t17
  t20: ch = RET_ReallyLR Register:v2i64 $q0, t19, t19:1
dmgreen added a subscriber: dmgreen.Oct 6 2020, 7:23 AM

There's some code in performExtendCombine from the look of it, still looking at the intrinsics ID's.

mivnay updated this revision to Diff 296481.Oct 6 2020, 9:09 AM
mivnay added a reviewer: dmgreen.
mivnay added a comment.Oct 6 2020, 9:16 AM

There's some code in performExtendCombine from the look of it, still looking at the intrinsics ID's.

oops! I thought it would still match the intrinsic and completely ignored it. Sorry! I have fixed it and added support for SABD node as well. Now, there is a new error because of the "FALLBACK-NOT: remark" checks in the same file.

Selecting:
  %4:fpr64(<8 x s8>) = G_INTRINSIC intrinsic(@llvm.aarch64.neon.sabd), %2:fpr(<8 x s8>), %3:fpr(<8 x s8>)
remark: <unknown>:0:0: cannot select: %4:fpr64(<8 x s8>) = G_INTRINSIC intrinsic(@llvm.aarch64.neon.sabd), %2:fpr(<8 x s8>), %3:fpr(<8 x s8>) (in function: sabdl8h)
warning: Instruction selection used fallback path for sabdl8h
mivnay updated this revision to Diff 296483.Oct 6 2020, 9:28 AM

Right. Because of Global ISel? I don't know that very well and if there's a similar place to do the same thing, converting an intrinsics to the kind of node that would be matched.

My hope was that we would be able to make something that I could later lift to being target-independent and share across AArch64 and MVE.

Replicating the patterns doesn't feel great though, and disabling the test feels a bit of a step backwards. Perhaps make the pattern match either the node or the intrinsic:

def AArch64uabd_n   : SDNode<"AArch64ISD::UABD", SDT_AArch64binvec>;
def AArch64sabd_n   : SDNode<"AArch64ISD::SABD", SDT_AArch64binvec>;
def AArch64uabd     : PatFrags<(ops node:$lhs, node:$rhs),
                                   [(AArch64uabd_n node:$lhs, node:$rhs),
                                    (int_aarch64_neon_uabd node:$lhs, node:$rhs)]>;
def AArch64sabd     : PatFrags<(ops node:$lhs, node:$rhs),
                                   [(AArch64sabd_n node:$lhs, node:$rhs),
                                    (int_aarch64_neon_sabd node:$lhs, node:$rhs)]>;
mivnay added a comment.Oct 7 2020, 3:21 AM

Right. Because of Global ISel? I don't know that very well and if there's a similar place to do the same thing, converting an intrinsics to the kind of node that would be matched.

My hope was that we would be able to make something that I could later lift to being target-independent and share across AArch64 and MVE.

Replicating the patterns doesn't feel great though, and disabling the test feels a bit of a step backwards. Perhaps make the pattern match either the node or the intrinsic:

def AArch64uabd_n   : SDNode<"AArch64ISD::UABD", SDT_AArch64binvec>;
def AArch64sabd_n   : SDNode<"AArch64ISD::SABD", SDT_AArch64binvec>;
def AArch64uabd     : PatFrags<(ops node:$lhs, node:$rhs),
                                   [(AArch64uabd_n node:$lhs, node:$rhs),
                                    (int_aarch64_neon_uabd node:$lhs, node:$rhs)]>;
def AArch64sabd     : PatFrags<(ops node:$lhs, node:$rhs),
                                   [(AArch64sabd_n node:$lhs, node:$rhs),
                                    (int_aarch64_neon_sabd node:$lhs, node:$rhs)]>;

Yes ,the issue is in Global ISel. Above change works but it also allows one to use the intrinsics instead of ISD nodes. I am closing this review and will push the changes to the original review D88742