bin.cheng-ali (Bin Cheng)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 18 2021, 12:01 AM (110 w, 1 h)

Recent Activity

Mar 27 2021

bin.cheng-ali added a comment to D99324: [AArch64][SVE] Codegen dup_lane for dup(vector_extract).

Excuse me, I am new to LLVM and backend work; one question: what does "stock LLVM IR" mean (refer to) in the above comment?

By stock LLVM IR I'm referring to the LLVM instructions as defined by the LangRef, plus non-target-specific intrinsics.
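To illustrate the distinction, a hedged sketch of the two ways a scalar splat can be expressed (the intrinsic name follows the SVE ACLE lowering; the insertelement/shufflevector splat idiom is the LangRef-defined form):

```llvm
; Target-specific intrinsic form: opaque to generic IR optimisations.
%d = call <vscale x 4 x i32> @llvm.aarch64.sve.dup.x.nxv4i32(i32 %s)

; "Stock" LLVM IR form of the same splat, as defined by the LangRef:
%ins   = insertelement <vscale x 4 x i32> undef, i32 %s, i32 0
%splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> undef,
                       <vscale x 4 x i32> zeroinitializer
```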

As for the patch, I am trying to understand the issue. Do you suggest we should first introduce a DUP_LANE pattern, similar to SVDOT_LANE_S, so that clang CodeGen doesn't generate dup.x when possible?

I'm not sure I fully understand your question, but in general when it comes to code generation I'm trying to ensure, where possible, that we have a canonicalised representation so that we minimise the number of patterns (IR or DAG) that end up resolving to the same instruction.

Mar 27 2021, 7:42 AM · Restricted Project

Mar 25 2021

bin.cheng-ali added a comment to D99324: [AArch64][SVE] Codegen dup_lane for dup(vector_extract).

OK, I understand your point. splat_vector(extract_vector_elt(vec, idx)) looks OK to me, but why do you prefer to do it in SVEIntrinsicOpts.cpp? What about doing this in PerformDAGCombine with the AArch64TBL node?

The reason I prefer to handle this in PerformDAGCombine is that what we want to match is AArch64tbl(... splat_vector(..., constant)) rather than sve.tbl + sve.dupx, since shufflevector can also be converted to splat_vector.
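A rough sketch of the two match points being compared (SelectionDAG pseudocode; the DUPLANE32 target node on the right-hand side is an assumption about how the lane duplicate would be emitted):

```llvm
; In PerformDAGCombine, after intrinsic lowering, both origins funnel into one node:
;   AArch64ISD::TBL(%vec, splat_vector(constant C))
;     -> AArch64ISD::DUPLANE32(%vec, C)
;
; Matching earlier in SVEIntrinsicOpts.cpp would instead look for:
;   sve.tbl(%vec, sve.dupx(C))
; which misses splat indices that arrive as a shufflevector splat
; rather than as the sve.dupx intrinsic.
```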

I feel the higher up the chain/earlier we do this the better. Outside of the ACLE intrinsics I wouldn't expect a scalable AArch64ISD::TBL to be created unless that's exactly what the code generator wants. It's worth highlighting that this sort of SVE ACLE intrinsics -> LLVM IR transform will not be an isolated case. We deliberately created intrinsics even for common transforms so that we could minimise use of stock LLVM IR and thus limit failures due to missing scalable-vector support. As LLVM matures I would expect us to utilise stock LLVM IR more and more: for example, converting dups to shufflevector, ptrue-all predicated operations to normal LLVM binary ops, etc.
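As a concrete sketch of that direction, the ptrue-all case mentioned above might look like the following (intrinsic signatures assumed from the SVE ACLE lowering; 31 is the all-elements pattern):

```llvm
; Before: ACLE intrinsic form, with an all-true governing predicate.
%pg  = call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
%add = call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(
           <vscale x 4 x i1> %pg, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b)

; After: stock LLVM IR, once scalable-vector support is mature enough.
%add = add <vscale x 4 x i32> %a, %b
```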

That said, if you think PerformDAGCombine (presumably performIntrinsicCombine) is the best place today then fine. It can easily be moved up the chain once we're more comfortable.

Mar 25 2021, 7:27 PM · Restricted Project

Mar 13 2021

bin.cheng-ali added inline comments to D97299: [IR][SVE] Add new llvm.experimental.stepvector intrinsic.
Mar 13 2021, 8:38 AM · Restricted Project

Feb 23 2021

bin.cheng-ali added inline comments to D94708: [IR] Introduce llvm.experimental.vector.splice intrinsic.
Feb 23 2021, 1:59 AM · Restricted Project

Feb 19 2021

bin.cheng-ali added inline comments to D94708: [IR] Introduce llvm.experimental.vector.splice intrinsic.
Feb 19 2021, 10:51 PM · Restricted Project
bin.cheng-ali added inline comments to D95363: [SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse.
Feb 19 2021, 8:55 AM · Restricted Project