This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Initial support for RVV intrinsic
AbandonedPublic

Authored by NickHung on Dec 10 2020, 12:14 AM.

Details

Summary

This patch is meant to discuss the prototype of RVV intrinsic and implement the code generation for intrinsic based on the initial infrastructure D89449.

What this patch has done:

For example, a VADD intrinsic has four prototypes.
VADD(op1, op2)
VADD.M(mask, maskedoff, op1, op2)
VADD.VL(op1, op2, vl)
VADD.M.VL(mask, maskedoff, op1, op2, vl)

Any idea about the prototype of RVV intrinsic?

  • Code generation for VLE/VSE/VADD intrinsics without mask and vl. The implementation is based on the initial infrastructure D89449.

Do we really need those complex patterns written in the target description file?
In this way, we may need five patterns to select four VVV-form intrinsics and one IR node, such as
VADD, VSUB. and five more patterns for VWADD and VWSUB. Eventually, we may suffer maintenance hell.

Our solution is to select RVV intrinsic without any pattern matching rules
Build two searchable tables to provide information.

  1. RVVLMULIndex table: guide the selection function how to determine the LMUL and SEW of an intrinsic.
  2. RVVIntrinsicToPseudo table: look up a pseudo RVV instruction by intrinsic and the LMUL inferred from above.

Example:
RVVLMULIndex table:
(VADDVV, index 1): LMUL can be inferred by the first operand
(VWADDVV, index 1): LMUL can be inferred by the first operand
(VWADDWV, index 1, dividedBy2): LMUL can be inferred by the first operand then divide LMUL by 2.

LMULIndex = lookupLMULIndexByIntrinsic(VADDVV);
LMUL = inferLMUL(VADDVV, LMULIndex);

The LMUL can be inferred by the operand.
check out: https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#data-types

RVVIntrinsicToPseudo table:
(VADDVV, LMUL M1, VADDVV_M1)
(VADDVV, LMUL M2, VADDVV_M2)
(VADDVV, LMUL M4, VADDVV_M4)
(VADDVV, LMUL M8, VADDVV_M8)

PseudoOp = lookupPseudoByIntrinsicAndLMUL(VADDVV, LMUL);

Above can be done within a C++ function.

Diff Detail

Event Timeline

NickHung created this revision.Dec 10 2020, 12:14 AM
NickHung requested review of this revision.Dec 10 2020, 12:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 10 2020, 12:14 AM
wuiw added a subscriber: wuiw.Dec 10 2020, 12:16 AM

Are you proposing to do custom isel in RISCVISelDAGToDAG.cpp using lookupPseudoByIntrinsicAndLMUL and not using the RISCVGenDAGISel.inc table? Do you have an implementation of that yet?

llvm/test/CodeGen/RISCV/rvv/intrinsic-load-add-store-32.ll
10

This is setting the VL to VLmax which isn't what the spec wants. It should get the value from the previous vsetvl intrinsic or maybe the previous intrinsic that had a vl argument.

Our internal implementation has been implementing the intrinsics without vl by inserting a readvl intrinsic and a call to the intrisics that take vl. But we've been finding issues with this. The readvl is acting as an optimization barrier. It also doesn't have any ordering in IR with respect to intrinsics that have a vl argument unless we mark all intrinsics has having side effects.

What are your thoughts on this?

NickHung added inline comments.Dec 10 2020, 5:40 PM
llvm/test/CodeGen/RISCV/rvv/intrinsic-load-add-store-32.ll
10

In our internal implementation, intrinsic without vl should insert "setvli zero, zero, e32,m1,tu,mu" to kepp the current VL.
I didn't touch any code in D89449 in order to respect the authors.
Please check out load-add-store-32.ll, it has the same vsetvl which changes the current vl.

Our thought is to follow the RVV intrinsic documentation
https://github.com/riscv/rvv-intrinsic-doc. So, we provide all kinds of intrinsics with and without vl.

Thanks for sharing your ideas with us.

craig.topper added inline comments.Dec 10 2020, 5:49 PM
llvm/test/CodeGen/RISCV/rvv/intrinsic-load-add-store-32.ll
10

load-add-store-32.ll is intentionally using vlmax. I believe it was intended to model a scalable vector style similar to ARM SVE where there is no VL and instead a whole register of a size only known at runtime is used.

Are you proposing to do custom isel in RISCVISelDAGToDAG.cpp using lookupPseudoByIntrinsicAndLMUL and not using the RISCVGenDAGISel.inc table? Do you have an implementation of that yet?

Thanks for your feedback.

Yes, our internal implementation didn't select RVV intrinsic by the generated matching table in RISCVGenDAGISel.inc, because we don't write any matching rules in RISCVInstrInfoVPseudos.td.

We've referenced the code from EPI, check out https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/blob/EPI/llvm/lib/Target/RISCV/RISCVInstrInfoEPI.td
The number of lines in RISCVInstrInfoEPI.td is 4323, and most of the classes are duplicated in order to cope with the different formats of RVV.
It also contains many comments such as "//// FIXME: These patterns are wrong", so we guess the author must be struggling when writing the matching rules.

We propose to do custom selection in RISCVISelDAGToDAG.cpp with a programmable function and we've implemented all RVV 1.0 instructions in this way.

NickHung added inline comments.Dec 10 2020, 6:11 PM
llvm/test/CodeGen/RISCV/rvv/intrinsic-load-add-store-32.ll
10

Thanks for the explanation. Your thought is correct.
As I mentioned in D89449, the initial infrastructure should be shared by RVV intrinsic and IR nodes.

So, we know it should do a little change to support both RVV intrinsic and IR nodes for distinct programming scenarios.

liaolucy added inline comments.
llvm/test/CodeGen/RISCV/rvv/intrinsic-load-add-store-32.ll
10

Sorry, I'd like to ask a question:

About insert vsetvli:
intrinsics without vl: insert vsetvli a1(not x0), x0, i.e. vl = vlmax ;
Intrinsics with vl: insert vsetvli x0, a1(not x0), i.e. vl = avl ;

When do I need to insert vsetvli x0, x0? Code in D89499, !(VLIndex >= 0) , insert vsetvli x0, x0, but all VLindex > 0.

If my understanding is incorrect, please correct me,thanks.

craig.topper added inline comments.Dec 14 2020, 12:49 AM
llvm/test/CodeGen/RISCV/rvv/intrinsic-load-add-store-32.ll
10

I believe that code is there for vmv.x.s and some other instructions that don't read vl.

Are you proposing to do custom isel in RISCVISelDAGToDAG.cpp using lookupPseudoByIntrinsicAndLMUL and not using the RISCVGenDAGISel.inc table? Do you have an implementation of that yet?

Thanks for your feedback.

Yes, our internal implementation didn't select RVV intrinsic by the generated matching table in RISCVGenDAGISel.inc, because we don't write any matching rules in RISCVInstrInfoVPseudos.td.

We've referenced the code from EPI, check out https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/-/blob/EPI/llvm/lib/Target/RISCV/RISCVInstrInfoEPI.td
The number of lines in RISCVInstrInfoEPI.td is 4323, and most of the classes are duplicated in order to cope with the different formats of RVV.
It also contains many comments such as "//// FIXME: These patterns are wrong", so we guess the author must be struggling when writing the matching rules.

We propose to do custom selection in RISCVISelDAGToDAG.cpp with a programmable function and we've implemented all RVV 1.0 instructions in this way.

I believe we have all 0.9 pseudo instructions in now for the base spec. Zvlsseg and Zvamo are being prepared. After looking at the size of the RISCVGenDAGISel.inc table, I'm very interested to see your RISCVISelDAGToDAG.cpp implementation. Do you have code you are able to share?

NickHung abandoned this revision.Mar 16 2021, 11:57 PM