This is an archive of the discontinued LLVM Phabricator instance.

[RISCV][RVV] Introduce roundmode operand to PseudoVAADD instruction
AbandonedPublic

Authored by arcbbb on Mar 10 2022, 7:20 AM.

Details

Summary

As D113439's discussion, I would like to propose adding
a roundmode operand to instructions that implicitly use VXRM.
And this would allow us to provide intrinsics that are IntrNoMem and pass down a round mode imm. to the backend which will then be able to change the round mode accordingly.

This patch implements single intrinsic int.riscv.vaadd.rm and a naive pass to insert WriteVXRM by iterating through each instruction and check the roundmode operand.
Current implementation conservatively keeps the incoming VXRM value and restores it back before affecting other VXRM users, because the ABI doesn't specify it is preserved or volatile across calls.
It can be improved further if the ABI is updated.

I think both sets of intrinsics (1) and (2) will coexist in the meantime until we all agree to replace (1) with (2)
(1) int.riscv.vaadd.*
(2) int.riscv.vaadd.rm.*

Diff Detail

Event Timeline

arcbbb created this revision.Mar 10 2022, 7:20 AM
arcbbb requested review of this revision.Mar 10 2022, 7:20 AM
arcbbb removed a project: Restricted Project.Mar 10 2022, 7:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 10 2022, 7:37 AM

This generally makes sense.

I'm not sure you want to allow the intrinsics to accept the immediate argument DYN (4). That sort of defeats the point of adding the explicit argument: you can't mark the intrinsic IntrNoMem.

arcbbb updated this revision to Diff 414599.Mar 11 2022, 12:09 AM
arcbbb edited the summary of this revision. (Show Details)

Updates:

  1. Add isUInt<2> check for timm.
  2. Use 0 as round mode value in tests
  3. Add tu and rv32 test cases.

This generally makes sense.

I'm not sure you want to allow the intrinsics to accept the immediate argument DYN (4). That sort of defeats the point of adding the explicit argument: you can't mark the intrinsic IntrNoMem.

Thanks, I got the point. I add a check to prevent this.

I initiated a discussion https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/144
to talk about whether users of clang builtins would like to stay in option (1).

Having RISCVVXRMRegister work per-basic-block is fine for now, but you probably want the capability to hoist the operations out of loops at some point.

Do you have any plan for the vxsat bit? You might want the intrinsics to take it as an explicit argument/return value. Or maybe if you don't expect users of the intrinsics to use the bit, you could just preserve it; instead of saving/restoring vxrm, just save/restore all of vcsr.

Having RISCVVXRMRegister work per-basic-block is fine for now, but you probably want the capability to hoist the operations out of loops at some point.

Yes, I think it is desirable. I will think how to do that after the global analysis is done.

Do you have any plan for the vxsat bit? You might want the intrinsics to take it as an explicit argument/return value. Or maybe if you don't expect users of the intrinsics to use the bit, you could just preserve it; instead of saving/restoring vxrm, just save/restore all of vcsr.

Thanks, I think I can try returning an additional output for the vxsat bit like

declare { <vscale x 1 x i8>, i64 } @llvm.riscv.vaadd.rm.nxv1i8.nxv1i8(
  <vscale x 1 x i8>,
  <vscale x 1 x i8>,
  <vscale x 1 x i8>,
  i64,
  i64);
craig.topper added inline comments.Mar 21 2022, 10:45 AM
llvm/lib/Target/RISCV/RISCVVXRMRegister.cpp
13 ↗(On Diff #414599)

basici->basic

58 ↗(On Diff #414599)

Would it be better to return Optional<unsigned>?

116 ↗(On Diff #414599)

Drop else after continue

116 ↗(On Diff #414599)

Does NewVXRMImm being DYN mean to use the value from the last static instruction? That seems fragile since they weren't order in IR.

140 ↗(On Diff #414599)

Machine basic blocks can have more than 1 terminator. You probably need to do this in the loop when you encounter the first terminator.

llvm/test/CodeGen/RISCV/rvv/roundmode-insert.mir
27

Does it really have an implicit $vxrm use before the RISCVVXRMRegister pass? @efriedma should we only have the vxrm use for the dynamic case?

arcbbb updated this revision to Diff 417354.Mar 22 2022, 11:51 AM

Updates:

  1. Address Craig's comments
  2. Considering VXSAT state, save the value of VCSR instead of the value of VXRM
  3. update test cases
llvm/lib/Target/RISCV/RISCVVXRMRegister.cpp
58 ↗(On Diff #414599)

Yes, thanks!

116 ↗(On Diff #414599)

DYN means to use the saved value by the first static instruction. SwapVXRMImm/SwapVCSRImm is inserted between Entry/Calls/InlineAsms and static instrucitons.

Sorry my wording is misleading.

llvm/test/CodeGen/RISCV/rvv/roundmode-insert.mir
27

My observation is that NoImplicit is false by default in CreateMachineInstr, so implicit $vxrm is there since BuildMI.

In case this is relevant to you, I have a proposal to request the psABI to make VXRM non-preserved across calls in
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/256
It has not been accepted, but I think it is good to have a discussion.

craig.topper added inline comments.Mar 22 2022, 1:38 PM
llvm/test/CodeGen/RISCV/rvv/roundmode-insert.mir
27

Right SelectionDAG will do it from the tablegen definition. For scalar FP I didn't set uses FRM in the td file and then I do a PostProcesseISel to add it for DYN instructions.

Not sure if it makes sense to do something similar here. The VXRMRegister pass would need to add the implicit def once it added the CSR write.

frasercrmck added inline comments.Mar 23 2022, 4:25 AM
llvm/include/llvm/IR/IntrinsicsRISCV.td
641

Is it worth saying with mask and rounding-mode operand to help people differentiate between the intrinsics?

Also this one is defined after the no-roundmode version, but RISCVSaturatingBinaryRMAAXNoMask is defined before its counterpart. I find that quite confusing.

llvm/lib/Target/RISCV/CMakeLists.txt
29

Nit: in alphabetical order, please

llvm/lib/Target/RISCV/RISCVVXRMRegister.cpp
21 ↗(On Diff #417354)

specify -> specifies

101 ↗(On Diff #417354)

if (auto NewVXRM = hasVRXMInfo(MI))

151 ↗(On Diff #417354)

fallthrough

arcbbb added inline comments.Mar 24 2022, 10:55 AM
llvm/include/llvm/IR/IntrinsicsRISCV.td
641

Yes, thanks for making it clear!
I'll fix it in the next update.

llvm/test/CodeGen/RISCV/rvv/roundmode-insert.mir
27

I am figuring it out and I study the approach in FRM & VSETVLI.
My understanding is that

  1. Remove the VXRM uses from the td
  2. Add it for DYN instructions in PostInstrSelection
  3. Add it for non-DYN instructions in VXRMRegister pass while inserting the CSR write.

Are my steps the same as yours?

arcbbb updated this revision to Diff 418251.Mar 25 2022, 9:34 AM

Thank @craig.topper and @frasercrmck
Updates:

  1. Reorder intrinsic class definition
  2. Reorder RISCVVXRMRegister in CMakeLists.txt
  3. Use PostISelHook to add VXRM uses to DYN pseudos
  4. Change non-DYN pseudos to DYN pseudos after CSR Write insertion
  5. Update tests
declare { <vscale x 1 x i8>, i64 } @llvm.riscv.vaadd.rm.nxv1i8.nxv1i8(
  <vscale x 1 x i8>,
  <vscale x 1 x i8>,
  <vscale x 1 x i8>,
  i64,
  i64);

Sorry I make a wrong example as VAADD doesn't saturate.
I should use VSMUL or VSADD.

craig.topper added inline comments.Mar 30 2022, 4:35 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9337

- 3 because we assume SEW and VL exist? Can we assert hasSEW and hasVL?

llvm/lib/Target/RISCV/RISCVRegisterInfo.td
587 ↗(On Diff #418251)

Do we need this? There's no way to express that VXRM and VXSAT overlap with it so we just always reference the individual pieces.

llvm/lib/Target/RISCV/RISCVVXRMRegister.cpp
94 ↗(On Diff #418251)

When would it be greater than 7? And if that can happen then what prevents values of 5-7 which aren't defined.

133 ↗(On Diff #418251)

This isn't handling terminators is it?

144 ↗(On Diff #418251)

VXRM -> VCSR?

arcbbb updated this revision to Diff 419343.Mar 31 2022, 12:17 AM

Address Craig's comments.
Updates:

  1. Add assertions to check hasVL & hasSEW
  2. Remove RISCVReg 'vcsr'
  3. Do not mask immediate value
  4. Fix inconsistent comments
  5. Update O0 & O3 pipeline tests
arcbbb added inline comments.Mar 31 2022, 12:19 AM
llvm/lib/Target/RISCV/RISCVVXRMRegister.cpp
94 ↗(On Diff #418251)

I think only 0~4 is possible. Then I remove the mask to avoid confusing.

khchen added inline comments.Mar 31 2022, 2:56 AM
llvm/lib/Target/RISCV/RISCVVXRMRegister.cpp
1 ↗(On Diff #419343)

Why not call it RISCVInsertVXRM similar with RISCVInsertVSETVLI? IMPO, this naming would confusing me until I see the comment.

108 ↗(On Diff #419343)

sorry, I didn't understand why we need to << 1?

enum RoundingMode {
  RNU = 0,
  RNE = 1,
  RDN = 2,
  ROD = 3,
  DYN = 4,
};

if NewVXRMImm is RDN then the CurVXRMImm would be DYN after <<1?

llvm/test/CodeGen/RISCV/rvv/vaadd-rm-rv32.ll
4

nit: maybe we could merge vaadd-rm-rv32.lland vaadd-rm-rv64.ll into one as we usually did in other tests.
ex. https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/RISCV/rvv/vaadd.ll#L1-L5

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -mattr=+v \
; RUN:   -verify-machineinstrs | FileCheck %s --check-prefixes=CHECK,RV32
; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -mattr=+v \
; RUN:   -verify-machineinstrs | FileCheck %s --check-prefixes=CHECK,RV64

I think it could be in anorther NFC patch.

arcbbb updated this revision to Diff 419990.Apr 2 2022, 9:12 AM

Address @khchen 's comment.
Updates:

  1. Rename RISCVVXRMRegister.cpp to RISCVInsertVXRMWrite.cpp
  2. Fix the VCSR restoring.
arcbbb added inline comments.Apr 2 2022, 9:16 AM
llvm/lib/Target/RISCV/RISCVVXRMRegister.cpp
108 ↗(On Diff #419343)

This is doing VCSR = VXRM << 1

llvm/test/CodeGen/RISCV/rvv/vaadd-rm-rv32.ll
4

Thanks, I'll try merge them in a separate patch.

arcbbb updated this revision to Diff 420826.Apr 6 2022, 6:47 AM
arcbbb marked 17 inline comments as done.
arcbbb edited the summary of this revision. (Show Details)

Updates:

  1. Forgot to rename the pass name.
zvookin added a subscriber: zvookin.May 8 2023, 3:09 PM

What is the status of this? Is it moving forward? So far as I can tell, there still is no solution other than inline assembly to set vxrm from LLVM IR.

arcbbb added a comment.May 9 2023, 1:52 AM

What is the status of this? Is it moving forward? So far as I can tell, there still is no solution other than inline assembly to set vxrm from LLVM IR.

There is a change in RISCV psABI so that VXRM/VXSAT is volatile across calls now. [1]
And the corresponding C intrinsic API is going to be updated based on this ABI change. [2]
[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/294
[2] https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/157

arcbbb abandoned this revision.Aug 10 2023, 1:37 AM

Close this as D151396 is done.

evandro removed a subscriber: evandro.Aug 10 2023, 5:49 PM