
[RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.
Needs Review · Public

Authored by HsiangKai on Jun 2 2021, 2:37 AM.

Details

Summary

If we know that the source operand of a COPY is defined by a vector
instruction executed with a tail-agnostic policy and the same LMUL, and
that no vsetvli between the defining instruction and the COPY changes vl
or vtype, we can use vmv.v.v or vmv.v.i to copy the vector register for
better performance than the whole vector register move instructions.

If the source of the COPY is defined by vmv.v.i, we can use vmv.v.i for
the COPY.

This patch only handles the case where all of these instructions are in
the same basic block.

Case 1:

bb.0:
  ...
  VSETVLI          # The first VSETVLI before COPY and VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli between VOP and COPY.
  vx = COPY vy

Case 2:

bb.0:
  ...
  VSETVLI          # The first VSETVLI before VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli to change vl between VOP and COPY.
  ...
  VSETVLI          # This vsetvli will not change vl.
  ...
  VSETVLI          # The first VSETVLI before COPY.
  ...              # This VSETVLI does not change vl and vtype.
  ...
  vx = COPY vy
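
To make the two cases concrete, here is a minimal sketch of the backward scan in plain C++. It is not the code in this patch: Inst, VType, canUseVectorMove and the helper constructors are hypothetical stand-ins for the MachineInstr queries, and the model is deliberately conservative (any instruction that may write vl between the definition and the COPY rejects the conversion, and the defining instruction must obey vl).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified model; these types are not LLVM's MachineInstr API.
struct VType {
  unsigned sew;
  unsigned lmul;
  bool tailAgnostic;
};
inline bool operator==(const VType &a, const VType &b) {
  return a.sew == b.sew && a.lmul == b.lmul && a.tailAgnostic == b.tailAgnostic;
}

struct Inst {
  enum Kind { VSetVLI, VectorOp, Other } kind;
  int defReg;       // vector register written, or -1 if none
  VType vtype;      // meaningful for VSetVLI only
  bool mayChangeVL; // vsetvli with a new AVL, or e.g. a fault-only-first load
  bool obeysVL;     // false for whole-register moves and loads
};

inline Inst vsetvli(VType vt, bool mayChangeVL) {
  return {Inst::VSetVLI, -1, vt, mayChangeVL, true};
}
inline Inst vop(int defReg) { return {Inst::VectorOp, defReg, {}, false, true}; }
inline Inst vleff(int defReg) { return {Inst::VectorOp, defReg, {}, true, true}; }
inline Inst wholeRegLoad(int defReg) { return {Inst::Other, defReg, {}, false, false}; }

// block holds the instructions of one basic block; the COPY sits at copyIdx,
// so only block[0 .. copyIdx-1] are inspected. Returns true if the COPY of
// srcReg could become a vl-dependent vector move.
inline bool canUseVectorMove(const std::vector<Inst> &block, std::size_t copyIdx,
                             int srcReg, unsigned copyLMul) {
  assert(copyIdx <= block.size());
  bool foundDef = false;
  std::optional<VType> seen; // vtype of vsetvlis between the def and the COPY
  for (std::size_t i = copyIdx; i-- > 0;) {
    const Inst &I = block[i];
    if (I.kind == Inst::VSetVLI) {
      if (seen && !(I.vtype == *seen))
        return false; // vtype differs between def and COPY
      if (foundDef)   // this vsetvli governs the defining instruction
        return I.vtype.lmul == copyLMul && I.vtype.tailAgnostic;
      if (I.mayChangeVL)
        return false; // vl changed after the def (Case 2 forbids this)
      seen = I.vtype;
      continue;
    }
    if (foundDef)
      continue; // still looking for the governing vsetvli
    if (I.mayChangeVL)
      return false; // e.g. a fault-only-first load between def and COPY
    if (I.defReg == srcReg) {
      if (I.kind != Inst::VectorOp || !I.obeysVL)
        return false; // whole-register move/load ignores vl
      foundDef = true;
    }
  }
  return false; // def or governing vsetvli is outside this basic block
}
```

With this model, Case 1 is the block {vsetvli, vop} and Case 2 adds vsetvlis that keep vl and vtype after the vop; both are accepted, while a whole-register load as the definition or a vl-changing instruction after the definition is rejected.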

Diff Detail

Unit Tests: Failed

Time    Test
50 ms   x64 debian > LLVM.CodeGen/RISCV/rvv::calling-conv.ll
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llc -mtriple=riscv32 -mattr=+m,+experimental-v < /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/calling-conv.ll --check-prefix=RV32
120 ms  x64 debian > LLVM.CodeGen/RISCV/rvv::fixed-vectors-calling-conv-fastcc.ll
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llc -mtriple=riscv64 -mattr=+experimental-v -riscv-v-vector-bits-min=128 -riscv-v-fixed-length-vector-lmul-max=8 -verify-machineinstrs < /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-calling-conv-fastcc.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-calling-conv-fastcc.ll --check-prefixes=CHECK,LMULMAX8
100 ms  x64 debian > LLVM.CodeGen/RISCV/rvv::fixed-vectors-fp-shuffles.ll
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llc -mtriple=riscv32 -mattr=+d,+experimental-zfh,+experimental-v -riscv-v-vector-bits-min=128 -verify-machineinstrs < /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll --check-prefixes=CHECK,RV32
130 ms  x64 debian > LLVM.CodeGen/RISCV/rvv::fixed-vectors-int-shuffles.ll
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llc -mtriple=riscv32 -mattr=+experimental-v -riscv-v-vector-bits-min=128 -verify-machineinstrs < /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll --check-prefixes=CHECK,RV32
520 ms  x64 debian > LLVM.CodeGen/RISCV/rvv::fixed-vectors-int.ll
Script: -- : 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llc -mtriple=riscv32 -mattr=+experimental-v -riscv-v-vector-bits-min=128 -riscv-v-fixed-length-vector-lmul-max=2 -verify-machineinstrs < /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll --check-prefixes=CHECK,RV32,LMULMAX2,LMULMAX2-RV32
View Full Test Results (194 Failed)

Event Timeline

HsiangKai created this revision. Jun 2 2021, 2:37 AM
HsiangKai requested review of this revision. Jun 2 2021, 2:37 AM
Herald added a project: Restricted Project. Jun 2 2021, 2:38 AM
craig.topper added inline comments. Jun 2 2021, 12:15 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
168

What if the defining instruction is something that doesn't obey vsetvli? For example, an earlier COPY that became a whole register move, or a whole register load, which could be a spill reload or just a VLMAX load.

Or a reduction which only writes element 0 regardless of the tail policy. I think we currently force tail agnostic on the vsetvli for those, but I might update it to keep whatever tail policy the previous instruction used.

185

If the SEW and LMul are equal, isn't the vsetvli insertion pass usually going to use x0, x0 for VSetVLIForCopy?

205

Is this equivalent to "return LMul == SetLMul;"?

210

What about this

vsetvli
v0 = vectorop
v1 = vleff
copy v2, v0

The vleff may change VL, but if you replace the copy you need the VL from before the vsetvli. So I think you need to stop if you encounter a vleff.

379

Is this supposed to be in this patch? It's not mentioned in the description.

HsiangKai updated this revision to Diff 349776. Jun 4 2021, 12:33 AM

Consider vleff and whole register load/store.

HsiangKai marked 4 inline comments as done. Jun 4 2021, 12:35 AM
HsiangKai retitled this revision from "[RISCV] Use vmv.v.v if we know COPY is under the same vl and vtype." to "[RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype."
HsiangKai edited the summary of this revision.
HsiangKai updated this revision to Diff 349804. Jun 4 2021, 3:40 AM

Consider vsetvli x0, x0, vtype.

HsiangKai edited the summary of this revision. Jun 4 2021, 3:47 AM

My browser's really chugging on this huge patch so my input has to be brief. Maybe we could hide the test changes for now?

Regarding vmv.v.i, is there not already some support for rematerialization (e.g. of immediates)? Or is it non-trivial due to having to manage VSETVLIs?

craig.topper added inline comments. Jun 4 2021, 6:49 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
193

Can we use MBBI->modifiesRegister(RISCV::VL)?

384–396

I think we need to add the VL and VTYPE implicit uses?

craig.topper added inline comments. Jun 4 2021, 9:00 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
168

Looks like I was wrong about reductions. The behavior changed after 0.9 and our implementation is incorrectly forcing tail agnostic for those.

HsiangKai updated this revision to Diff 350027. Jun 4 2021, 10:29 PM
  • Remove most of the test updates.
  • Add implicit-use operands for vmv.v.v or vmv.v.i.

There seems to be no support for rematerialization of vmv.v.i. Do you mean setting isReMaterializable = 1 and isAsCheapAsAMove = 1 on the vmv.v.i pseudo instructions?

It still won’t work, because of the implicit VL and VTYPE operands added by vsetvli insertion. If we moved vsetvli insertion to after coalescing, rematerialization would work when the AVL is a 5-bit immediate. An AVL held in a register would still disable rematerialization.
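
The AVL condition in the comment above can be written down as a small predicate. This is only a sketch, under the assumption that vsetvli insertion has been moved after coalescing; AVL, RegAVL, ImmAVL and avlAllowsRemat are hypothetical names, not LLVM APIs. The 0..31 range is that of the 5-bit unsigned immediate encoded by vsetivli.

```cpp
#include <cstdint>
#include <variant>

// Hypothetical model of an AVL operand: either a register or an immediate.
struct RegAVL { unsigned reg; };
struct ImmAVL { std::int64_t value; };
using AVL = std::variant<RegAVL, ImmAVL>;

// vsetivli encodes the AVL as a 5-bit unsigned immediate, i.e. 0..31.
constexpr bool fitsUImm5(std::int64_t v) { return v >= 0 && v < 32; }

// Assumption: rematerialization is only considered when the AVL does not
// depend on a register value, so a register AVL always answers false.
inline bool avlAllowsRemat(const AVL &avl) {
  if (const auto *imm = std::get_if<ImmAVL>(&avl))
    return fitsUImm5(imm->value);
  return false;
}
```

In this model an immediate AVL of 31 passes, while an immediate of 32 or any register AVL does not.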

HsiangKai updated this revision to Diff 351041. Jun 9 2021, 7:01 PM
  • Add stricter constraints for the conversion.
  • Add comments.
HsiangKai updated this revision to Diff 351047. Jun 9 2021, 7:45 PM

The producing instruction may have more implicit operands at its end, for example:

$v25 = PseudoVNCLIP_WV_M1 killed renamable $v8m2, killed renamable $v10, $noreg, 3, implicit-def dead $vxsat, implicit $vxrm, implicit $vl, implicit $vtype

HsiangKai updated this revision to Diff 351339. Jun 10 2021, 8:29 PM

Add the VL and VTYPE implicit uses directly.

MIB.addReg(RISCV::VL, RegState::Implicit);
MIB.addReg(RISCV::VTYPE, RegState::Implicit);
HsiangKai updated this revision to Diff 356086. Jul 1 2021, 7:30 PM

Update to the latest downstream version.