This is an archive of the discontinued LLVM Phabricator instance.

Differential D124961

[riscv] Use X0 for destination of VSETVLI instruction if result unused
ClosedPublic

Authored by reames on May 4 2022, 1:45 PM.

Details

Reviewers

craig.topper
khchen
Chenbing.Zheng
jacquesguan

Commits

rG042a7a5f0da8: [riscv] Use X0 for destination of VSETVLI instruction if result unused

Summary

If the GPR destination register of a VSETVLI instruction is unused, we can replace it with X0. This discards the result, and thus reduces register pressure.

Once the core insertion/lowering algorithm has run, essentially all user-written VSETVLIs have an unused GPR result (VL/VTYPE is now read explicitly instead), so this kicks in for most tests that involve a vsetvli intrinsic. When inserting VSETVLIs to lower pseudos, we prefer the X0 form anyway.
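As a rough illustration of the transform described in the summary, the sketch below models it outside of LLVM. The `Inst` struct, the integer register numbering, and the helper names are assumptions made for this example only; the real patch works on `MachineInstr` operands and `MachineRegisterInfo`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for machine instructions; these are NOT LLVM's classes.
constexpr int X0 = 0; // register number of the hard-wired zero register

struct Inst {
  std::string Opcode;    // e.g. "PseudoVSETVLI", "SUB"
  int Def;               // register written by the instruction
  std::vector<int> Uses; // registers read by the instruction
};

// Returns true if no instruction in Prog reads Reg.
bool isUnused(const std::vector<Inst> &Prog, int Reg) {
  for (const Inst &I : Prog)
    for (int U : I.Uses)
      if (U == Reg)
        return false;
  return true;
}

// Redirects the result of every vsetvli/vsetivli whose destination is never
// read to X0. Returns the number of instructions rewritten.
int rewriteDeadVsetvliDefs(std::vector<Inst> &Prog) {
  int Changed = 0;
  for (Inst &I : Prog) {
    if (I.Opcode != "PseudoVSETVLI" && I.Opcode != "PseudoVSETIVLI")
      continue;
    if (I.Def != X0 && isUnused(Prog, I.Def)) {
      I.Def = X0; // discard the result; no allocatable register is needed
      ++Changed;
    }
  }
  return Changed;
}
```

In this model a vsetvli whose result feeds a later scalar instruction keeps its destination, while one whose result is never read is redirected to `X0`.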

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. May 4 2022, 1:45 PM

Herald added a project: Restricted Project. May 4 2022, 1:45 PM

Herald added subscribers: sunshaoce, VincentWu, luke957 and 31 others.

reames requested review of this revision. May 4 2022, 1:45 PM

Herald added subscribers: pcwang-thead, eopXD, MaskRay.

Once the core insertion/lowering algorithm has run, essentially all user-written VSETVLIs have an unused GPR result (VL/VTYPE is now read explicitly instead), so this kicks in for most tests that involve a vsetvli intrinsic. When inserting VSETVLIs to lower pseudos, we prefer the X0 form anyway.

Almost any real scenario that processes more data than fits in a register will need the GPR output to update memory pointers and to control a loop. Which I guess means we're lacking realistic tests.

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1229	This is valid for VSETIVLI too.

In D124961#3492285, @craig.topper wrote:

Almost any real scenario that processes more data than fits in a register will need the GPR output to update memory pointers and to control a loop. Which I guess means we're lacking realistic tests.

How so? An idiomatic tail folded loop doesn't need to explicitly read VL. It's passed via VL reg itself to all the vector ops. It changes across iterations, but that doesn't mean the GPR is used.

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1229	True. Will update.

In D124961#3492309, @reames wrote:

How so? An idiomatic tail folded loop doesn't need to explicitly read VL. It's passed via VL reg itself to all the vector ops. It changes across iterations, but that doesn't mean the GPR is used.

Don't you need to know how many elements were processed to advance the pointer for the next iteration? That is done in scalar registers. You also need to update a remaining element count to be passed to the vsetvli for the next iteration. Assuming we're talking about a loop like this example from the spec:

A.1. Vector-vector add example
    # vector-vector add routine of 32-bit integers
    # void vvaddint32(size_t n, const int*x, const int*y, int*z)
    # { for (size_t i=0; i<n; i++) { z[i]=x[i]+y[i]; } }
    #
    # a0 = n, a1 = x, a2 = y, a3 = z
    # Non-vector instructions are indented
vvaddint32:
    vsetvli t0, a0, e32, ta, ma  # Set vector length based on 32-bit vectors
    vle32.v v0, (a1)             # Load first vector
      sub a0, a0, t0             # Decrement remaining element count
      slli t0, t0, 2             # Convert element count to bytes (x4)
      add a1, a1, t0             # Bump x pointer
    vle32.v v1, (a2)             # Load second vector
      add a2, a2, t0             # Bump y pointer
    vadd.vv v2, v0, v1           # Sum vectors
    vse32.v v2, (a3)             # Store result
      add a3, a3, t0             # Bump z pointer
      bnez a0, vvaddint32        # Loop while elements remain
      ret
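To make the dependence on the GPR result concrete, here is a scalar C++ model of the same strip-mined loop. It is only a sketch: `VLMAX`, the `vsetvli` helper, and the returned iteration count are assumptions introduced for illustration, and `std::min` is just one policy for the granted vector length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical hardware limit: number of 32-bit elements per vector register.
constexpr std::size_t VLMAX = 4;

// Models the GPR result of `vsetvli rd, rs1, e32, ...` for a requested AVL.
std::size_t vsetvli(std::size_t AVL) { return std::min(AVL, VLMAX); }

// z[i] = x[i] + y[i], strip-mined like the assembly loop.
// Returns how many times the modeled vsetvli executed.
std::size_t vvaddint32(std::size_t N, const int *X, const int *Y, int *Z) {
  std::size_t Iterations = 0;
  while (N != 0) {
    std::size_t T0 = vsetvli(N);         // t0: granted VL, read back in a GPR
    for (std::size_t I = 0; I < T0; ++I) // vle32.v / vle32.v / vadd.vv / vse32.v
      Z[I] = X[I] + Y[I];
    N -= T0;                             // sub a0, a0, t0
    X += T0;                             // pointer bumps use t0 as well
    Y += T0;
    Z += T0;
    ++Iterations;
  }
  return Iterations;
}
```

Every scalar update in the loop body reads `T0`, so in a loop of this shape the vsetvli destination is live and the X0 rewrite does not apply. Unlike the assembly, which tests the count at the bottom of the loop, this model tests it at the top.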

In D124961#3492328, @craig.topper wrote:

Don't you need to know how many elements were processed to advance the pointer for the next iteration? That is done in scalar registers. You also need to update a remaining element count to be passed to the vsetvli for the next iteration. Assuming we're talking about a loop like this example from the spec

You are of course correct.

The actual example I was looking at was fixed-length vectorization. Because you know the VL in advance, you don't need to read it back.

So we have realistic tests for fixed-length loop vectorization at least. :)
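For contrast, a fixed-length loop can be modeled as below. This is a sketch under stated assumptions: `FixedVL` is a hypothetical compile-time vector width, the hardware is assumed to grant that many elements, and the trip count is assumed to be a multiple of it so no remainder loop is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed vector width in elements, known at compile time.
constexpr std::size_t FixedVL = 4;

// z[i] = x[i] + y[i] with a constant vector length.
void vvaddFixed(std::size_t N, const int *X, const int *Y, int *Z) {
  assert(N % FixedVL == 0 && "sketch assumes no remainder loop");
  for (std::size_t I = 0; I < N; I += FixedVL) {
    // Models a vsetivli with a constant AVL: the loop never reads the granted
    // VL back, because the induction step is the constant FixedVL.
    for (std::size_t J = 0; J < FixedVL; ++J)
      Z[I + J] = X[I + J] + Y[I + J];
  }
}
```

Nothing in the scalar bookkeeping depends on the value the vsetvli would return, which is the situation where writing the result to X0 costs nothing.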

Add VSETIVLI

craig.topper added inline comments. May 4 2022, 2:54 PM

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1232	This should maybe be `use_nodbg_empty`.
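The difference between the two queries can be sketched with a toy model. The `Use` struct and both helpers are assumptions for illustration; they only mimic the intent of `MachineRegisterInfo::use_empty` and `MachineRegisterInfo::use_nodbg_empty`. Presumably the concern is that a register read only by debug-info instructions should still count as dead, so that generated code does not change when debug info is enabled.

```cpp
#include <cassert>
#include <vector>

// Toy model of a register use; NOT LLVM's MachineOperand.
struct Use {
  int Reg;      // register being read
  bool IsDebug; // true if the reader is a debug-info instruction
};

// Analogue of use_empty: true only if nothing at all reads Reg.
bool useEmpty(const std::vector<Use> &Uses, int Reg) {
  for (const Use &U : Uses)
    if (U.Reg == Reg)
      return false;
  return true;
}

// Analogue of use_nodbg_empty: uses from debug instructions are ignored.
bool useNoDbgEmpty(const std::vector<Use> &Uses, int Reg) {
  for (const Use &U : Uses)
    if (U.Reg == Reg && !U.IsDebug)
      return false;
  return true;
}
```

With a register that has only a debug use, `useEmpty` reports a live value while `useNoDbgEmpty` reports it dead, so only the latter makes the same decision with and without debug info.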

reames updated this revision to Diff 427154. May 4 2022, 3:10 PM

LGTM

This revision is now accepted and ready to land. May 4 2022, 3:45 PM

Harbormaster completed remote builds in B162806: Diff 427154. May 4 2022, 7:13 PM

frasercrmck added inline comments. May 5 2022, 12:55 AM

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1224	I know we have `PseudoVSETVLIX0`, but is it really an invariant that `PseudoVSETVLI` never has `X0, X0` at this stage? Should we maybe add an assert to be a bit more sure? If for whatever reason we do have `X0, X0` this would (subtly) generate some wrong code.
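The reason an `X0, X0` operand pair would be a problem follows from how the vector specification encodes the AVL for `vsetvli`: the meaning depends on whether `rd` and `rs1` are `x0`. The sketch below models that rule; the enum and function names are assumptions for illustration and do not come from the patch.

```cpp
#include <cassert>

constexpr int X0 = 0; // register number of the hard-wired zero register

// The three AVL behaviours of `vsetvli rd, rs1, vtype`.
enum class AVLForm {
  Normal, // rs1 != x0: AVL is the value in rs1
  VLMax,  // rs1 == x0, rd != x0: VL is set to VLMAX
  KeepVL, // rs1 == x0, rd == x0: existing VL is kept, only vtype changes
};

AVLForm classify(int Rd, int Rs1) {
  if (Rs1 != X0)
    return AVLForm::Normal;
  return Rd != X0 ? AVLForm::VLMax : AVLForm::KeepVL;
}

// True if replacing rd with x0 leaves the instruction's AVL behaviour intact.
bool rdToX0PreservesMeaning(int Rd, int Rs1) {
  return classify(Rd, Rs1) == classify(X0, Rs1);
}
```

Rewriting `rd` is harmless when `rs1` is a real register, but it would turn a "set VL to VLMAX" instruction into a "keep VL" one when `rs1` is `x0`.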

I think it would be good to have a pre-commit test demonstrating that realistic VLS cases benefit from this improvement, or at least to mention in the commit message that the realistic tests come from VLS vectorization.

This revision was landed with ongoing or failed builds. May 5 2022, 7:40 AM

Closed by commit rG042a7a5f0da8: [riscv] Use X0 for destination of VSETVLI instruction if result unused (authored by reames).

This revision was automatically updated to reflect the committed changes.

reames added a commit: rG042a7a5f0da8: [riscv] Use X0 for destination of VSETVLI instruction if result unused.

In D124961#3493925, @khchen wrote:

I think it would be good to have a pre-commit test demonstrating that realistic VLS cases benefit from this improvement, or at least to mention in the commit message that the realistic tests come from VLS vectorization.

What does VLS stand for?

Assuming you mean fixed length vectorization, does the comment as submitted satisfy you?

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1224	At this stage, all the operands are vregs. We have to rely on the register constraints on the instructions to make the register allocator do the right thing here. The more generic version of this transform (not applied to just VSETVLI) would have to use the register class constraints on the def, instead of using knowledge of the pseudo to assume X0 is valid as this one does.
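A generalized form of the transform might look roughly like the sketch below, which consults a register class constraint on the def rather than relying on knowledge of one opcode. The `RegClass` enum and both functions are hypothetical; only the class names `GPR` and `GPRNoX0` echo the classes visible in the MIR tests of this revision.

```cpp
#include <cassert>

constexpr int X0 = 0; // register number of the hard-wired zero register

// Simplified register classes: GPR includes X0, GPRNoX0 excludes it.
enum class RegClass { GPR, GPRNoX0 };

// True if a def constrained to class C may legally be assigned X0.
bool classContainsX0(RegClass C) { return C == RegClass::GPR; }

// Returns the register a dead def should be rewritten to. DefClass is the
// constraint the instruction places on its def operand (an assumption of this
// model); HasNonDebugUses says whether any non-debug instruction reads Def.
int redirectDeadDef(int Def, RegClass DefClass, bool HasNonDebugUses) {
  if (Def != X0 && !HasNonDebugUses && classContainsX0(DefClass))
    return X0;
  return Def;
}
```

The check on the class is what the VSETVLI-specific patch can skip, since it assumes the pseudo accepts X0 as a destination.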

dtcxzyw mentioned this in D158759: [RISCV] Add a pass to rewrite rd to x0 for non-computational instrs whose return values are unused. Sep 8 2023, 11:17 PM

Revision Contents

Path                                                      Size
llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp              16 lines
llvm/test/CodeGen/RISCV/rvv/rv32-vsetvli-intrinsics.ll    8 lines
llvm/test/CodeGen/RISCV/rvv/rv64-vsetvli-intrinsics.ll    10 lines
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll     10 lines
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir    2 lines
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll             14 lines
llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir            2 lines

Diff 427318

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp

Show First 20 Lines • Show All 1,212 Lines • ▼ Show 20 Lines	if (HaveVectorOp) {
}		}

// Phase 3 - add any vsetvli instructions needed in the block. Use the		// Phase 3 - add any vsetvli instructions needed in the block. Use the
// Phase 2 information to avoid adding vsetvlis before the first vector		// Phase 2 information to avoid adding vsetvlis before the first vector
// instruction in the block if the VL/VTYPE is satisfied by its		// instruction in the block if the VL/VTYPE is satisfied by its
// predecessors.		// predecessors.
for (MachineBasicBlock &MBB : MF)		for (MachineBasicBlock &MBB : MF)
emitVSETVLIs(MBB);		emitVSETVLIs(MBB);

		// Once we're fully done rewriting all the instructions, do a final pass
		// through to check for VSETVLIs which write to an unused destination.
		// For the non X0, X0 variant, we can replace the destination register
		// with X0 to reduce register pressure. This is really a generic
		// optimization which can be applied to any dead def (TODO: generalize).
		for (MachineBasicBlock &MBB : MF) {
		  for (MachineInstr &MI : MBB) {
		    if (MI.getOpcode() == RISCV::PseudoVSETVLI ||
		        MI.getOpcode() == RISCV::PseudoVSETIVLI) {
		      Register VRegDef = MI.getOperand(0).getReg();
		      if (VRegDef != RISCV::X0 && MRI->use_nodbg_empty(VRegDef))
		        MI.getOperand(0).setReg(RISCV::X0);
		    }
		  }
		}
}		}

BlockInfo.clear();		BlockInfo.clear();

return HaveVectorOp;		return HaveVectorOp;
}		}

/// Returns an instance of the Insert VSETVLI pass.		/// Returns an instance of the Insert VSETVLI pass.
FunctionPass *llvm::createRISCVInsertVSETVLIPass() {		FunctionPass *llvm::createRISCVInsertVSETVLIPass() {
return new RISCVInsertVSETVLI();		return new RISCVInsertVSETVLI();
}		}

llvm/test/CodeGen/RISCV/rvv/rv32-vsetvli-intrinsics.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv32 -mattr=+v -verify-machineinstrs < %s | FileCheck %s			; RUN: llc -mtriple=riscv32 -mattr=+v -verify-machineinstrs < %s | FileCheck %s

	declare i32 @llvm.riscv.vsetvli.i32(i32, i32, i32)			declare i32 @llvm.riscv.vsetvli.i32(i32, i32, i32)
	declare i32 @llvm.riscv.vsetvlimax.i32(i32, i32)			declare i32 @llvm.riscv.vsetvlimax.i32(i32, i32)
	declare i32 @llvm.riscv.vsetvli.opt.i32(i32, i32, i32)			declare i32 @llvm.riscv.vsetvli.opt.i32(i32, i32, i32)
	declare i32 @llvm.riscv.vsetvlimax.opt.i32(i32, i32)			declare i32 @llvm.riscv.vsetvlimax.opt.i32(i32, i32)

	define void @test_vsetvli_e64mf8(i32 %avl) nounwind {			define void @test_vsetvli_e64mf8(i32 %avl) nounwind {
	; CHECK-LABEL: test_vsetvli_e64mf8:			; CHECK-LABEL: test_vsetvli_e64mf8:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a0, a0, e64, mf8, ta, mu			; CHECK-NEXT: vsetvli zero, a0, e64, mf8, ta, mu
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	call i32 @llvm.riscv.vsetvli.i32(i32 %avl, i32 3, i32 5)			call i32 @llvm.riscv.vsetvli.i32(i32 %avl, i32 3, i32 5)
	ret void			ret void
	}			}

	define void @test_vsetvli_e8mf2_zero_avl() nounwind {			define void @test_vsetvli_e8mf2_zero_avl() nounwind {
	; CHECK-LABEL: test_vsetvli_e8mf2_zero_avl:			; CHECK-LABEL: test_vsetvli_e8mf2_zero_avl:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetivli a0, 0, e8, mf2, ta, mu			; CHECK-NEXT: vsetivli zero, 0, e8, mf2, ta, mu
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	call i32 @llvm.riscv.vsetvli.i32(i32 0, i32 0, i32 7)			call i32 @llvm.riscv.vsetvli.i32(i32 0, i32 0, i32 7)
	ret void			ret void
	}			}

	define void @test_vsetvlimax_e64m8() nounwind {			define void @test_vsetvlimax_e64m8() nounwind {
	; CHECK-LABEL: test_vsetvlimax_e64m8:			; CHECK-LABEL: test_vsetvlimax_e64m8:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	▲ Show 20 Lines • Show All 66 Lines • ▼ Show 20 Lines
	}			}

	declare <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i32(<vscale x 4 x i32>, <vscale x 4 x i32>*, i32)			declare <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i32(<vscale x 4 x i32>, <vscale x 4 x i32>*, i32)

	; Check that we remove the redundant vsetvli when followed by another operation			; Check that we remove the redundant vsetvli when followed by another operation
	define <vscale x 4 x i32> @redundant_vsetvli(i32 %avl, <vscale x 4 x i32>* %ptr) nounwind {			define <vscale x 4 x i32> @redundant_vsetvli(i32 %avl, <vscale x 4 x i32>* %ptr) nounwind {
	; CHECK-LABEL: redundant_vsetvli:			; CHECK-LABEL: redundant_vsetvli:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu			; CHECK-NEXT: vsetvli zero, a0, e32, m2, ta, mu
	; CHECK-NEXT: vle32.v v8, (a1)			; CHECK-NEXT: vle32.v v8, (a1)
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%vl = call i32 @llvm.riscv.vsetvli.i32(i32 %avl, i32 2, i32 1)			%vl = call i32 @llvm.riscv.vsetvli.i32(i32 %avl, i32 2, i32 1)
	%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i32(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i32 %vl)			%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i32(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i32 %vl)
	ret <vscale x 4 x i32> %x			ret <vscale x 4 x i32> %x
	}			}

	; Check that we remove the repeated/redundant vsetvli when followed by another			; Check that we remove the repeated/redundant vsetvli when followed by another
	; operation			; operation
	; FIXME: We don't catch the second vsetvli because it has a use of its output.			; FIXME: We don't catch the second vsetvli because it has a use of its output.
	; We could replace it with the output of the first vsetvli.			; We could replace it with the output of the first vsetvli.
	define <vscale x 4 x i32> @repeated_vsetvli(i32 %avl, <vscale x 4 x i32>* %ptr) nounwind {			define <vscale x 4 x i32> @repeated_vsetvli(i32 %avl, <vscale x 4 x i32>* %ptr) nounwind {
	; CHECK-LABEL: repeated_vsetvli:			; CHECK-LABEL: repeated_vsetvli:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu			; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu
	; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu			; CHECK-NEXT: vsetvli zero, a0, e32, m2, ta, mu
	; CHECK-NEXT: vle32.v v8, (a1)			; CHECK-NEXT: vle32.v v8, (a1)
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%vl0 = call i32 @llvm.riscv.vsetvli.i32(i32 %avl, i32 2, i32 1)			%vl0 = call i32 @llvm.riscv.vsetvli.i32(i32 %avl, i32 2, i32 1)
	%vl1 = call i32 @llvm.riscv.vsetvli.i32(i32 %vl0, i32 2, i32 1)			%vl1 = call i32 @llvm.riscv.vsetvli.i32(i32 %vl0, i32 2, i32 1)
	%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i32(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i32 %vl1)			%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i32(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i32 %vl1)
	ret <vscale x 4 x i32> %x			ret <vscale x 4 x i32> %x
	}			}

llvm/test/CodeGen/RISCV/rvv/rv64-vsetvli-intrinsics.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=riscv64 -mattr=+v -verify-machineinstrs < %s | FileCheck %s			; RUN: llc -mtriple=riscv64 -mattr=+v -verify-machineinstrs < %s | FileCheck %s

	declare i64 @llvm.riscv.vsetvli.i64(i64, i64, i64)			declare i64 @llvm.riscv.vsetvli.i64(i64, i64, i64)
	declare i64 @llvm.riscv.vsetvlimax.i64(i64, i64)			declare i64 @llvm.riscv.vsetvlimax.i64(i64, i64)
	declare i64 @llvm.riscv.vsetvli.opt.i64(i64, i64, i64)			declare i64 @llvm.riscv.vsetvli.opt.i64(i64, i64, i64)
	declare i64 @llvm.riscv.vsetvlimax.opt.i64(i64, i64)			declare i64 @llvm.riscv.vsetvlimax.opt.i64(i64, i64)

	define void @test_vsetvli_e8m1(i64 %avl) nounwind {			define void @test_vsetvli_e8m1(i64 %avl) nounwind {
	; CHECK-LABEL: test_vsetvli_e8m1:			; CHECK-LABEL: test_vsetvli_e8m1:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a0, a0, e8, m1, ta, mu			; CHECK-NEXT: vsetvli zero, a0, e8, m1, ta, mu
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 0, i64 0)			call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 0, i64 0)
	ret void			ret void
	}			}

	define void @test_vsetvli_e16mf4(i64 %avl) nounwind {			define void @test_vsetvli_e16mf4(i64 %avl) nounwind {
	; CHECK-LABEL: test_vsetvli_e16mf4:			; CHECK-LABEL: test_vsetvli_e16mf4:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a0, a0, e16, mf4, ta, mu			; CHECK-NEXT: vsetvli zero, a0, e16, mf4, ta, mu
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 1, i64 6)			call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 1, i64 6)
	ret void			ret void
	}			}

	define void @test_vsetvli_e32mf8_zero_avl() nounwind {			define void @test_vsetvli_e32mf8_zero_avl() nounwind {
	; CHECK-LABEL: test_vsetvli_e32mf8_zero_avl:			; CHECK-LABEL: test_vsetvli_e32mf8_zero_avl:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetivli a0, 0, e16, mf4, ta, mu			; CHECK-NEXT: vsetivli zero, 0, e16, mf4, ta, mu
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	call i64 @llvm.riscv.vsetvli.i64(i64 0, i64 1, i64 6)			call i64 @llvm.riscv.vsetvli.i64(i64 0, i64 1, i64 6)
	ret void			ret void
	}			}

	define void @test_vsetvlimax_e32m2() nounwind {			define void @test_vsetvlimax_e32m2() nounwind {
	; CHECK-LABEL: test_vsetvlimax_e32m2:			; CHECK-LABEL: test_vsetvlimax_e32m2:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	▲ Show 20 Lines • Show All 75 Lines • ▼ Show 20 Lines
	}			}

	declare <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i64(<vscale x 4 x i32>, <vscale x 4 x i32>*, i64)			declare <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i64(<vscale x 4 x i32>, <vscale x 4 x i32>*, i64)

	; Check that we remove the redundant vsetvli when followed by another operation			; Check that we remove the redundant vsetvli when followed by another operation
	define <vscale x 4 x i32> @redundant_vsetvli(i64 %avl, <vscale x 4 x i32>* %ptr) nounwind {			define <vscale x 4 x i32> @redundant_vsetvli(i64 %avl, <vscale x 4 x i32>* %ptr) nounwind {
	; CHECK-LABEL: redundant_vsetvli:			; CHECK-LABEL: redundant_vsetvli:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu			; CHECK-NEXT: vsetvli zero, a0, e32, m2, ta, mu
	; CHECK-NEXT: vle32.v v8, (a1)			; CHECK-NEXT: vle32.v v8, (a1)
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%vl = call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 2, i64 1)			%vl = call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 2, i64 1)
	%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i64(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i64 %vl)			%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i64(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i64 %vl)
	ret <vscale x 4 x i32> %x			ret <vscale x 4 x i32> %x
	}			}

	; Check that we remove the repeated/redundant vsetvli when followed by another			; Check that we remove the repeated/redundant vsetvli when followed by another
	; operation			; operation
	; FIXME: We don't catch the second vsetvli because it has a use of its output.			; FIXME: We don't catch the second vsetvli because it has a use of its output.
	; We could replace it with the output of the first vsetvli.			; We could replace it with the output of the first vsetvli.
	define <vscale x 4 x i32> @repeated_vsetvli(i64 %avl, <vscale x 4 x i32>* %ptr) nounwind {			define <vscale x 4 x i32> @repeated_vsetvli(i64 %avl, <vscale x 4 x i32>* %ptr) nounwind {
	; CHECK-LABEL: repeated_vsetvli:			; CHECK-LABEL: repeated_vsetvli:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu			; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu
	; CHECK-NEXT: vsetvli a0, a0, e32, m2, ta, mu			; CHECK-NEXT: vsetvli zero, a0, e32, m2, ta, mu
	; CHECK-NEXT: vle32.v v8, (a1)			; CHECK-NEXT: vle32.v v8, (a1)
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%vl0 = call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 2, i64 1)			%vl0 = call i64 @llvm.riscv.vsetvli.i64(i64 %avl, i64 2, i64 1)
	%vl1 = call i64 @llvm.riscv.vsetvli.i64(i64 %vl0, i64 2, i64 1)			%vl1 = call i64 @llvm.riscv.vsetvli.i64(i64 %vl0, i64 2, i64 1)
	%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i64(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i64 %vl1)			%x = call <vscale x 4 x i32> @llvm.riscv.vle.nxv4i32.i64(<vscale x 4 x i32> undef, <vscale x 4 x i32>* %ptr, i64 %vl1)
	ret <vscale x 4 x i32> %x			ret <vscale x 4 x i32> %x
	}			}

llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll

Show All 17 Lines
declare <vscale x 2 x float> @llvm.riscv.vfmv.v.f.nxv2f32.f32( <vscale x 2 x float>, float, i64)		declare <vscale x 2 x float> @llvm.riscv.vfmv.v.f.nxv2f32.f32( <vscale x 2 x float>, float, i64)

declare void @llvm.riscv.vse.nxv1f64(<vscale x 1 x double>, <vscale x 1 x double>* nocapture, i64)		declare void @llvm.riscv.vse.nxv1f64(<vscale x 1 x double>, <vscale x 1 x double>* nocapture, i64)
declare void @llvm.riscv.vse.nxv2f32(<vscale x 2 x float>, <vscale x 2 x float>* nocapture, i64)		declare void @llvm.riscv.vse.nxv2f32(<vscale x 2 x float>, <vscale x 2 x float>* nocapture, i64)

define <vscale x 1 x double> @test1(i64 %avl, i8 zeroext %cond, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {		define <vscale x 1 x double> @test1(i64 %avl, i8 zeroext %cond, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {
; CHECK-LABEL: test1:		; CHECK-LABEL: test1:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: beqz a1, .LBB0_2		; CHECK-NEXT: beqz a1, .LBB0_2
; CHECK-NEXT: # %bb.1: # %if.then		; CHECK-NEXT: # %bb.1: # %if.then
; CHECK-NEXT: vfadd.vv v8, v8, v9		; CHECK-NEXT: vfadd.vv v8, v8, v9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
; CHECK-NEXT: .LBB0_2: # %if.else		; CHECK-NEXT: .LBB0_2: # %if.else
; CHECK-NEXT: vfsub.vv v8, v8, v9		; CHECK-NEXT: vfsub.vv v8, v8, v9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
Show All 14 Lines	if.end: ; preds = %if.else, %if.then
ret <vscale x 1 x double> %c.0		ret <vscale x 1 x double> %c.0
}		}

@scratch = global i8 0, align 16		@scratch = global i8 0, align 16

define <vscale x 1 x double> @test2(i64 %avl, i8 zeroext %cond, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {		define <vscale x 1 x double> @test2(i64 %avl, i8 zeroext %cond, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {
; CHECK-LABEL: test2:		; CHECK-LABEL: test2:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: beqz a1, .LBB1_2		; CHECK-NEXT: beqz a1, .LBB1_2
; CHECK-NEXT: # %bb.1: # %if.then		; CHECK-NEXT: # %bb.1: # %if.then
; CHECK-NEXT: vfadd.vv v9, v8, v9		; CHECK-NEXT: vfadd.vv v9, v8, v9
; CHECK-NEXT: vfmul.vv v8, v9, v8		; CHECK-NEXT: vfmul.vv v8, v9, v8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
; CHECK-NEXT: .LBB1_2: # %if.else		; CHECK-NEXT: .LBB1_2: # %if.else
; CHECK-NEXT: vfsub.vv v9, v8, v9		; CHECK-NEXT: vfsub.vv v9, v8, v9
; CHECK-NEXT: vfmul.vv v8, v9, v8		; CHECK-NEXT: vfmul.vv v8, v9, v8
▲ Show 20 Lines • Show All 109 Lines • ▼ Show 20 Lines	if.end: ; preds = %if.else, %if.then
%8 = tail call <vscale x 1 x double> @llvm.riscv.vfmul.nxv1f64.nxv1f64(<vscale x 1 x double> undef, <vscale x 1 x double> %l, <vscale x 1 x double> %r, i64 %avl)		%8 = tail call <vscale x 1 x double> @llvm.riscv.vfmul.nxv1f64.nxv1f64(<vscale x 1 x double> undef, <vscale x 1 x double> %l, <vscale x 1 x double> %r, i64 %avl)
ret <vscale x 1 x double> %8		ret <vscale x 1 x double> %8
}		}

define <vscale x 1 x double> @test5(i64 %avl, i8 zeroext %cond, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {		define <vscale x 1 x double> @test5(i64 %avl, i8 zeroext %cond, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {
; CHECK-LABEL: test5:		; CHECK-LABEL: test5:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: andi a2, a1, 1		; CHECK-NEXT: andi a2, a1, 1
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: bnez a2, .LBB4_3		; CHECK-NEXT: bnez a2, .LBB4_3
; CHECK-NEXT: # %bb.1: # %if.else		; CHECK-NEXT: # %bb.1: # %if.else
; CHECK-NEXT: vfsub.vv v9, v8, v9		; CHECK-NEXT: vfsub.vv v9, v8, v9
; CHECK-NEXT: andi a0, a1, 2		; CHECK-NEXT: andi a0, a1, 2
; CHECK-NEXT: beqz a0, .LBB4_4		; CHECK-NEXT: beqz a0, .LBB4_4
; CHECK-NEXT: .LBB4_2: # %if.then4		; CHECK-NEXT: .LBB4_2: # %if.then4
; CHECK-NEXT: vfmul.vv v8, v9, v8		; CHECK-NEXT: vfmul.vv v8, v9, v8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines
; CHECK-NEXT: andi a3, a1, 1		; CHECK-NEXT: andi a3, a1, 1
; CHECK-NEXT: vsetvli a2, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli a2, a0, e64, m1, ta, mu
; CHECK-NEXT: bnez a3, .LBB5_3		; CHECK-NEXT: bnez a3, .LBB5_3
; CHECK-NEXT: # %bb.1: # %if.else		; CHECK-NEXT: # %bb.1: # %if.else
; CHECK-NEXT: vfsub.vv v8, v8, v9		; CHECK-NEXT: vfsub.vv v8, v8, v9
; CHECK-NEXT: andi a1, a1, 2		; CHECK-NEXT: andi a1, a1, 2
; CHECK-NEXT: beqz a1, .LBB5_4		; CHECK-NEXT: beqz a1, .LBB5_4
; CHECK-NEXT: .LBB5_2: # %if.then4		; CHECK-NEXT: .LBB5_2: # %if.then4
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: lui a0, %hi(.LCPI5_0)		; CHECK-NEXT: lui a0, %hi(.LCPI5_0)
; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_0)		; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_0)
; CHECK-NEXT: vlse64.v v9, (a0), zero		; CHECK-NEXT: vlse64.v v9, (a0), zero
; CHECK-NEXT: lui a0, %hi(.LCPI5_1)		; CHECK-NEXT: lui a0, %hi(.LCPI5_1)
; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_1)		; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_1)
; CHECK-NEXT: vlse64.v v10, (a0), zero		; CHECK-NEXT: vlse64.v v10, (a0), zero
; CHECK-NEXT: vfadd.vv v9, v9, v10		; CHECK-NEXT: vfadd.vv v9, v9, v10
; CHECK-NEXT: lui a0, %hi(scratch)		; CHECK-NEXT: lui a0, %hi(scratch)
; CHECK-NEXT: addi a0, a0, %lo(scratch)		; CHECK-NEXT: addi a0, a0, %lo(scratch)
; CHECK-NEXT: vse64.v v9, (a0)		; CHECK-NEXT: vse64.v v9, (a0)
; CHECK-NEXT: j .LBB5_5		; CHECK-NEXT: j .LBB5_5
; CHECK-NEXT: .LBB5_3: # %if.then		; CHECK-NEXT: .LBB5_3: # %if.then
; CHECK-NEXT: vfadd.vv v8, v8, v9		; CHECK-NEXT: vfadd.vv v8, v8, v9
; CHECK-NEXT: andi a1, a1, 2		; CHECK-NEXT: andi a1, a1, 2
; CHECK-NEXT: bnez a1, .LBB5_2		; CHECK-NEXT: bnez a1, .LBB5_2
; CHECK-NEXT: .LBB5_4: # %if.else5		; CHECK-NEXT: .LBB5_4: # %if.else5
; CHECK-NEXT: vsetvli a0, a0, e32, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e32, m1, ta, mu
; CHECK-NEXT: lui a0, %hi(.LCPI5_2)		; CHECK-NEXT: lui a0, %hi(.LCPI5_2)
; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_2)		; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_2)
; CHECK-NEXT: vlse32.v v9, (a0), zero		; CHECK-NEXT: vlse32.v v9, (a0), zero
; CHECK-NEXT: lui a0, %hi(.LCPI5_3)		; CHECK-NEXT: lui a0, %hi(.LCPI5_3)
; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_3)		; CHECK-NEXT: addi a0, a0, %lo(.LCPI5_3)
; CHECK-NEXT: vlse32.v v10, (a0), zero		; CHECK-NEXT: vlse32.v v10, (a0), zero
; CHECK-NEXT: vfadd.vv v9, v9, v10		; CHECK-NEXT: vfadd.vv v9, v9, v10
; CHECK-NEXT: lui a0, %hi(scratch)		; CHECK-NEXT: lui a0, %hi(scratch)
▲ Show 20 Lines • Show All 367 Lines • Show Last 20 Lines

llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir

Show First 20 Lines • Show All 422 Lines • ▼ Show 20 Lines	body: |
; CHECK: bb.0.entry:		; CHECK: bb.0.entry:
; CHECK-NEXT: successors: %bb.2(0x30000000), %bb.1(0x50000000)		; CHECK-NEXT: successors: %bb.2(0x30000000), %bb.1(0x50000000)
; CHECK-NEXT: liveins: $x10, $v8, $v9, $x11		; CHECK-NEXT: liveins: $x10, $v8, $v9, $x11
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gprnox0 = COPY $x11		; CHECK-NEXT: [[COPY:%[0-9]+]]:gprnox0 = COPY $x11
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vr = COPY $v9		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vr = COPY $v9
; CHECK-NEXT: [[COPY2:%[0-9]+]]:vr = COPY $v8		; CHECK-NEXT: [[COPY2:%[0-9]+]]:vr = COPY $v8
; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr = COPY $x10		; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr = COPY $x10
; CHECK-NEXT: [[PseudoVSETVLI:%[0-9]+]]:gprnox0 = PseudoVSETVLI [[COPY]], 88 /* e64, m1, ta, mu */, implicit-def $vl, implicit-def $vtype		; CHECK-NEXT: $x0 = PseudoVSETVLI [[COPY]], 88 /* e64, m1, ta, mu */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: [[COPY4:%[0-9]+]]:gpr = COPY $x0		; CHECK-NEXT: [[COPY4:%[0-9]+]]:gpr = COPY $x0
; CHECK-NEXT: BEQ [[COPY3]], [[COPY4]], %bb.2		; CHECK-NEXT: BEQ [[COPY3]], [[COPY4]], %bb.2
; CHECK-NEXT: PseudoBR %bb.1		; CHECK-NEXT: PseudoBR %bb.1
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1.if.then:		; CHECK-NEXT: bb.1.if.then:
; CHECK-NEXT: successors: %bb.3(0x80000000)		; CHECK-NEXT: successors: %bb.3(0x80000000)
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[PseudoVADD_VV_M1_:%[0-9]+]]:vr = PseudoVADD_VV_M1 [[COPY2]], [[COPY1]], $noreg, 6 /* e64 */, implicit $vl, implicit $vtype		; CHECK-NEXT: [[PseudoVADD_VV_M1_:%[0-9]+]]:vr = PseudoVADD_VV_M1 [[COPY2]], [[COPY1]], $noreg, 6 /* e64 */, implicit $vl, implicit $vtype
▲ Show 20 Lines • Show All 385 Lines • Show Last 20 Lines

llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll

Show All 12 Lines	declare <vscale x 1 x i64> @llvm.riscv.vle.mask.nxv1i64(
<vscale x 1 x i64>,		<vscale x 1 x i64>,
<vscale x 1 x i64>*,		<vscale x 1 x i64>*,
<vscale x 1 x i1>,		<vscale x 1 x i1>,
i64, i64)		i64, i64)

define <vscale x 1 x double> @test1(i64 %avl, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {		define <vscale x 1 x double> @test1(i64 %avl, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {
; CHECK-LABEL: test1:		; CHECK-LABEL: test1:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: vfadd.vv v8, v8, v9		; CHECK-NEXT: vfadd.vv v8, v8, v9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 2, i64 7)		%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 2, i64 7)
%1 = tail call <vscale x 1 x double> @llvm.riscv.vfadd.nxv1f64.nxv1f64(		%1 = tail call <vscale x 1 x double> @llvm.riscv.vfadd.nxv1f64.nxv1f64(
<vscale x 1 x double> undef,		<vscale x 1 x double> undef,
<vscale x 1 x double> %a,		<vscale x 1 x double> %a,
<vscale x 1 x double> %b,		<vscale x 1 x double> %b,
i64 %0)		i64 %0)
ret <vscale x 1 x double> %1		ret <vscale x 1 x double> %1
}		}

define <vscale x 1 x double> @test2(i64 %avl, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {		define <vscale x 1 x double> @test2(i64 %avl, <vscale x 1 x double> %a, <vscale x 1 x double> %b) nounwind {
; CHECK-LABEL: test2:		; CHECK-LABEL: test2:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: vfadd.vv v8, v8, v9		; CHECK-NEXT: vfadd.vv v8, v8, v9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 2, i64 7)		%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 2, i64 7)
%1 = tail call <vscale x 1 x double> @llvm.riscv.vfadd.nxv1f64.nxv1f64(		%1 = tail call <vscale x 1 x double> @llvm.riscv.vfadd.nxv1f64.nxv1f64(
<vscale x 1 x double> undef,		<vscale x 1 x double> undef,
<vscale x 1 x double> %a,		<vscale x 1 x double> %a,
<vscale x 1 x double> %b,		<vscale x 1 x double> %b,
i64 %avl)		i64 %avl)
ret <vscale x 1 x double> %1		ret <vscale x 1 x double> %1
}		}

define <vscale x 1 x i64> @test3(i64 %avl, <vscale x 1 x i64> %a, <vscale x 1 x i64>* %b, <vscale x 1 x i1> %c) nounwind {		define <vscale x 1 x i64> @test3(i64 %avl, <vscale x 1 x i64> %a, <vscale x 1 x i64>* %b, <vscale x 1 x i1> %c) nounwind {
; CHECK-LABEL: test3:		; CHECK-LABEL: test3:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: vle64.v v8, (a1), v0.t		; CHECK-NEXT: vle64.v v8, (a1), v0.t
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 3, i64 0)		%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 3, i64 0)
%1 = call <vscale x 1 x i64> @llvm.riscv.vle.mask.nxv1i64(		%1 = call <vscale x 1 x i64> @llvm.riscv.vle.mask.nxv1i64(
<vscale x 1 x i64> %a,		<vscale x 1 x i64> %a,
<vscale x 1 x i64>* %b,		<vscale x 1 x i64>* %b,
<vscale x 1 x i1> %c,		<vscale x 1 x i1> %c,
i64 %0, i64 1)		i64 %0, i64 1)

ret <vscale x 1 x i64> %1		ret <vscale x 1 x i64> %1
}		}

define <vscale x 1 x i64> @test4(i64 %avl, <vscale x 1 x i64> %a, <vscale x 1 x i64>* %b, <vscale x 1 x i1> %c) nounwind {		define <vscale x 1 x i64> @test4(i64 %avl, <vscale x 1 x i64> %a, <vscale x 1 x i64>* %b, <vscale x 1 x i1> %c) nounwind {
; CHECK-LABEL: test4:		; CHECK-LABEL: test4:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: vle64.v v8, (a1), v0.t		; CHECK-NEXT: vle64.v v8, (a1), v0.t
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 3, i64 0)		%0 = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 3, i64 0)
%1 = call <vscale x 1 x i64> @llvm.riscv.vle.mask.nxv1i64(		%1 = call <vscale x 1 x i64> @llvm.riscv.vle.mask.nxv1i64(
<vscale x 1 x i64> %a,		<vscale x 1 x i64> %a,
<vscale x 1 x i64>* %b,		<vscale x 1 x i64>* %b,
<vscale x 1 x i1> %c,		<vscale x 1 x i1> %c,
i64 %avl, i64 1)		i64 %avl, i64 1)

ret <vscale x 1 x i64> %1		ret <vscale x 1 x i64> %1
}		}

; Make sure we don't insert a vsetvli for the vmand instruction.		; Make sure we don't insert a vsetvli for the vmand instruction.
define <vscale x 1 x i1> @test5(<vscale x 1 x i64> %0, <vscale x 1 x i64> %1, <vscale x 1 x i1> %2, i64 %avl) nounwind {		define <vscale x 1 x i1> @test5(<vscale x 1 x i64> %0, <vscale x 1 x i64> %1, <vscale x 1 x i1> %2, i64 %avl) nounwind {
; CHECK-LABEL: test5:		; CHECK-LABEL: test5:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetvli a0, a0, e64, m1, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, mu
; CHECK-NEXT: vmseq.vv v8, v8, v9		; CHECK-NEXT: vmseq.vv v8, v8, v9
; CHECK-NEXT: vmand.mm v0, v8, v0		; CHECK-NEXT: vmand.mm v0, v8, v0
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
%vl = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 3, i64 0)		%vl = tail call i64 @llvm.riscv.vsetvli(i64 %avl, i64 3, i64 0)
%a = call <vscale x 1 x i1> @llvm.riscv.vmseq.nxv1i64.i64(<vscale x 1 x i64> %0, <vscale x 1 x i64> %1, i64 %vl)		%a = call <vscale x 1 x i1> @llvm.riscv.vmseq.nxv1i64.i64(<vscale x 1 x i64> %0, <vscale x 1 x i64> %1, i64 %vl)
%b = call <vscale x 1 x i1> @llvm.riscv.vmand.nxv1i1.i64(<vscale x 1 x i1> %a, <vscale x 1 x i1> %2, i64 %vl)		%b = call <vscale x 1 x i1> @llvm.riscv.vmand.nxv1i1.i64(<vscale x 1 x i1> %a, <vscale x 1 x i1> %2, i64 %vl)
ret <vscale x 1 x i1> %b		ret <vscale x 1 x i1> %b
[...]
%y = call <vscale x 1 x i64> @llvm.riscv.vmv.s.x.nxv1i64(		%y = call <vscale x 1 x i64> @llvm.riscv.vmv.s.x.nxv1i64(
i64 %b, i64 1)		i64 %b, i64 1)

ret <vscale x 1 x i64> %y		ret <vscale x 1 x i64> %y
}		}

define <vscale x 1 x i64> @test8(<vscale x 1 x i64> %a, i64 %b, <vscale x 1 x i1> %mask) nounwind {		define <vscale x 1 x i64> @test8(<vscale x 1 x i64> %a, i64 %b, <vscale x 1 x i1> %mask) nounwind {
; CHECK-LABEL: test8:		; CHECK-LABEL: test8:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetivli a1, 6, e64, m1, tu, mu		; CHECK-NEXT: vsetivli zero, 6, e64, m1, tu, mu
; CHECK-NEXT: vmv.s.x v8, a0		; CHECK-NEXT: vmv.s.x v8, a0
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
%x = tail call i64 @llvm.riscv.vsetvli(i64 6, i64 3, i64 0)		%x = tail call i64 @llvm.riscv.vsetvli(i64 6, i64 3, i64 0)
%y = call <vscale x 1 x i64> @llvm.riscv.vmv.s.x.nxv1i64(<vscale x 1 x i64> %a, i64 %b, i64 2)		%y = call <vscale x 1 x i64> @llvm.riscv.vmv.s.x.nxv1i64(<vscale x 1 x i64> %a, i64 %b, i64 2)
ret <vscale x 1 x i64> %y		ret <vscale x 1 x i64> %y
}		}

[...]
entry:		entry:
%y = call <vscale x 1 x double> @llvm.riscv.vfmv.s.f.nxv1f64(		%y = call <vscale x 1 x double> @llvm.riscv.vfmv.s.f.nxv1f64(
<vscale x 1 x double> %a, double %b, i64 1)		<vscale x 1 x double> %a, double %b, i64 1)
ret <vscale x 1 x double> %y		ret <vscale x 1 x double> %y
}		}

define <vscale x 1 x double> @test11(<vscale x 1 x double> %a, double %b) nounwind {		define <vscale x 1 x double> @test11(<vscale x 1 x double> %a, double %b) nounwind {
; CHECK-LABEL: test11:		; CHECK-LABEL: test11:
; CHECK: # %bb.0: # %entry		; CHECK: # %bb.0: # %entry
; CHECK-NEXT: vsetivli a0, 6, e64, m1, tu, mu		; CHECK-NEXT: vsetivli zero, 6, e64, m1, tu, mu
; CHECK-NEXT: vfmv.s.f v8, fa0		; CHECK-NEXT: vfmv.s.f v8, fa0
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
%x = tail call i64 @llvm.riscv.vsetvli(i64 6, i64 3, i64 0)		%x = tail call i64 @llvm.riscv.vsetvli(i64 6, i64 3, i64 0)
%y = call <vscale x 1 x double> @llvm.riscv.vfmv.s.f.nxv1f64(		%y = call <vscale x 1 x double> @llvm.riscv.vfmv.s.f.nxv1f64(
<vscale x 1 x double> %a, double %b, i64 2)		<vscale x 1 x double> %a, double %b, i64 2)
ret <vscale x 1 x double> %y		ret <vscale x 1 x double> %y
}		}
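
The CHECK-line updates above all follow one pattern: `vsetvli a0, a0, ...` becomes `vsetvli zero, a0, ...` (and `vsetivli a1, 6, ...` becomes `vsetivli zero, 6, ...`) once nothing reads the GPR result. A minimal sketch of that rewrite on a toy instruction list follows. `ToyInstr` and `useX0ForDeadVSETVLIResults` are hypothetical names invented for illustration; this is not the code from RISCVInsertVSETVLI.cpp and it does not use LLVM's MachineInstr API.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Toy stand-in for a machine instruction. NOT an LLVM type; it only
// models one destination register and the registers explicitly read.
struct ToyInstr {
  std::string Opcode;            // e.g. "PseudoVSETVLI"
  std::string Def;               // destination register, e.g. "%3" or "$x0"
  std::vector<std::string> Uses; // explicit register operands that are read
};

// For every vsetvli-like instruction whose destination is never read by
// any instruction in the block, rewrite the destination to "$x0".
// Returns the number of instructions changed.
inline int useX0ForDeadVSETVLIResults(std::vector<ToyInstr> &Block) {
  // Collect every register that appears as an explicit use.
  std::unordered_set<std::string> Read;
  for (const ToyInstr &I : Block)
    Read.insert(I.Uses.begin(), I.Uses.end());

  int Changed = 0;
  for (ToyInstr &I : Block) {
    const bool IsVSETVLI =
        I.Opcode == "PseudoVSETVLI" || I.Opcode == "PseudoVSETIVLI";
    if (IsVSETVLI && I.Def != "$x0" && Read.count(I.Def) == 0) {
      I.Def = "$x0"; // result is dead, so discard it
      ++Changed;
    }
  }
  return Changed;
}
```

In this model, a vector op that reads `vl` only implicitly does not list the vsetvli result among its uses, so the destination is treated as dead and rewritten. A vsetvli whose result still feeds an ordinary scalar instruction is left alone.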

llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir

[...]
bb.0.entry:		bb.0.entry:
liveins: $v8, $v9, $x10		liveins: $v8, $v9, $x10

; CHECK-LABEL: name: vsetvli_add		; CHECK-LABEL: name: vsetvli_add
; CHECK: liveins: $v8, $v9, $x10		; CHECK: liveins: $v8, $v9, $x10
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gprnox0 = COPY $x10		; CHECK-NEXT: [[COPY:%[0-9]+]]:gprnox0 = COPY $x10
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vr = COPY $v9		; CHECK-NEXT: [[COPY1:%[0-9]+]]:vr = COPY $v9
; CHECK-NEXT: [[COPY2:%[0-9]+]]:vr = COPY $v8		; CHECK-NEXT: [[COPY2:%[0-9]+]]:vr = COPY $v8
; CHECK-NEXT: [[PseudoVSETVLI:%[0-9]+]]:gprnox0 = PseudoVSETVLI [[COPY]], 88 /* e64, m1, ta, mu */, implicit-def $vl, implicit-def $vtype		; CHECK-NEXT: $x0 = PseudoVSETVLI [[COPY]], 88 /* e64, m1, ta, mu */, implicit-def $vl, implicit-def $vtype
; CHECK-NEXT: [[PseudoVADD_VV_M1_:%[0-9]+]]:vr = PseudoVADD_VV_M1 [[COPY2]], [[COPY1]], $noreg, 6 /* e64 */, implicit $vl, implicit $vtype		; CHECK-NEXT: [[PseudoVADD_VV_M1_:%[0-9]+]]:vr = PseudoVADD_VV_M1 [[COPY2]], [[COPY1]], $noreg, 6 /* e64 */, implicit $vl, implicit $vtype
; CHECK-NEXT: $v8 = COPY [[PseudoVADD_VV_M1_]]		; CHECK-NEXT: $v8 = COPY [[PseudoVADD_VV_M1_]]
; CHECK-NEXT: PseudoRET implicit $v8		; CHECK-NEXT: PseudoRET implicit $v8
%2:gprnox0 = COPY $x10		%2:gprnox0 = COPY $x10
%1:vr = COPY $v9		%1:vr = COPY $v9
%0:vr = COPY $v8		%0:vr = COPY $v8
%3:gprnox0 = PseudoVSETVLI %2, 88, implicit-def dead $vl, implicit-def dead $vtype		%3:gprnox0 = PseudoVSETVLI %2, 88, implicit-def dead $vl, implicit-def dead $vtype
%4:vr = PseudoVADD_VV_M1 %0, %1, killed %3, 6		%4:vr = PseudoVADD_VV_M1 %0, %1, killed %3, 6