This is an archive of the discontinued LLVM Phabricator instance.

Differential D118629

[RISCV] Add a test showing an incorrect VSETVLI insertion
Closed · Public

Authored by frasercrmck on Jan 31 2022, 10:03 AM.

Details

Reviewers: craig.topper

Commits: rGb00bce2a93b3: [RISCV] Add a test showing an incorrect VSETVLI insertion

Summary

This test shows a loop whose preheader contains a SEW=64, LMUL=1 vector
operation. The loop body starts with another SEW=64, LMUL=1 vector
operation (a VADD) before switching to a SEW=32, LMUL=1/2 vector store
instruction.

We can see that the VSETVLI insertion pass omits a VSETVLI before the
VADD (thinking it inherits its configuration from the preheader) but
does place a SEW=32, LMUL=1/2 VSETVLI before the store. This results in
a miscompilation: when the loop takes its backedge, the VADD executes
with the store's SEW=32, LMUL=1/2 configuration instead of the
SEW=64, LMUL=1 configuration it requires.

This appears to be a bad load/store optimization: when the vector store
is replaced with a SEW=32, LMUL=1/2 VADD, the pass does correctly insert
a VSETVLI. The issue therefore possibly arises from
canSkipVSETVLIForLoadStore.
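
The failure mode described in the summary can be illustrated with a toy model. This is only a sketch and not the LLVM pass: the instruction lists, the `run` helper, and the tuple representation of the vector configuration are all hypothetical. It tracks the vector configuration in effect as the preheader and loop body execute, and reports every instruction that runs under a configuration other than the one it needs.

```python
# Toy model of the emitted code for the loop in this test.
# NOT the LLVM pass; all names here are illustrative assumptions.

# Each entry: (mnemonic, configuration required, configuration set).
# A configuration is (SEW in bits, LMUL as text).
PREHEADER = [
    ("vsetvli", None, (64, "1")),
    ("vid.v", (64, "1"), None),
]
LOOP_BODY = [
    ("vadd.vx", (64, "1"), None),      # no vsetvli was inserted before this
    ("vsetvli", None, (32, "1/2")),    # inserted before the store
    ("vse32.v", (32, "1/2"), None),
]


def run(iterations):
    """Execute the preheader once and the loop body `iterations` times.

    Returns a list of (where, mnemonic, needed, actual) for every
    instruction that executed with the wrong configuration.
    """
    state = None
    mismatches = []

    def step(block, where):
        nonlocal state
        for name, needs, sets in block:
            if sets is not None:
                state = sets
            elif needs is not None and needs != state:
                mismatches.append((where, name, needs, state))

    step(PREHEADER, "preheader")
    for i in range(iterations):
        step(LOOP_BODY, f"iteration {i}")
    return mismatches


if __name__ == "__main__":
    for n in (1, 2):
        print(n, run(n))
```

With a single iteration nothing goes wrong, because the VADD really does inherit the preheader's configuration. The mismatch only appears once the backedge is taken, which is why the pass has to consider the state flowing in from every predecessor of the loop block, including the loop block itself.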

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

frasercrmck created this revision. Jan 31 2022, 10:03 AM

Herald added subscribers: VincentWu, luke957, achieveartificialintelligence and 25 others. Jan 31 2022, 10:03 AM

frasercrmck requested review of this revision. Jan 31 2022, 10:03 AM

Herald added a project: Restricted Project. Jan 31 2022, 10:03 AM

Herald added subscribers: llvm-commits, pcwang-thead, eopXD, MaskRay.

Sadly found this on the eve of LLVM 14's branch. I've run out of time to fix it today but I'll take a look tomorrow.

add a comment explaining the issue

Harbormaster completed remote builds in B146680: Diff 404589. Jan 31 2022, 12:19 PM

craig.topper mentioned this in D118667: [RISCV] Fix a vsetvli insertion bug involving loads/stores. Jan 31 2022, 4:20 PM

frasercrmck edited the summary of this revision. Feb 1 2022, 2:28 AM

improve test FIXME

This revision was not accepted when it landed; it landed in state Needs Review. Feb 1 2022, 2:31 AM

This revision was landed with ongoing or failed builds.

Closed by commit rGb00bce2a93b3: [RISCV] Add a test showing an incorrect VSETVLI insertion (authored by frasercrmck).

This revision was automatically updated to reflect the committed changes.

frasercrmck added a commit: rGb00bce2a93b3: [RISCV] Add a test showing an incorrect VSETVLI insertion.

I landed this test without it formally being accepted to reduce dependencies between myself and @craig.topper.

Harbormaster completed remote builds in B146844: Diff 404853. Feb 1 2022, 3:30 AM

craig.topper mentioned this in rG7eb781072744: [RISCV] Fix a vsetvli insertion bug involving loads/stores. Feb 1 2022, 7:37 AM

Revision Contents

Path: llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
Size: 76 lines

Diff 404855

llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir

[89 lines not shown]
if.end: ; preds = %if.else, %if.then
%d = phi <vscale x 1 x i64> [ %b, %if.then ], [ %c, %if.else ]
ret <vscale x 1 x i64> %d
}

define void @vsetvli_vcpop() {
ret void
}

		define void @vsetvli_loop_store() {
		ret void
		}

; Function Attrs: nounwind readnone
declare <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64.i64(<vscale x 1 x i64>, <vscale x 1 x i64>, i64) #1

; Function Attrs: nounwind readnone
declare <vscale x 1 x i64> @llvm.riscv.vsub.nxv1i64.nxv1i64.i64(<vscale x 1 x i64>, <vscale x 1 x i64>, i64) #1

; Function Attrs: nounwind readonly
declare <vscale x 1 x i64> @llvm.riscv.vle.nxv1i64.i64(<vscale x 1 x i64>, <vscale x 1 x i64>* nocapture, i64) #3
[416 lines not shown]
bb.2:
%9:gpr = LWU %1, 0

bb.3:
%10:gpr = PHI %2, %bb.1, %9, %bb.2
%11:vr = nsw PseudoVADD_VX_MF2 %6, %10, -1, 5
$v0 = COPY %11
PseudoRET implicit $v0
...
		---
		name: vsetvli_loop_store
		tracksRegLiveness: true
		registers:
		- { id: 0, class: gpr, preferred-register: '' }
		- { id: 1, class: gpr, preferred-register: '' }
		- { id: 2, class: gpr, preferred-register: '' }
		- { id: 3, class: gpr, preferred-register: '' }
		- { id: 4, class: vr, preferred-register: '' }
		- { id: 5, class: gpr, preferred-register: '' }
		- { id: 6, class: gpr, preferred-register: '' }
		- { id: 7, class: vr, preferred-register: '' }
		- { id: 8, class: gpr, preferred-register: '' }
		- { id: 9, class: gpr, preferred-register: '' }
		- { id: 10, class: gpr, preferred-register: '' }
		body: |
		; CHECK-LABEL: name: vsetvli_loop_store
		; CHECK: bb.0:
		; CHECK-NEXT: successors: %bb.1(0x80000000)
		; CHECK-NEXT: liveins: $x10, $x11
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr = COPY $x10
		; CHECK-NEXT: [[PseudoReadVLENB:%[0-9]+]]:gpr = PseudoReadVLENB
		; CHECK-NEXT: [[SRLI:%[0-9]+]]:gpr = SRLI [[PseudoReadVLENB]], 3
		; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr = COPY $x11
		; CHECK-NEXT: dead %11:gpr = PseudoVSETVLIX0 $x0, 88, implicit-def $vl, implicit-def $vtype
		; CHECK-NEXT: [[PseudoVID_V_M1_:%[0-9]+]]:vr = PseudoVID_V_M1 -1, 6, implicit $vl, implicit $vtype
		; CHECK-NEXT: [[COPY2:%[0-9]+]]:gpr = COPY $x0
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.1:
		; CHECK-NEXT: successors: %bb.1(0x40000000), %bb.2(0x40000000)
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: [[PHI:%[0-9]+]]:gpr = PHI [[COPY2]], %bb.0, %10, %bb.1
		; CHECK-NEXT: [[PseudoVADD_VX_M1_:%[0-9]+]]:vr = PseudoVADD_VX_M1 [[PseudoVID_V_M1_]], [[PHI]], -1, 6, implicit $vl, implicit $vtype
		; CHECK-NEXT: [[MUL:%[0-9]+]]:gpr = MUL [[PHI]], [[SRLI]]
		; CHECK-NEXT: [[ADD:%[0-9]+]]:gpr = ADD [[COPY]], [[MUL]]
		; FIXME: We insert a SEW=32,LMUL=1/2 VSETVLI here but no SEW=64,LMUL=1
		; VSETVLI before the VADD above. This misconfigures the VADD in the case that
		; the loop takes its backedge.
		; CHECK-NEXT: dead $x0 = PseudoVSETVLIX0 killed $x0, 87, implicit-def $vl, implicit-def $vtype, implicit $vl
		; CHECK-NEXT: PseudoVSE32_V_MF2 killed [[PseudoVADD_VX_M1_]], killed [[ADD]], -1, 5, implicit $vl, implicit $vtype
		; CHECK-NEXT: [[ADDI:%[0-9]+]]:gpr = ADDI [[PHI]], 1
		; CHECK-NEXT: BLTU [[ADDI]], [[COPY1]], %bb.1
		; CHECK-NEXT: PseudoBR %bb.2
		; CHECK-NEXT: {{ $}}
		; CHECK-NEXT: bb.2:
		; CHECK-NEXT: PseudoRET
		bb.0:
		liveins: $x10, $x11
		%0:gpr = COPY $x10
		%1:gpr = PseudoReadVLENB
		%2:gpr = SRLI %1:gpr, 3
		%3:gpr = COPY $x11
		%4:vr = PseudoVID_V_M1 -1, 6
		%5:gpr = COPY $x0

		bb.1:
		successors: %bb.1, %bb.2

		%6:gpr = PHI %5:gpr, %bb.0, %10:gpr, %bb.1
		%7:vr = PseudoVADD_VX_M1 %4:vr, %6:gpr, -1, 6
		%8:gpr = MUL %6:gpr, %2:gpr
		%9:gpr = ADD %0:gpr, %8:gpr
		PseudoVSE32_V_MF2 killed %7:vr, killed %9:gpr, -1, 5
		%10:gpr = ADDI %6:gpr, 1
		BLTU %10:gpr, %3:gpr, %bb.1
		PseudoBR %bb.2

		bb.2:

		PseudoRET
		...
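
The immediates 88 and 87 on the two PseudoVSETVLIX0 CHECK lines are vtype encodings. As a reading aid, the sketch below decodes them using the vtype field layout from the RISC-V V extension 1.0 specification (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7). The `decode_vtype` helper is hypothetical and is not part of LLVM.

```python
# Hypothetical helper: decode a vtype immediate per the RVV 1.0 layout.
LMUL_NAMES = {0: "1", 1: "2", 2: "4", 3: "8", 5: "1/8", 6: "1/4", 7: "1/2"}


def decode_vtype(imm):
    vlmul = imm & 0b111          # bits 2:0
    vsew = (imm >> 3) & 0b111    # bits 5:3
    return {
        # vsew values above 3 are reserved in RVV 1.0
        "sew": (8 << vsew) if vsew <= 3 else None,
        # vlmul value 4 is reserved in RVV 1.0
        "lmul": LMUL_NAMES.get(vlmul),
        "tail_agnostic": bool((imm >> 6) & 1),
        "mask_agnostic": bool((imm >> 7) & 1),
    }


if __name__ == "__main__":
    print(88, decode_vtype(88))  # VSETVLI in bb.0
    print(87, decode_vtype(87))  # VSETVLI before the store in bb.1
```

Decoded this way, 88 is SEW=64, LMUL=1 and 87 is SEW=32, LMUL=1/2, both tail agnostic and mask undisturbed, which matches the configurations named in the summary and in the FIXME comment.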