Details

Reviewers

craig.topper
HsiangKai
rogfer01
evandro
khchen
arcbbb

Commits

rG1ffc3693949c: [RISCV] Add a test showing an incorrect vsetvli insertion

Summary

This patch adds a reduced test case which identifies an illegal vsetvli
inserted by the compiler. The compiler emits a vsetvli which is intended
to preserve VL with the SEW/LMUL ratio e32/m1 when in fact the VL could
have been set by e64/m2 in a predecessor block.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	3,410 ms	x64 debian > libarcher.barrier::barrier.c
	3,560 ms	x64 debian > libarcher.critical::critical.c
	3,720 ms	x64 debian > libarcher.critical::lock-nested.c
	3,620 ms	x64 debian > libarcher.critical::lock.c
	3,610 ms	x64 debian > libarcher.parallel::parallel-simple.c
		View Full Test Results (19 Failed)

Event Timeline

frasercrmck created this revision. · Jul 19 2021, 9:04 AM

Herald added subscribers: vkmr, luismarques, apazos and 21 others. · Jul 19 2021, 9:04 AM

frasercrmck requested review of this revision. · Jul 19 2021, 9:04 AM

Herald added a project: Restricted Project. · Jul 19 2021, 9:04 AM

Herald added subscribers: llvm-commits, MaskRay.

I haven't found the time to investigate the actual cause of this bug yet, so I thought I'd post it as a regression test so that people are aware of the issue. Since it's something the simulator chokes on, it may or may not actually produce a failure, depending on the specific simulator and test case. I can maybe take a look later this week.

Harbormaster completed remote builds in B114879: Diff 359814. · Jul 19 2021, 9:41 AM

Why is preserving vl wrong here? Isn't the vl 4 regardless of which it came from?

I believe the pass is preserving it because it believes that AVL=4 will produce the same vl as long as the sew/lmul ratio is constant, which it should be for sew=64/lmul=2 vs sew=32/lmul=1.
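
The arithmetic behind that claim can be sketched in a few lines of Python. This is an illustration only, not code from the pass: the helper names `vlmax` and `vl_for` are hypothetical, VLMAX = LMUL * VLEN / SEW is taken from the RISC-V V specification, and vl = min(AVL, VLMAX) is a simplifying assumption (the specification leaves implementations some freedom when AVL lies between VLMAX and 2*VLMAX).

```python
from fractions import Fraction


def vlmax(vlen_bits, sew, lmul):
    """VLMAX = LMUL * VLEN / SEW (LMUL may be fractional, e.g. Fraction(1, 2))."""
    return int(Fraction(lmul) * vlen_bits / sew)


def vl_for(avl, vlen_bits, sew, lmul):
    """Simplified vl rule (assumption): vl = min(AVL, VLMAX)."""
    return min(avl, vlmax(vlen_bits, sew, lmul))


# e64/m2 and e32/m1 both have SEW/LMUL = 32, so VLMAX (and therefore the
# vl produced for AVL=4) is the same for every VLEN tried here.
for vlen in (64, 128, 256, 512):
    assert vl_for(4, vlen, sew=64, lmul=2) == vl_for(4, vlen, sew=32, lmul=1)
```

Under these assumptions the two configurations agree for any VLEN, which is why preserving vl across them looks legal.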

In D106286#2888143, @craig.topper wrote:

Why is preserving vl wrong here? Isn't the vl 4 regardless of which it came from?

I believe the pass is preserving it because it believes that AVL=4 will produce the same vl as long as the sew/lmul ratio is constant, which it should be for sew=64/lmul=2 vs sew=32/lmul=1.

You're right, I must have reduced the test case too far at one point and stopped thinking properly. I'll go back to the original failing case.

reduce the test correctly

Sorry about that, @craig.topper; I was rushing it. The test case is now reduced a lot better. I believe it's an issue where the vmv.x.s "doesn't need" an AVL (it's RISCV::NoRegister) because the operation is VL-agnostic, as it were, so we fall into a trap where we insert a vsetvli zero,zero even though that's an illegal attempt to preserve VL.

Harbormaster completed remote builds in B115059: Diff 360071. · Jul 20 2021, 4:11 AM

That's very interesting and I think it has always been broken. I modified the test case to not depend on VLS support and it does this on 12.0.1 as well.

define i32 @foo(<vscale x 2 x i32> %a, <vscale x 4 x i64> %x, <vscale x 4 x i64>* %y) {
  %index = add <vscale x 4 x i64> %x, %x
  store <vscale x 4 x i64> %index, <vscale x 4 x i64>* %y
  %elt = extractelement <vscale x 2 x i32> %a, i64 0
  ret i32 %elt
}

We don't have a VL to use here and I don't think we should make one up. I think the right fix is to add the VL argument to the intrinsic and the instruction.
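
To see why a VL-preserving vsetvli is not legal for this reproducer, the two scalable types can be mapped to SEW/LMUL and their ratios compared. The Python below is a rough model, not the pass's logic: the helper names are hypothetical, and it assumes one vscale block is 64 bits, so that <vscale x 4 x i64> is e64/m4 and <vscale x 2 x i32> is e32/m1.

```python
from fractions import Fraction

RVV_BITS_PER_BLOCK = 64  # assumption: bits per vscale block


def sew_lmul(min_elts, elt_bits):
    """Map <vscale x min_elts x iN> to (SEW, LMUL) under the assumption above."""
    return elt_bits, Fraction(min_elts * elt_bits, RVV_BITS_PER_BLOCK)


def ratio(sew, lmul):
    """SEW/LMUL ratio; VLMAX depends on the configuration only through this."""
    return Fraction(sew) / lmul


def can_preserve_vl(prev, new):
    """A vsetvli that keeps the old vl is only valid if VLMAX is unchanged,
    i.e. the SEW/LMUL ratio is the same before and after."""
    return ratio(*prev) == ratio(*new)


add_cfg = sew_lmul(4, 64)      # <vscale x 4 x i64>, used by the add/store
extract_cfg = sew_lmul(2, 32)  # <vscale x 2 x i32>, used by the extractelement
print(ratio(*add_cfg), ratio(*extract_cfg), can_preserve_vl(add_cfg, extract_cfg))
```

In this model the add runs with a ratio of 16 and the extract with a ratio of 32, so the check fails, whereas the e64/m2 vs e32/m1 pair discussed earlier in the thread passes it.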

craig.topper added inline comments. · Jul 20 2021, 2:45 PM

llvm/test/CodeGen/RISCV/rvv/vsetvli-regression.ll

Can we use

define i32 @foo(<vscale x 2 x i32> %a, <vscale x 4 x i64> %x, <vscale x 4 x i64>* %y) {
  %index = add <vscale x 4 x i64> %x, %x
  store <vscale x 4 x i64> %index, <vscale x 4 x i64>* %y
  %elt = extractelement <vscale x 2 x i32> %a, i64 0
  ret i32 %elt
}

to avoid the mix of fixed and scalable vectors?

craig.topper mentioned this in D106403: [RISCV] Avoid using x0,x0 vsetvli for vmv.x.s and vfmv.f.s unless we know the sew/lmul ratio is constant. · Jul 20 2021, 2:50 PM

update test according to feedback

remove unused vlen-bits-min

In D106286#2890831, @craig.topper wrote:

That's very interesting and I think it has always been broken. [...]
We don't have a VL to use here and I don't think we should make one up. I think the right fix is to add the VL argument to the intrinsic and the instruction.

Interesting, thanks for the context. I'll keep discussion about the fix(es) to D106403 to keep everything well-threaded.

llvm/test/CodeGen/RISCV/rvv/vsetvli-regression.ll
Sure. The original test was a mix but I think it's best to keep it as simple as possible.

frasercrmck marked an inline comment as done. · Jul 21 2021, 1:12 AM

Harbormaster completed remote builds in B115262: Diff 360377. · Jul 21 2021, 1:47 AM

@craig.topper, do you think we should merge this or just abandon it, since D106403 presumably covers the same case?

craig.topper mentioned this in rG5edccc458155: [RISCV] Avoid using x0,x0 vsetvli for vmv.x.s and vfmv.f.s unless we know the… · Jul 23 2021, 9:13 AM

LGTM. I'll go ahead and merge this.

This revision is now accepted and ready to land. · Jul 23 2021, 9:27 AM

This revision was landed with ongoing or failed builds. · Jul 23 2021, 9:27 AM

Closed by commit rG1ffc3693949c: [RISCV] Add a test showing an incorrect vsetvli insertion (authored by frasercrmck, committed by craig.topper).

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rG1ffc3693949c: [RISCV] Add a test showing an incorrect vsetvli insertion.

This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add a test showing an incorrect vsetvli insertion
Closed · Public
