This is an archive of the discontinued LLVM Phabricator instance.

Differential D101469

[RISCV] Enable interleaved vectorization for RVV
ClosedPublic

Authored by luke957 on Apr 28 2021, 9:17 AM.

Details

Reviewers

craig.topper
frasercrmck
HsiangKai

Commits

rGc4c3869554a6: [RISCV] Enable interleaved vectorization for RVV

Summary

Enable interleaved vectorization for RVV.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	90 ms	x64 debian > Clang.Driver::debug-pass-structure.c

Event Timeline

luke957 created this revision. Apr 28 2021, 9:17 AM

Herald added subscribers: vkmr, frasercrmck, evandro and 25 others. Apr 28 2021, 9:17 AM

luke957 requested review of this revision. Apr 28 2021, 9:17 AM

Herald added a project: Restricted Project. Apr 28 2021, 9:17 AM

Herald added subscribers: llvm-commits, MaskRay.

luke957 added reviewers: craig.topper, frasercrmck, HsiangKai. Apr 28 2021, 9:19 AM

craig.topper added inline comments. Apr 28 2021, 9:27 AM

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll
5	Why are we not checking the generated IR?
42	Do we need the debug info?

Harbormaster completed remote builds in B101441: Diff 341236. Apr 28 2021, 11:04 AM

luke957 updated this revision to Diff 341536. Apr 29 2021, 8:55 AM

luke957 added inline comments. Apr 29 2021, 9:06 AM

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll
5	Checking the generated IR is better.
42	DI is not necessary.

Harbormaster completed remote builds in B101654: Diff 341536. Apr 29 2021, 9:42 AM

ping

Do you still need to update the diff to address the previous comments?

In D101469#2766122, @frasercrmck wrote:

Do you still need to update the diff to address the previous comments?

I have modified the test case according to the comments on "checking the generated IR" and "debug info" (DI removed). Any other comments?

craig.topper added inline comments. May 18 2021, 10:09 AM

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll
6	Is this just checking the induction variable increment? I'd really like to see what vector instructions it generates.

Rebase and update.

luke957 added inline comments. May 20 2021, 10:27 AM

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll
6	For this test case, VF and UF will be 4 and 2 respectively, so the vector instructions are emitted twice per trip, and there will be an instruction like `%{{.*}} = add <4 x i32> %{{.*}}, <i32 4, i32 4, i32 4, i32 4>`.
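
The arithmetic in the reply above can be sketched in plain C++. This is only an illustrative model, not vectorizer output: the names `inductionStep`, `fillScalar`, and `fillInterleaved` are hypothetical, and the loop body `A[i] = i` is taken from the `@foo` function in the test. With VF = 4 and UF = 2, one trip of the main loop covers 4 * 2 = 8 elements, which is consistent with a scalar induction increment of 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Elements consumed per trip of the vectorized, interleaved loop.
// VF = lanes per vector operation, UF = vector operations per trip.
std::size_t inductionStep(std::size_t VF, std::size_t UF) {
  assert(VF > 0 && UF > 0 && "both factors must be positive");
  return VF * UF;
}

// Scalar reference: A[i] = i, the loop body of @foo.
std::vector<int32_t> fillScalar(std::size_t N) {
  std::vector<int32_t> A(N);
  for (std::size_t I = 0; I < N; ++I)
    A[I] = static_cast<int32_t>(I);
  return A;
}

// Models the shape of the transformed loop: UF groups of VF lanes per trip,
// followed by a scalar remainder loop for the leftover elements.
std::vector<int32_t> fillInterleaved(std::size_t N, std::size_t VF,
                                     std::size_t UF) {
  std::vector<int32_t> A(N);
  const std::size_t Step = inductionStep(VF, UF);
  std::size_t I = 0;
  for (; I + Step <= N; I += Step)                 // main loop, I += VF * UF
    for (std::size_t Part = 0; Part < UF; ++Part)  // one "vector store" each
      for (std::size_t Lane = 0; Lane < VF; ++Lane) {
        const std::size_t Idx = I + Part * VF + Lane;
        A[Idx] = static_cast<int32_t>(Idx);
      }
  for (; I < N; ++I)                               // scalar remainder
    A[I] = static_cast<int32_t>(I);
  return A;
}
```

Calling `fillInterleaved(N, 4, 2)` should produce the same array as `fillScalar(N)` for any `N`; only the grouping of the stores differs.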

Harbormaster completed remote builds in B105454: Diff 346778. May 20 2021, 11:09 AM

ping

LGTM

This revision is now accepted and ready to land. May 28 2021, 9:56 AM

Closed by commit rGc4c3869554a6: [RISCV] Enable interleaved vectorization for RVV (authored by luke957). May 28 2021, 8:28 PM

This revision was automatically updated to reflect the committed changes.

luke957 added a commit: rGc4c3869554a6: [RISCV] Enable interleaved vectorization for RVV.

I just noticed that this enabled interleaving in the loop vectorizer even when the V extension isn't enabled. So we now generate interleaved scalar code in some cases. Was that intentional?

In D101469#2801767, @craig.topper wrote:

I just noticed that this enabled interleaving in the loop vectorizer even when the V extension isn't enabled. So we now generate interleaved scalar code in some cases. Was that intentional?

Thanks for reminding me. Sorry for my carelessness.

In D101469#2801793, @luke957 wrote:

In D101469#2801767, @craig.topper wrote:

I just noticed that this enabled interleaving in the loop vectorizer even when the V extension isn't enabled. So we now generate interleaved scalar code in some cases. Was that intentional?

Thanks for reminding me. Sorry for my carelessness.

Fixed in https://reviews.llvm.org/D103787
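
A guard of the kind discussed here might look like the following sketch. It is an assumption for illustration only and does not reproduce the contents of D103787: `MockSubtarget`, its `HasStdExtV` field, and the free function `getMaxInterleaveFactor` are hypothetical stand-ins for the real subtarget and TTI classes. The idea is simply to report an interleave factor of 1 (no interleaving) unless the V extension is enabled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the subtarget; only the relevant pieces.
struct MockSubtarget {
  bool HasStdExtV = false;         // assumed flag: is the V extension enabled?
  uint8_t MaxInterleaveFactor = 2; // the default introduced by this revision

  // Assumed guard: without V, report 1 so the vectorizer does not interleave.
  unsigned getMaxInterleaveFactor() const {
    assert(MaxInterleaveFactor >= 1 && "interleave factor must be positive");
    return HasStdExtV ? MaxInterleaveFactor : 1u;
  }
};

// Hypothetical stand-in for the TTI hook, which forwards to the subtarget.
unsigned getMaxInterleaveFactor(const MockSubtarget &ST, unsigned /*VF*/) {
  return ST.getMaxInterleaveFactor();
}
```

With this shape, a subtarget without V yields 1 and a subtarget with V yields the configured maximum of 2.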

craig.topper added inline comments. Jun 8 2021, 11:56 AM

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
173	I think these are two different features. enableInterleavedAccessVectorization is for memory accesses that are interleaved. getMaxInterleaveFactor controls loop unrolling in the vectorizer. Which feature were you trying to enable?

craig.topper added inline comments. Jun 8 2021, 11:57 AM

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
173	I think enableInterleavedAccessVectorization reads extra data and uses shuffles to extract the elements that are needed.
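
The distinction drawn in these two comments can be illustrated with a small scalar model. This is a sketch under stated assumptions, not LLVM code: `sumUnrolledBy2` and `deinterleave2` are hypothetical names. The first function models loop interleaving (the unrolling that getMaxInterleaveFactor bounds); the second models an interleaved memory access, where a contiguous chunk is read and the needed elements are picked out, as a shuffle would do.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Loop interleaving: the loop body is replicated (here twice) with
// independent accumulators, which are combined after the loop.
long sumUnrolledBy2(const std::vector<int> &A) {
  long S0 = 0, S1 = 0;
  std::size_t I = 0;
  for (; I + 2 <= A.size(); I += 2) {
    S0 += A[I];
    S1 += A[I + 1];
  }
  if (I < A.size()) // leftover element when the size is odd
    S0 += A[I];
  return S0 + S1;
}

// Interleaved access: two logical streams alternate in memory
// (e0, o0, e1, o1, ...). Read the whole contiguous chunk, then select
// the even and odd positions, as shuffle masks <0,2,4,...> and <1,3,5,...>.
std::pair<std::vector<int>, std::vector<int>>
deinterleave2(const std::vector<int> &Wide) {
  assert(Wide.size() % 2 == 0 && "expected whole {even, odd} pairs");
  std::vector<int> Even, Odd;
  for (std::size_t I = 0; I < Wide.size(); I += 2) {
    Even.push_back(Wide[I]);
    Odd.push_back(Wide[I + 1]);
  }
  return {Even, Odd};
}
```

The two are independent: a loop can be unrolled without any strided accesses, and a strided access group can be handled without unrolling.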

jrtc27 added inline comments. Jun 8 2021, 12:45 PM

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll
5	Why is this not using update_test_checks.py?
12–51	This IR is very messy; Clang-output IR does not always make for clean test cases. We don't need Function Attrs comments, we don't need preds comments, many of the attributes are unnecessary, and the ones that are necessary are best done inline. IR tests should be minimal, ideally written from scratch, but whittling Clang-produced IR down to something that could feasibly have been hand-written (or generated by a simple tool, like RVV and RVA tests) is fine.

luke957 added inline comments. Jun 16 2021, 1:21 AM

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
173	Yes, bool enableInterleavedAccessVectorization() should not be added here. I'll restore the code and submit a new patch.

luke957 mentioned this in D104364: [RISCV] Don't enable Interleaved Access Vectorization. Jun 16 2021, 1:39 AM

luke957 added inline comments. Jun 16 2021, 2:24 AM

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll
5	Yeah, using update_test_checks.py is better. But as update_test_checks.py itself says, it is not designed to be authoritative about what constitutes a good test :)
12–51	Thanks for the review. I'll update the test case to look canonical.

luke957 added inline comments. Jun 16 2021, 2:29 AM

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll
5	Thanks for the comment. I'll try to update the test case using update_test_checks.py.

luke957 mentioned this in D104393: [RISCV] Update test case. Jun 16 2021, 8:56 AM

luke957 mentioned this in rGc2e97ba85e46: [RISCV] Don't enable Interleaved Access Vectorization. Jun 17 2021, 9:38 PM

Miss_Grape added a subscriber: Miss_Grape. May 9 2022, 2:06 AM

Miss_Grape added inline comments.

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
175	Do you need to report an error if the user passes "-force-vector-interleave=value" with a value > 2?

Herald added a project: Restricted Project. May 9 2022, 2:06 AM

Herald added subscribers: sunshaoce, pcwang-thead, eopXD and 3 others.

Revision Contents

	Path	Size
	llvm/lib/Target/RISCV/RISCVSubtarget.h	2 lines
	llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h	5 lines
	llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll	48 lines

Diff 341536

llvm/lib/Target/RISCV/RISCVSubtarget.h

Context not available.
  bool EnableSaveRestore = false;
  unsigned XLen = 32;
  MVT XLenVT = MVT::i32;
+ uint8_t MaxInterleaveFactor = 2;
  RISCVABI::ABI TargetABI = RISCVABI::ABI_Unknown;
  BitVector UserReservedRegister;
  RISCVFrameLowering FrameLowering;
Context not available.
    assert(i < RISCV::NUM_TARGET_REGS && "Register out of range");
    return UserReservedRegister[i];
  }
+ unsigned getMaxInterleaveFactor() const { return MaxInterleaveFactor; }

protected:
  // GlobalISel related APIs.
Context not available.

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h

Context not available.
  bool isLegalMaskedScatter(Type *DataType, Align Alignment) {
    return isLegalMaskedGatherScatter(DataType, Alignment);
  }

+ bool enableInterleavedAccessVectorization() { return true; }
+ unsigned getMaxInterleaveFactor(unsigned VF) {
+   return ST->getMaxInterleaveFactor();
+ }
};

} // end namespace llvm
Context not available.

llvm/test/Transforms/LoopVectorize/RISCV/riscv-interleaved.ll

This file was added.

				; RUN: opt -loop-vectorize -dce -instcombine -mtriple riscv64-linux-gnu -mattr=+experimental-v \
				; RUN: -debug-only=loop-vectorize -riscv-v-vector-bits-min=128 -S < %s 2>&1 | FileCheck %s

				; CHECK-LABEL: foo
				; CHECK: LV: IC is 2
				; CHECK: %{{.*}} = add {{.*}}, 8

				; Function Attrs: nofree norecurse nosync nounwind writeonly
				define dso_local void @foo(i32 signext %n, i32* nocapture %A) local_unnamed_addr #0 {
				entry:
				%cmp5 = icmp sgt i32 %n, 0
				br i1 %cmp5, label %for.body.preheader, label %for.cond.cleanup

				for.body.preheader: ; preds = %entry
				%wide.trip.count = zext i32 %n to i64
				br label %for.body

				for.cond.cleanup.loopexit: ; preds = %for.body
				br label %for.cond.cleanup

				for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
				ret void

				for.body: ; preds = %for.body.preheader, %for.body
				%indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
				%arrayidx = getelementptr inbounds i32, i32* %A, i64 %indvars.iv
				%0 = trunc i64 %indvars.iv to i32
				store i32 %0, i32* %arrayidx, align 4, !tbaa !4
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
				br i1 %exitcond.not, label %for.cond.cleanup.loopexit, label %for.body, !llvm.loop !8
				}

				attributes #0 = { nofree norecurse nosync nounwind writeonly "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+64bit,+a,+c,+m,+relax,-save-restore" }

				!llvm.module.flags = !{!0, !1, !2}
				!llvm.ident = !{!3}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 1, !"target-abi", !"lp64"}
				!2 = !{i32 1, !"SmallDataLimit", i32 8}
				!3 = !{!"clang version 13.0.0"}
				!4 = !{!5, !5, i64 0}
				!5 = !{!"int", !6, i64 0}
				!6 = !{!"omnipotent char", !7, i64 0}
				!7 = !{!"Simple C/C++ TBAA"}
				!8 = distinct !{!8, !9}
				!9 = !{!"llvm.loop.mustprogress"}