This is an archive of the discontinued LLVM Phabricator instance.

Differential D92789

[PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern matching built vectors
Closed · Public

Authored by ZarkoCA on Dec 7 2020, 2:38 PM.

Details

Reviewers

nemanjai
sfertile
cebowleratibm
Xiangling_L

Commits

rGce4040a43d54: [PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern…

Summary

Some of the pattern matching in PPCInstrVSX.td and node lowering involving vectors assumes 64bit mode. This patch disables some of the unsafe pattern matching and lowering of BUILD_VECTOR in 32bit mode.
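
To make the lowering half of the summary concrete, here is a minimal C++ sketch of the guard as it reads after this patch. `MockSubtarget` and `keepBuildVectorForVSX` are hypothetical names invented for illustration, not LLVM classes or functions; the real condition lives in PPCISelLowering.cpp and uses `Subtarget.hasVSX()`, `Subtarget.isPPC64()`, and `haveEfficientBuildVectorPattern(...)`.

```cpp
// Hypothetical model of the subtarget queries involved; this is not the
// LLVM PPCSubtarget class.
struct MockSubtarget {
  bool HasVSX;
  bool IsPPC64;
};

// Returns true when a BUILD_VECTOR node would be left intact for the VSX
// patterns to match. HasEfficientPattern stands in for the result of
// haveEfficientBuildVectorPattern(). Returning false models the fallback
// path, where the node is expanded instead.
constexpr bool keepBuildVectorForVSX(const MockSubtarget &ST,
                                     bool HasEfficientPattern) {
  return ST.HasVSX && ST.IsPPC64 && HasEfficientPattern;
}

// 64-bit VSX target: the node is kept for pattern matching.
static_assert(keepBuildVectorForVSX(MockSubtarget{true, true}, true), "");
// 32-bit VSX target: the node is no longer kept, so it is expanded.
static_assert(!keepBuildVectorForVSX(MockSubtarget{true, false}, true), "");
```

Before the patch the `IsPPC64` term was absent, so a 32-bit VSX target could keep a node that only 64-bit patterns knew how to select.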

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
50 ms	x64 windows > LLVM.CodeGen/XCore::threads.ll

Event Timeline

ZarkoCA created this revision. · Dec 7 2020, 2:38 PM

Herald added subscribers: shchenz, kbarton, hiraditya. · Dec 7 2020, 2:38 PM

ZarkoCA requested review of this revision. · Dec 7 2020, 2:38 PM

Herald added a subscriber: llvm-commits. · Dec 7 2020, 2:38 PM

Harbormaster completed remote builds in B81358: Diff 310025. · Dec 7 2020, 3:26 PM

Whitney added a subscriber: Whitney. · Dec 9 2020, 8:21 PM

Ping.

Xiangling_L added a subscriber: Xiangling_L. · Dec 10 2020, 2:20 PM

Xiangling_L added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
9389–9390	minor: Would it be better to adjust the comment accordingly, e.g. `Under 64-bit mode, BUILD_VECTOR nodes that ...`?
llvm/lib/Target/PowerPC/PPCInstrVSX.td
2412	I am wondering why we leave `[HasVSX, IsBigEndian]`, `[HasVSX, HasOnlySwappingMemOps, IsBigEndian]`, and `[HasVSX, HasP8Vector, NoP9Vector, IsBigEndian]` alone?
2427–2428	minor: This line should follow #2425?
llvm/test/CodeGen/PowerPC/ppc-32bit-build-vector.ll
4	It seems `-mcpu=pwr8` already implies that HasAltivec and HasVSX are enabled, so `-mattr=+altivec` and `-mattr=+vsx` are not necessary.
68	Suggestion: The testcase can be further simplified into the following by removing tedious `zext` instruction and shortening the numbers of vector.

	define dso_local fastcc void @BuildVectorICE() unnamed_addr #0 {
	entry:
	  br label %while.body

	while.body: ; preds = %while.body, %entry
	  %newelement = phi i32 [ 0, %entry ], [ %5, %while.body ]
	  %0 = insertelement <4 x i32> <i32 undef, i32 0, i32 0, i32 0>, i32 %newelement, i32 0
	  %1 = load <4 x i32>, <4 x i32>* undef, align 1
	  %2 = add <4 x i32> %1, %0
	  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
	  %4 = add <4 x i32> %2, %3
	  %5 = extractelement <4 x i32> %4, i32 0
	  br label %while.body
	}

ZarkoCA marked 4 inline comments as done. · Dec 11 2020, 6:11 AM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCInstrVSX.td
2412	That's a good question. I tried disabling these so as to cause minimal disruption to existing code, and did it by elimination. As far as I can see, the potentially big problem in the pattern matching is that later implementations like `[HasVSX, IsISA3_0, HasDirectMove, IsBigEndian]` were implemented without considering 32-bit codegen, so I added a check there and moved on to earlier ones. At some point I started hitting test case failures and found that codegen in those cases was correct; in those cases there were already checks added for 32-bit, as in this patch: https://reviews.llvm.org/D17711. In short, I tried to do the safe thing for 32-bit VSX without major disruption to existing test cases and codegen unless I could plainly tell it wasn't correct, which in those cases I couldn't.
llvm/test/CodeGen/PowerPC/ppc-32bit-build-vector.ll
68	Thanks a lot! That simplifies it, and I'm still able to get the compiler error without the patch.

Addressed comments:

simplified and modified test case per suggestion
added comment
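
For context on why the big-endian direct-move patterns are a concern in 32-bit mode: several of them first combine two 32-bit elements in one 64-bit GPR (`RLDIMI AnyExts.B, AnyExts.A, 32, 0`) and then move the result into a vector register with `MTVSRD` or `MTVSRDD`. The C++ sketch below shows the value that combining step appears to be meant to produce; `packWordsBE` is a hypothetical helper name used only for illustration. The point is that the intermediate value needs a 64-bit integer register, which 32-bit codegen cannot assume.

```cpp
#include <cstdint>

// Hypothetical helper: places A in the upper 32 bits and B in the lower
// 32 bits of a 64-bit value, i.e. element 0 first in big-endian order.
constexpr std::uint64_t packWordsBE(std::uint32_t A, std::uint32_t B) {
  return (static_cast<std::uint64_t>(A) << 32) | B;
}

// The result does not fit in 32 bits whenever A is non-zero.
static_assert(packWordsBE(0x11223344u, 0xAABBCCDDu) == 0x11223344AABBCCDDull,
              "A occupies the high word, B the low word");
```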

Harbormaster completed remote builds in B82034: Diff 311207. · Dec 11 2020, 6:50 AM

Xiangling_L added inline comments. · Dec 11 2020, 7:55 AM

llvm/lib/Target/PowerPC/PPCInstrVSX.td
2412	I see your point here. I am asking because I tried adding `IsPPC64` for `[HasVSX, HasP8Vector, NoP9Vector, IsBigEndian]` and found no regressions. So I feel the current situation is kind of messy, with partial fixes and disablement for 32-bit mode. But I understand that you want this patch to be conservatively correct. I am wondering whether it is possible for us to add `IsPPC64` to all predicates where no existing test cases would fail? If that would take you too much time, I can live with this for now, since, as we discussed offline, we do plan to verify them one by one for 32-bit mode in the near future.
llvm/test/CodeGen/PowerPC/ppc-32bit-build-vector.ll
64	Sorry, I forgot to mention that `attributes #0` is also not necessary.

Xiangling_L added a reviewer: Xiangling_L. · Dec 11 2020, 7:55 AM

ZarkoCA added inline comments. · Dec 11 2020, 8:04 AM

llvm/lib/Target/PowerPC/PPCInstrVSX.td
2412	> I see your point here. I am asking because I tried adding IsPPC64 for [HasVSX, HasP8Vector, NoP9Vector, IsBigEndian] and found no regressions. So I feel the current situation is kinda messy by having partial fixes and disablement for 32bit mode. But I understand that you want this patch to be conservative correct.

	Ah ok, I likely missed this instance. I will have a look and update the patch. Thanks. And yes, the plan is to enable 32Bit VSX over time either by extending the implementation here if possible.

Added further IsPPC64 checks for Big Endian VSX
removed function attributes from test case.
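
The effect of appending `IsPPC64` to a `Predicates` list can be sketched in C++: patterns inside a `let Predicates = [...]` block are only available when every predicate in the list holds. `SubtargetModel`, the predicate functions, and `allHold` below are hypothetical names for illustration; they are not generated by TableGen and only model the Power9 big-endian case.

```cpp
#include <initializer_list>

// Hypothetical model of the subtarget properties the predicates query.
struct SubtargetModel {
  bool HasVSX;
  bool HasP9Vector;
  bool IsLittleEndian;
  bool IsPPC64;
};

using Predicate = bool (*)(const SubtargetModel &);

bool hasVSX(const SubtargetModel &ST) { return ST.HasVSX; }
bool hasP9Vector(const SubtargetModel &ST) { return ST.HasP9Vector; }
bool isBigEndian(const SubtargetModel &ST) { return !ST.IsLittleEndian; }
bool isPPC64(const SubtargetModel &ST) { return ST.IsPPC64; }

// A predicate list acts as a conjunction: one failing predicate disables
// every pattern guarded by the list.
bool allHold(const SubtargetModel &ST, std::initializer_list<Predicate> Preds) {
  for (Predicate P : Preds)
    if (!P(ST))
      return false;
  return true;
}

// [HasVSX, HasP9Vector, IsBigEndian]
bool bigEndianP9PatternsBefore(const SubtargetModel &ST) {
  return allHold(ST, {hasVSX, hasP9Vector, isBigEndian});
}

// [HasVSX, HasP9Vector, IsBigEndian, IsPPC64]
bool bigEndianP9PatternsAfter(const SubtargetModel &ST) {
  return allHold(ST, {hasVSX, hasP9Vector, isBigEndian, isPPC64});
}
```

For a 32-bit big-endian Power9 model, `bigEndianP9PatternsBefore` returns true and `bigEndianP9PatternsAfter` returns false, while a 64-bit model passes both.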

ZarkoCA marked 3 inline comments as done. · Dec 11 2020, 8:34 AM

Harbormaster completed remote builds in B82056: Diff 311233. · Dec 11 2020, 9:04 AM

LGTM. Thanks!

This revision is now accepted and ready to land. · Dec 11 2020, 11:08 AM

This revision was landed with ongoing or failed builds. · Dec 12 2020, 12:28 PM

Closed by commit rGce4040a43d54: [PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern… (authored by ZarkoCA).

This revision was automatically updated to reflect the committed changes.

ZarkoCA added a commit: rGce4040a43d54: [PPC] Check for PPC64 when emitting 64bit specific VSX nodes when pattern….

ZarkoCA mentioned this in D97503: [AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches. · Feb 25 2021, 2:36 PM

ZarkoCA mentioned this in rGf818ec9dd173: [AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches. · Apr 27 2021, 4:27 AM

Revision Contents

Path	Size
llvm/lib/Target/PowerPC/PPCISelLowering.cpp	6 lines
llvm/lib/Target/PowerPC/PPCInstrVSX.td	38 lines
llvm/test/CodeGen/PowerPC/ppc-32bit-build-vector.ll	64 lines

Diff 311207

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

@@ 9,380 lines not shown @@ if (InputLoad && DAG.isSplatValue(Op, true)) {
           Ops, LD->getMemoryVT(), LD->getMemOperand());
       // Replace all uses of the output chain of the original load with the
       // output chain of the new load.
       DAG.ReplaceAllUsesOfValueWith(InputLoad->getValue(1),
                                     LdSplt.getValue(1));
       return LdSplt;
     }
   }

-  // BUILD_VECTOR nodes that are not constant splats of up to 32-bits can be
-  // lowered to VSX instructions under certain conditions.
+  // In 64BIT mode BUILD_VECTOR nodes that are not constant splats of up to
+  // 32-bits can be lowered to VSX instructions under certain conditions.
   // Without VSX, there is no pattern more efficient than expanding the node.
-  if (Subtarget.hasVSX() &&
+  if (Subtarget.hasVSX() && Subtarget.isPPC64() &&
       haveEfficientBuildVectorPattern(BVN, Subtarget.hasDirectMove(),
                                       Subtarget.hasP8Vector()))
     return Op;
   return SDValue();
 }

 uint64_t SplatBits = APSplatBits.getZExtValue();
 uint64_t SplatUndef = APSplatUndef.getZExtValue();
@@ 7,651 lines not shown @@

llvm/lib/Target/PowerPC/PPCInstrVSX.td

@@ 139 lines not shown @@ def PPCldsplat : SDNode<"PPCISD::LD_SPLAT", SDT_PPCldsplat,
                          [SDNPHasChain, SDNPMayLoad, SDNPMemOperand]>;
 def PPCSToV : SDNode<"PPCISD::SCALAR_TO_VECTOR_PERMUTED",
                      SDTypeProfile<1, 1, []>, []>;

 //-------------------------- Predicate definitions ---------------------------//
 def HasVSX : Predicate<"Subtarget->hasVSX()">;
 def IsLittleEndian : Predicate<"Subtarget->isLittleEndian()">;
 def IsBigEndian : Predicate<"!Subtarget->isLittleEndian()">;
+def IsPPC64 : Predicate<"Subtarget->isPPC64()">;
 def HasOnlySwappingMemOps : Predicate<"!Subtarget->hasP9Vector()">;
 def HasP8Vector : Predicate<"Subtarget->hasP8Vector()">;
 def HasDirectMove : Predicate<"Subtarget->hasDirectMove()">;
 def NoP9Vector : Predicate<"!Subtarget->hasP9Vector()">;
 def HasP9Vector : Predicate<"Subtarget->hasP9Vector()">;
 def NoP9Altivec : Predicate<"!Subtarget->hasP9Altivec()">;

 //--------------------- VSX-specific instruction formats ---------------------//
@@ 2,247 lines not shown @@
 // is finer for various reasons. For example, we have Power8Vector,
 // Power8Altivec, DirectMove that all came in with ISA 2.07. The situation is
 // similar with ISA 3.0 with Power9Vector, Power9Altivec, IsISA3_0. Then there
 // are orthogonal predicates such as endianness for which the order was
 // arbitrarily chosen to be Big, Little.
 //
 // Predicate combinations available:
 // [HasVSX]
 // [HasVSX, IsBigEndian]
 // [HasVSX, IsLittleEndian]
 // [HasVSX, NoP9Vector]
 // [HasVSX, HasOnlySwappingMemOps]
 // [HasVSX, HasOnlySwappingMemOps, IsBigEndian]
 // [HasVSX, HasP8Vector]
 // [HasVSX, HasP8Vector, IsBigEndian]
 // [HasVSX, HasP8Vector, IsLittleEndian]
 // [HasVSX, HasP8Vector, NoP9Vector, IsBigEndian]
 // [HasVSX, HasP8Vector, NoP9Vector, IsLittleEndian]
 // [HasVSX, HasDirectMove]
 // [HasVSX, HasDirectMove, IsBigEndian]
 // [HasVSX, HasDirectMove, IsLittleEndian]
 // [HasVSX, HasDirectMove, NoP9Altivec, IsBigEndian]
+// [HasVSX, HasDirectMove, NoP9Vector, IsBigEndian, IsPPC64]
 // [HasVSX, HasDirectMove, NoP9Altivec, IsLittleEndian]
-// [HasVSX, HasDirectMove, NoP9Vector, IsBigEndian]
 // [HasVSX, HasDirectMove, NoP9Vector, IsLittleEndian]
 // [HasVSX, HasP9Vector]
-// [HasVSX, HasP9Vector, IsBigEndian]
+// [HasVSX, HasP9Vector, IsBigEndian, IsPPC64]
 // [HasVSX, HasP9Vector, IsLittleEndian]
 // [HasVSX, HasP9Altivec]
-// [HasVSX, HasP9Altivec, IsBigEndian]
+// [HasVSX, HasP9Altivec, IsBigEndian, IsPPC64]
 // [HasVSX, HasP9Altivec, IsLittleEndian]
-// [HasVSX, IsISA3_0, HasDirectMove, IsBigEndian]
+// [HasVSX, IsISA3_0, HasDirectMove, IsBigEndian, IsPPC64]
 // [HasVSX, IsISA3_0, HasDirectMove, IsLittleEndian]

 let AddedComplexity = 400 in {
 // Valid for any VSX subtarget, regardless of endianness.
 let Predicates = [HasVSX] in {
 def : Pat<(v4i32 (vnot_ppc v4i32:$A)),
           (v4i32 (XXLNOR $A, $A))>;
 def : Pat<(v4i32 (or (and (vnot_ppc v4i32:$C), v4i32:$A),
@@ 567 lines not shown @@ let Predicates = [HasVSX, HasOnlySwappingMemOps] in {
 def : Pat<(v2f64 (PPClxvd2x xoaddr:$src)), (LXVD2X xoaddr:$src)>;

 // Stores.
 def : Pat<(int_ppc_vsx_stxvd2x v2f64:$rS, xoaddr:$dst),
           (STXVD2X $rS, xoaddr:$dst)>;
 def : Pat<(PPCstxvd2x v2f64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;
 } // HasVSX, HasOnlySwappingMemOps

-// Big endian VSX subtarget that only has loads and stores that always load
-// in big endian order. Really big endian pre-Power9 subtargets.
+// Big endian VSX subtarget that only has loads and stores that always
+// load in big endian order. Really big endian pre-Power9 subtargets.
 let Predicates = [HasVSX, HasOnlySwappingMemOps, IsBigEndian] in {
 def : Pat<(v2f64 (load xoaddr:$src)), (LXVD2X xoaddr:$src)>;
 def : Pat<(v2i64 (load xoaddr:$src)), (LXVD2X xoaddr:$src)>;
 def : Pat<(v4i32 (load xoaddr:$src)), (LXVW4X xoaddr:$src)>;
 def : Pat<(v4i32 (int_ppc_vsx_lxvw4x xoaddr:$src)), (LXVW4X xoaddr:$src)>;
 def : Pat<(store v2f64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;
 def : Pat<(store v2i64:$rS, xoaddr:$dst), (STXVD2X $rS, xoaddr:$dst)>;
 def : Pat<(store v4i32:$XT, xoaddr:$dst), (STXVW4X $XT, xoaddr:$dst)>;
@@ 521 lines not shown @@
 def : Pat<(i32 (vector_extract v4i32:$S, 2)),
           (i32 VectorExtractions.LE_WORD_2)>;
 def : Pat<(i32 (vector_extract v4i32:$S, 3)),
           (i32 VectorExtractions.LE_WORD_3)>;
 def : Pat<(i32 (vector_extract v4i32:$S, i64:$Idx)),
           (i32 VectorExtractions.LE_VARIABLE_WORD)>;
 } // HasVSX, HasDirectMove, NoP9Altivec, IsLittleEndian

-// Big endian pre-Power9 VSX subtarget that has direct moves.
-let Predicates = [HasVSX, HasDirectMove, NoP9Vector, IsBigEndian] in {
+// Big endian pre-Power9 64Bit VSX subtarget that has direct moves.
+let Predicates = [HasVSX, HasDirectMove, NoP9Vector, IsBigEndian, IsPPC64] in {
 // Big endian integer vectors using direct moves.
 def : Pat<(v2i64 (build_vector i64:$A, i64:$B)),
           (v2i64 (XXPERMDI
                    (COPY_TO_REGCLASS (MTVSRD $A), VSRC),
                    (COPY_TO_REGCLASS (MTVSRD $B), VSRC), 0))>;
 def : Pat<(v4i32 (build_vector i32:$A, i32:$B, i32:$C, i32:$D)),
           (XXPERMDI
             (COPY_TO_REGCLASS
               (MTVSRD (RLDIMI AnyExts.B, AnyExts.A, 32, 0)), VSRC),
             (COPY_TO_REGCLASS
               (MTVSRD (RLDIMI AnyExts.D, AnyExts.C, 32, 0)), VSRC), 0)>;
 def : Pat<(v4i32 (build_vector i32:$A, i32:$A, i32:$A, i32:$A)),
           (XXSPLTW (COPY_TO_REGCLASS (MTVSRWZ $A), VSRC), 1)>;
-} // HasVSX, HasDirectMove, NoP9Vector, IsBigEndian
+} // HasVSX, HasDirectMove, NoP9Vector, IsBigEndian, IsPPC64

 // Little endian pre-Power9 VSX subtarget that has direct moves.
 let Predicates = [HasVSX, HasDirectMove, NoP9Vector, IsLittleEndian] in {
 // Little endian integer vectors using direct moves.
 def : Pat<(v2i64 (build_vector i64:$A, i64:$B)),
           (v2i64 (XXPERMDI
                    (COPY_TO_REGCLASS (MTVSRD $B), VSRC),
                    (COPY_TO_REGCLASS (MTVSRD $A), VSRC), 0))>;
@@ 336 lines not shown @@ (SUBREG_TO_REG
             (i64 1),
             (XSCVDPUXDS (COPY_TO_REGCLASS (DFLOADf32 iaddrX4:$A), VSFRC)), sub_64)>;
 def : Pat<(v4f32 (PPCldsplat xoaddr:$A)),
           (v4f32 (LXVWSX xoaddr:$A))>;
 def : Pat<(v4i32 (PPCldsplat xoaddr:$A)),
           (v4i32 (LXVWSX xoaddr:$A))>;
 } // HasVSX, HasP9Vector

-// Big endian Power9 subtarget.
-let Predicates = [HasVSX, HasP9Vector, IsBigEndian] in {
+// Big endian 64Bit Power9 subtarget.
+let Predicates = [HasVSX, HasP9Vector, IsBigEndian, IsPPC64] in {
 def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 0)))))),
           (f32 (XSCVUXDSP (XXEXTRACTUW $A, 0)))>;
 def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 1)))))),
           (f32 (XSCVUXDSP (XXEXTRACTUW $A, 4)))>;
 def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 2)))))),
           (f32 (XSCVUXDSP (XXEXTRACTUW $A, 8)))>;
 def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 3)))))),
           (f32 (XSCVUXDSP (XXEXTRACTUW $A, 12)))>;
@@ 156 lines not shown @@ def : Pat<(f128 (uint_to_fp
           (f128 (XSCVUDQP
                   (EXTRACT_SUBREG (VEXTRACTUB Idx, $src), sub_64)))>;
 }

 // Unsiged int in vsx register -> QP
 def : Pat<(f128 (uint_to_fp (i32 (PPCmfvsr f64:$src)))),
           (f128 (XSCVUDQP
                   (XXEXTRACTUW (SUBREG_TO_REG (i64 1), $src, sub_64), 4)))>;
-} // HasVSX, HasP9Vector, IsBigEndian
+} // HasVSX, HasP9Vector, IsBigEndian, IsPPC64

 // Little endian Power9 subtarget.
 let Predicates = [HasVSX, HasP9Vector, IsLittleEndian] in {
 def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 0)))))),
           (f32 (XSCVUXDSP (XXEXTRACTUW $A, 12)))>;
 def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 1)))))),
           (f32 (XSCVUXDSP (XXEXTRACTUW $A, 8)))>;
 def : Pat<(f32 (PPCfcfidus (f64 (PPCmtvsrz (i32 (extractelt v4i32:$A, 2)))))),
@@ 208 lines not shown @@ def : Pat<(v16i8 (PPCvabsd v16i8:$A, v16i8:$B, (i32 0))),
           (v16i8 (VABSDUB $A, $B))>;

 // As PPCVABSD description, the last operand indicates whether do the
 // sign bit flip.
 def : Pat<(v4i32 (PPCvabsd v4i32:$A, v4i32:$B, (i32 1))),
           (v4i32 (VABSDUW (XVNEGSP $A), (XVNEGSP $B)))>;
 } // HasVSX, HasP9Altivec

-// Big endian Power9 VSX subtargets with P9 Altivec support.
-let Predicates = [HasVSX, HasP9Altivec, IsBigEndian] in {
+// Big endian Power9 64Bit VSX subtargets with P9 Altivec support.
+let Predicates = [HasVSX, HasP9Altivec, IsBigEndian, IsPPC64] in {
 def : Pat<(i64 (anyext (i32 (vector_extract v16i8:$S, i64:$Idx)))),
           (VEXTUBLX $Idx, $S)>;

 def : Pat<(i64 (anyext (i32 (vector_extract v8i16:$S, i64:$Idx)))),
           (VEXTUHLX (RLWINM8 $Idx, 1, 28, 30), $S)>;
 def : Pat<(i64 (anyext (i32 (vector_extract v8i16:$S, 0)))),
           (VEXTUHLX (LI8 0), $S)>;
 def : Pat<(i64 (anyext (i32 (vector_extract v8i16:$S, 1)))),
@@ 116 lines not shown @@
 def : Pat<(v4i32 (build_vector HWordToWord.BE_A0, HWordToWord.BE_A1,
                                HWordToWord.BE_A2, HWordToWord.BE_A3)),
           (v4i32 (VEXTSH2W $A))>;
 def : Pat<(v4i32 (build_vector ByteToWord.BE_A0, ByteToWord.BE_A1,
                                ByteToWord.BE_A2, ByteToWord.BE_A3)),
           (v4i32 (VEXTSB2W $A))>;
 def : Pat<(v2i64 (build_vector ByteToDWord.BE_A0, ByteToDWord.BE_A1)),
           (v2i64 (VEXTSB2D $A))>;
-} // HasVSX, HasP9Altivec, IsBigEndian
+} // HasVSX, HasP9Altivec, IsBigEndian, IsPPC64

 // Little endian Power9 VSX subtargets with P9 Altivec support.
 let Predicates = [HasVSX, HasP9Altivec, IsLittleEndian] in {
 def : Pat<(i64 (anyext (i32 (vector_extract v16i8:$S, i64:$Idx)))),
           (VEXTUBRX $Idx, $S)>;

 def : Pat<(i64 (anyext (i32 (vector_extract v8i16:$S, i64:$Idx)))),
           (VEXTUHRX (RLWINM8 $Idx, 1, 28, 30), $S)>;
@@ 120 lines not shown @@ def : Pat<(v4i32 (build_vector HWordToWord.LE_A0, HWordToWord.LE_A1,
           (v4i32 (VEXTSH2W $A))>;
 def : Pat<(v4i32 (build_vector ByteToWord.LE_A0, ByteToWord.LE_A1,
                                ByteToWord.LE_A2, ByteToWord.LE_A3)),
           (v4i32 (VEXTSB2W $A))>;
 def : Pat<(v2i64 (build_vector ByteToDWord.LE_A0, ByteToDWord.LE_A1)),
           (v2i64 (VEXTSB2D $A))>;
 } // HasVSX, HasP9Altivec, IsLittleEndian

-// Big endian VSX subtarget that supports additional direct moves from ISA3.0.
-let Predicates = [HasVSX, IsISA3_0, HasDirectMove, IsBigEndian] in {
+// Big endian 64Bit VSX subtarget that supports additional direct moves from
+// ISA3.0.
+let Predicates = [HasVSX, IsISA3_0, HasDirectMove, IsBigEndian, IsPPC64] in {
 def : Pat<(i64 (extractelt v2i64:$A, 1)),
           (i64 (MFVSRLD $A))>;
 // Better way to build integer vectors if we have MTVSRDD. Big endian.
 def : Pat<(v2i64 (build_vector i64:$rB, i64:$rA)),
           (v2i64 (MTVSRDD $rB, $rA))>;
 def : Pat<(v4i32 (build_vector i32:$A, i32:$B, i32:$C, i32:$D)),
           (MTVSRDD
             (RLDIMI AnyExts.B, AnyExts.A, 32, 0),
             (RLDIMI AnyExts.D, AnyExts.C, 32, 0))>;

 def : Pat<(f128 (PPCbuild_fp128 i64:$rB, i64:$rA)),
           (f128 (COPY_TO_REGCLASS (MTVSRDD $rB, $rA), VRRC))>;
-} // HasVSX, IsISA3_0, HasDirectMove, IsBigEndian
+} // HasVSX, IsISA3_0, HasDirectMove, IsBigEndian, IsPPC64

 // Little endian VSX subtarget that supports direct moves from ISA3.0.
 let Predicates = [HasVSX, IsISA3_0, HasDirectMove, IsLittleEndian] in {
 def : Pat<(i64 (extractelt v2i64:$A, 0)),
           (i64 (MFVSRLD $A))>;
 // Better way to build integer vectors if we have MTVSRDD. Little endian.
 def : Pat<(v2i64 (build_vector i64:$rA, i64:$rB)),
           (v2i64 (MTVSRDD $rB, $rA))>;
@@ 52 lines not shown @@

llvm/test/CodeGen/PowerPC/ppc-32bit-build-vector.ll

This file was added.

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -verify-machineinstrs -mtriple=powerpc -mcpu=pwr8 < %s |\
; RUN: FileCheck %s --check-prefix=32BIT

; RUN: llc -verify-machineinstrs -mtriple=powerpc64 -mcpu=pwr8 < %s |\
; RUN: FileCheck %s --check-prefix=64BIT

define dso_local fastcc void @BuildVectorICE() unnamed_addr #0 {
; 32BIT-LABEL: BuildVectorICE:
; 32BIT: # %bb.0: # %entry
; 32BIT-NEXT: stwu 1, -64(1)
; 32BIT-NEXT: .cfi_def_cfa_offset 64
; 32BIT-NEXT: lxvx 34, 0, 3
; 32BIT-NEXT: li 3, 0
; 32BIT-NEXT: addi 4, 1, 32
; 32BIT-NEXT: li 5, 0
; 32BIT-NEXT: .p2align 4
; 32BIT-NEXT: .LBB0_1: # %while.body
; 32BIT-NEXT: #
; 32BIT-NEXT: stw 5, 16(1)
; 32BIT-NEXT: stw 3, 32(1)
; 32BIT-NEXT: lxv 0, 16(1)
; 32BIT-NEXT: lxv 1, 32(1)
; 32BIT-NEXT: xxsldwi 0, 1, 0, 1
; 32BIT-NEXT: lxvwsx 1, 0, 4
; 32BIT-NEXT: xxsldwi 35, 0, 1, 3
; 32BIT-NEXT: vadduwm 3, 2, 3
; 32BIT-NEXT: xxspltw 36, 35, 1
; 32BIT-NEXT: vadduwm 3, 3, 4
; 32BIT-NEXT: stxv 35, 48(1)
; 32BIT-NEXT: lwz 5, 48(1)
; 32BIT-NEXT: b .LBB0_1
;
; 64BIT-LABEL: BuildVectorICE:
; 64BIT: # %bb.0: # %entry
; 64BIT-NEXT: lxvx 34, 0, 3
; 64BIT-NEXT: li 3, 0
; 64BIT-NEXT: li 4, 0
; 64BIT-NEXT: li 5, 0
; 64BIT-NEXT: rldimi 3, 3, 32, 0
; 64BIT-NEXT: .p2align 5
; 64BIT-NEXT: .LBB0_1: # %while.body
; 64BIT-NEXT: #
; 64BIT-NEXT: li 6, 0
; 64BIT-NEXT: rldimi 6, 5, 32, 0
; 64BIT-NEXT: mtvsrdd 35, 6, 3
; 64BIT-NEXT: vadduwm 3, 2, 3
; 64BIT-NEXT: xxspltw 36, 35, 1
; 64BIT-NEXT: vadduwm 3, 3, 4
; 64BIT-NEXT: vextuwlx 5, 4, 3
; 64BIT-NEXT: b .LBB0_1
entry:
  br label %while.body
while.body: ; preds = %while.body, %entry
  %newelement = phi i32 [ 0, %entry ], [ %5, %while.body ]
  %0 = insertelement <4 x i32> <i32 undef, i32 0, i32 0, i32 0>, i32 %newelement, i32 0
  %1 = load <4 x i32>, <4 x i32>* undef, align 1
  %2 = add <4 x i32> %1, %0
  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %4 = add <4 x i32> %2, %3
  %5 = extractelement <4 x i32> %4, i32 0
  br label %while.body
}
attributes #0 = { "target-features"="+altivec,+bpermd,+crypto,+direct-move,+extdiv,+htm,+power8-vector,+power9-vector,+vsx,-spe" }