This is an archive of the discontinued LLVM Phabricator instance.


Differential D68527

[WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering
ClosedPublic

Authored by tlively on Oct 4 2019, 4:49 PM.

Details

Reviewers

aheejin
dschuff

Commits

rGd5b7a4e2e8dc: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering
rL374188: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering

Summary

Adds the new v8x16.swizzle SIMD instruction as specified at
https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#swizzling-using-variable-indices.
In addition to adding swizzles as a candidate lowering in
LowerBUILD_VECTOR, also rewrites and simplifies the lowering to
minimize the number of replace_lanes necessary rather than trying to
minimize code size. This leads to more uses of v128.const instead of
splats, which is expected to increase performance.

The new code will be easier to tune once V8 implements all the vector
construction operations, and it will also be easier to add new
candidate instructions in the future if necessary.
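
The selection rule described in this summary can be sketched in standalone C++. This is a minimal illustration, not the code from the patch: the names `LaneCounts`, `InitKind`, `chooseInit`, and `replaceLanesNeeded` are hypothetical, and the sketch assumes the per-lane counts have already been computed and that all three candidates (swizzle, v128.const, splat) are available.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; only the comparison order is taken
// from the revision ("prefer swizzles over vector consts over splats").
enum class InitKind { Swizzle, Const, Splat };

struct LaneCounts {
  std::size_t Defined;  // lanes that are not undef
  std::size_t Swizzle;  // lanes matching the most common (src, indices) pair
  std::size_t Constant; // lanes holding a constant
  std::size_t Splat;    // lanes equal to the most common value
};

// Pick the instruction that initializes the most lanes at once.
InitKind chooseInit(const LaneCounts &C) {
  if (C.Swizzle >= C.Splat && C.Swizzle >= C.Constant)
    return InitKind::Swizzle;
  if (C.Constant >= C.Splat)
    return InitKind::Const;
  return InitKind::Splat;
}

// Every defined lane not covered by the initial instruction needs one
// replace_lane afterwards.
std::size_t replaceLanesNeeded(const LaneCounts &C) {
  switch (chooseInit(C)) {
  case InitKind::Swizzle:
    return C.Defined - C.Swizzle;
  case InitKind::Const:
    return C.Defined - C.Constant;
  case InitKind::Splat:
    return C.Defined - C.Splat;
  }
  return 0;
}
```

For the `same_const_one_replaced_i16x8` test in this revision (seven equal constant lanes plus one dynamic lane), `chooseInit` picks `InitKind::Const` and one replace_lane remains, which agrees with the updated CHECK lines.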

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 39018
Build 39017: arc lint + arc unit

Event Timeline

tlively created this revision. Oct 4 2019, 4:49 PM

Herald added a project: Restricted Project. Oct 4 2019, 4:49 PM

Herald added subscribers: llvm-commits, sunfish, hiraditya and 2 others.

Harbormaster completed remote builds in B39018: Diff 223340. Oct 4 2019, 4:49 PM

tlively added a child revision: D68531: [WebAssembly] Add builtin and intrinsic for v8x16.swizzle. Oct 4 2019, 6:01 PM

I remember that we previously had somewhat complicated logic to calculate the total number of instruction bytes for the case where we use v128.const versus the case where we use splats. Don't we need that anymore? Can we make the decision based solely on the number of swizzles, consts, and splats?

Is the performance of v128.const better than splats? How is performance of swizzles compared to v128.const?

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1350	Would using `count_if` in place of `find_if` be simpler?
1384	Nit: The variable names for the same things in `GetSwizzleSrcs` are `SrcVec` and `IndexVec`. Making the variable names the same in the two places might make reading easier.
1388	Is using `forward_as_tuple` any different from using `tie` again in this case, given that this is not passed as an argument to a function?
1426	It's not in this CL, but is there a case where this condition is not satisfied?

Address variable naming comment

Harbormaster completed remote builds in B39190: Diff 223922. Oct 8 2019, 12:31 PM

In D68527#1699832, @aheejin wrote:

I remember that we previously had somewhat complicated logic to calculate the total number of instruction bytes for the case where we use v128.const versus the case where we use splats. Don't we need that anymore? Can we make the decision based solely on the number of swizzles, consts, and splats?

Minimizing code size is not as important for SIMD as maximizing performance, so I dumped that complicated logic. I am led to believe that v128.const will be faster than splats once it is implemented, but we have no way to measure yet. We can make the decision based on whatever heuristic we want, but minimizing number of instructions seems like a good metric for now until we can run experiments to tune the selection algorithm.
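
For reference, the byte-count estimate removed by this revision can be reconstructed from the left-hand side of the diff. The sketch below is a standalone approximation: the constants and formulas are copied from the removed code, but the names `OldCounts`, `ByteEstimate`, and `estimateBytes` and the surrounding structure are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical container for the counts the removed code tracked.
struct OldCounts {
  std::size_t Lanes;      // number of lanes in the vector
  std::size_t NumConst;   // constant lanes
  std::size_t NumDynamic; // non-constant, non-undef lanes
  std::size_t NumCommon;  // occurrences of the most common lane value
  bool SplatIsConst;      // whether the most common value is a constant
};

struct ByteEstimate {
  std::size_t ConstInit; // v128.const followed by replace_lanes
  std::size_t SplatInit; // splat followed by replace_lanes
};

ByteEstimate estimateBytes(const OldCounts &C) {
  // {i32,i64,f32,f64}.const opcode and value
  const std::size_t ConstBytes = 1 + std::max<std::size_t>(4, 16 / C.Lanes);
  const std::size_t SplatBytes = 2; // SIMD prefix and opcode
  const std::size_t SplatConstBytes = SplatBytes + ConstBytes;
  const std::size_t ReplaceBytes = 3; // SIMD prefix, opcode, and lane index
  const std::size_t ReplaceConstBytes = ReplaceBytes + ConstBytes;
  const std::size_t VecConstBytes = 18; // prefix, opcode, 128-bit value

  ByteEstimate E;
  E.ConstInit = VecConstBytes + C.NumDynamic * ReplaceBytes;
  if (C.SplatIsConst)
    E.SplatInit = SplatConstBytes +
                  (C.NumConst - C.NumCommon) * ReplaceConstBytes +
                  C.NumDynamic * ReplaceBytes;
  else
    E.SplatInit = SplatBytes + C.NumConst * ReplaceConstBytes +
                  (C.NumDynamic - C.NumCommon) * ReplaceBytes;
  return E;
}
```

With eight lanes, seven equal constant lanes, and one dynamic lane, this gives 21 bytes for the v128.const path and 10 bytes for the splat path, so the old logic chose the splat. That matches the old CHECK lines for the `same_const_one_replaced` i16x8 test.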

Is the performance of v128.const better than splats? How is performance of swizzles compared to v128.const?

Yes, I believe v128.const will be faster than splats. I don't know how swizzles and v128.const compare, but I do know that emulating swizzles requires a lot of instructions per lane but emulating a v128.const only requires a single replace_lane and constant per lane. So it makes sense to prefer swizzles over v128.consts for now.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1350	No, I need to get the iterator to the proper entry since I'm using a vector as an associative array here. If I used `count_if` and it returned 1, I would still need to find the entry to increment the count, so `find_if` is simpler because it gives me that entry directly.
1388	It turns out you can't nest `std::tie`. I have no idea why, but I got this solution from https://stackoverflow.com/questions/21298732/can-we-do-deep-tie-with-a-c1y-stdtie-like-function.
1426	Yes, for example when doing a sign extending load of an i8 to an i32 then splatting that i32.
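
The first two answers above can be illustrated outside of LLVM. The sketch below uses `std::vector` and `std::string` as stand-ins for `SmallVector` and `SDValue`; the names `Counts`, `addCount`, `mostCommon`, `Key`, `Unpacked`, and `unpackMostCommon` are hypothetical. It shows why `find_if` fits a vector used as an associative array, and how `std::forward_as_tuple` unpacks a nested pair. As far as the standard library is concerned, `std::tie` cannot be nested directly because it takes its arguments by non-const lvalue reference, and the inner `std::tie(...)` is a temporary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// A small vector of (value, count) entries used as an associative array.
template <typename T>
using Counts = std::vector<std::pair<T, std::size_t>>;

// find_if returns an iterator to the matching entry, so the count can be
// incremented in place; count_if would only report whether one exists.
template <typename T>
void addCount(Counts<T> &C, const T &Val) {
  auto It = std::find_if(
      C.begin(), C.end(),
      [&Val](const std::pair<T, std::size_t> &E) { return E.first == Val; });
  if (It == C.end())
    C.emplace_back(Val, 1);
  else
    ++It->second;
}

// Returns a copy of the entry with the highest count.
template <typename T>
std::pair<T, std::size_t> mostCommon(const Counts<T> &C) {
  assert(!C.empty() && "no entries to choose from");
  return *std::max_element(
      C.begin(), C.end(),
      [](const auto &A, const auto &B) { return A.second < B.second; });
}

// Stand-in for the (source vector, index vector) pair.
using Key = std::pair<std::string, std::string>;

struct Unpacked {
  std::string Src;
  std::string Indices;
  std::size_t N = 0;
};

Unpacked unpackMostCommon(const Counts<Key> &C) {
  Unpacked U;
  // std::tie(std::tie(U.Src, U.Indices), U.N) does not compile because the
  // inner tie is an rvalue; forward_as_tuple accepts it.
  std::forward_as_tuple(std::tie(U.Src, U.Indices), U.N) = mostCommon(C);
  return U;
}
```

Calling `addCount` twice with the same key and once with a different key leaves two entries, and `unpackMostCommon` then yields the repeated key with a count of 2.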

In D68527#1700360, @tlively wrote:

In D68527#1699832, @aheejin wrote:

I remember that we previously had somewhat complicated logic to calculate the total number of instruction bytes for the case where we use v128.const versus the case where we use splats. Don't we need that anymore? Can we make the decision based solely on the number of swizzles, consts, and splats?

Minimizing code size is not as important for SIMD as maximizing performance, so I dumped that complicated logic. I am led to believe that v128.const will be faster than splats once it is implemented, but we have no way to measure yet. We can make the decision based on whatever heuristic we want, but minimizing number of instructions seems like a good metric for now until we can run experiments to tune the selection algorithm.

Wouldn't minimizing the number of instructions be the same thing as minimizing the number of bytes, only less accurate?

Is the performance of v128.const better than splats? How is performance of swizzles compared to v128.const?

Yes, I believe v128.const will be faster than splats. I don't know how swizzles and v128.const compare, but I do know that emulating swizzles requires a lot of instructions per lane but emulating a v128.const only requires a single replace_lane and constant per lane. So it makes sense to prefer swizzles over v128.consts for now.

I don't understand this part well. If swizzles are a lot more complicated than v128.const in execution, doesn't that mean swizzles will likely take longer to execute in wasm? Why the opposite?

The above are just some passing questions; I'm not suggesting we restore the byte computation logic or add other, more complicated logic at this point, given that we don't have any measurable performance data at hand. Optimizing at this point seems premature anyway. LGTM.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1384	In `GetSwizzleSrcs`, `IndexVec` is still `IndexVec`, while `SrcVec` was changed to `SwizzleSrc`. Was that intentional?

This revision is now accepted and ready to land. Oct 8 2019, 10:59 PM

In D68527#1700939, @aheejin wrote:

Wouldn't minimizing the number of instructions be the same thing as minimizing the number of bytes, only less accurate?

It's true that minimizing instructions approximates minimizing bytes, but it also stands on its own as a reasonable metric. In this case minimizing instructions makes more sense than minimizing bytes.

If swizzles are a lot more complicated than v128.const in execution, doesn't that mean swizzles will likely take longer to execute in wasm? Why the opposite?

Swizzles lower directly to hardware instructions so they are fast for engines to execute. But doing the same operation without a swizzle instruction would require a long sequence of other wasm instructions and therefore be slow to execute. Because this difference is large for swizzles it is a good idea to prefer to use them when possible.
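
To make the emulation cost concrete, here is a scalar model of what a single `v8x16.swizzle` computes, following the semantics in the SIMD proposal linked in the summary: each output byte is selected from the source by the corresponding index byte, and out-of-range indices produce 0. The names `V16` and `swizzle` are hypothetical, and this is a sketch of the semantics, not of how any engine implements it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector viewed as 16 bytes.
using V16 = std::array<std::uint8_t, 16>;

// Scalar model of v8x16.swizzle: lane I of the result is Src[Indices[I]],
// or 0 when Indices[I] is outside the range 0..15.
V16 swizzle(const V16 &Src, const V16 &Indices) {
  V16 Out{};
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = Indices[I] < 16 ? Src[Indices[I]] : 0;
  return Out;
}
```

Without the instruction, each of the 16 lanes needs its own index extraction, variable-index source extraction, and replace_lane, which is the pattern that `GetSwizzleSrcs` in this revision recognizes per lane.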

tlively marked an inline comment as done. Oct 9 2019, 10:37 AM

tlively added inline comments.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1384	Not intentional! Thanks.

Closed by commit rGd5b7a4e2e8dc: [WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering (authored by tlively). Oct 9 2019, 10:38 AM

This revision was automatically updated to reflect the committed changes.

In D68527#1701774, @tlively wrote:

In D68527#1700939, @aheejin wrote:

If swizzles are a lot more complicated than v128.const in execution, doesn't that mean swizzles will likely take longer to execute in wasm? Why the opposite?

Swizzles lower directly to hardware instructions so they are fast for engines to execute. But doing the same operation without a swizzle instruction would require a long sequence of other wasm instructions and therefore be slow to execute. Because this difference is large for swizzles it is a good idea to prefer to use them when possible.

We are deciding which one among const/swizzle/splat to use based on the number of lanes hit by the instruction. The rest is the number of replace_lanes, so I don't think swizzles are more expensive to emulate than others, because after a single const/swizzle/splat, all emulation cost is down to the number of replace_lanes...? Anyway, not really related to the CL itself

Revision Contents

Path	Size
llvm/lib/Target/WebAssembly/WebAssemblyISD.def	1 line
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp	196 lines
llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td	9 lines
llvm/test/CodeGen/WebAssembly/simd-build-vector.ll	187 lines
llvm/test/MC/WebAssembly/simd-encodings.s	3 lines

Diff 223340

llvm/lib/Target/WebAssembly/WebAssemblyISD.def

	// A wrapper node for TargetExternalSymbol, TargetGlobalAddress, and MCSymbol			// A wrapper node for TargetExternalSymbol, TargetGlobalAddress, and MCSymbol
	HANDLE_NODETYPE(Wrapper)			HANDLE_NODETYPE(Wrapper)
	// A special wapper used in PIC code for __memory_base/__table_base relcative			// A special wapper used in PIC code for __memory_base/__table_base relcative
	// access.			// access.
	HANDLE_NODETYPE(WrapperPIC)			HANDLE_NODETYPE(WrapperPIC)
	HANDLE_NODETYPE(BR_IF)			HANDLE_NODETYPE(BR_IF)
	HANDLE_NODETYPE(BR_TABLE)			HANDLE_NODETYPE(BR_TABLE)
	HANDLE_NODETYPE(SHUFFLE)			HANDLE_NODETYPE(SHUFFLE)
				HANDLE_NODETYPE(SWIZZLE)
	HANDLE_NODETYPE(VEC_SHL)			HANDLE_NODETYPE(VEC_SHL)
	HANDLE_NODETYPE(VEC_SHR_S)			HANDLE_NODETYPE(VEC_SHR_S)
	HANDLE_NODETYPE(VEC_SHR_U)			HANDLE_NODETYPE(VEC_SHR_U)
	HANDLE_NODETYPE(LOAD_SPLAT)			HANDLE_NODETYPE(LOAD_SPLAT)
	HANDLE_NODETYPE(THROW)			HANDLE_NODETYPE(THROW)
	HANDLE_NODETYPE(MEMORY_COPY)			HANDLE_NODETYPE(MEMORY_COPY)
	HANDLE_NODETYPE(MEMORY_FILL)			HANDLE_NODETYPE(MEMORY_FILL)

	// add memory opcodes starting at ISD::FIRST_TARGET_MEMORY_OPCODE here...			// add memory opcodes starting at ISD::FIRST_TARGET_MEMORY_OPCODE here...

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

	}			}

	SDValue WebAssemblyTargetLowering::LowerBUILD_VECTOR(SDValue Op,			SDValue WebAssemblyTargetLowering::LowerBUILD_VECTOR(SDValue Op,
	SelectionDAG &DAG) const {			SelectionDAG &DAG) const {
	SDLoc DL(Op);			SDLoc DL(Op);
	const EVT VecT = Op.getValueType();			const EVT VecT = Op.getValueType();
	const EVT LaneT = Op.getOperand(0).getValueType();			const EVT LaneT = Op.getOperand(0).getValueType();
	const size_t Lanes = Op.getNumOperands();			const size_t Lanes = Op.getNumOperands();
				bool CanSwizzle = Subtarget->hasUnimplementedSIMD128() && VecT == MVT::v16i8;

				// BUILD_VECTORs are lowered to the instruction that initializes the highest
				// possible number of lanes at once followed by a sequence of replace_lane
				// instructions to individually initialize any remaining lanes.

				// TODO: Tune this. For example, lanewise swizzling is very expensive, so
				// swizzled lanes should be given greater weight.

				// TODO: Investigate building vectors by shuffling together vectors built by
				// separately specialized means.

	auto IsConstant = [](const SDValue &V) {			auto IsConstant = [](const SDValue &V) {
	return V.getOpcode() == ISD::Constant || V.getOpcode() == ISD::ConstantFP;			return V.getOpcode() == ISD::Constant || V.getOpcode() == ISD::ConstantFP;
	};			};

	// Find the most common operand, which is approximately the best to splat			// Returns the source vector and index vector pair if they exist. Checks for:
	using Entry = std::pair<SDValue, size_t>;			// (extract_vector_elt
	SmallVector<Entry, 16> ValueCounts;			// $src,
	size_t NumConst = 0, NumDynamic = 0;			// (sign_extend_inreg (extract_vector_elt $indices, $i))
	for (const SDValue &Lane : Op->op_values()) {			// )
	if (Lane.isUndef()) {			auto GetSwizzleSrcs = [](size_t I, const SDValue &Lane) {
	continue;			auto Bail = std::make_pair(SDValue(), SDValue());
	} else if (IsConstant(Lane)) {			if (Lane->getOpcode() != ISD::EXTRACT_VECTOR_ELT)
	NumConst++;			return Bail;
	} else {			const SDValue &SrcVec = Lane->getOperand(0);
	NumDynamic++;			const SDValue &IndexExt = Lane->getOperand(1);
	}			if (IndexExt->getOpcode() != ISD::SIGN_EXTEND_INREG)
	auto CountIt = std::find_if(ValueCounts.begin(), ValueCounts.end(),			return Bail;
	[&Lane](Entry A) { return A.first == Lane; });			const SDValue &Index = IndexExt->getOperand(0);
	if (CountIt == ValueCounts.end()) {			if (Index->getOpcode() != ISD::EXTRACT_VECTOR_ELT)
	ValueCounts.emplace_back(Lane, 1);			return Bail;
				const SDValue &IndexVec = Index->getOperand(0);
				if (SrcVec.getValueType() != MVT::v16i8 ||
				IndexVec.getValueType() != MVT::v16i8 ||
				Index->getOperand(1)->getOpcode() != ISD::Constant ||
				Index->getConstantOperandVal(1) != I)
				return Bail;
				return std::make_pair(SrcVec, IndexVec);
				};

				using ValueEntry = std::pair<SDValue, size_t>;
				SmallVector<ValueEntry, 16> SplatValueCounts;

				using SwizzleEntry = std::pair<std::pair<SDValue, SDValue>, size_t>;
				SmallVector<SwizzleEntry, 16> SwizzleCounts;

				auto AddCount = [](auto &Counts, const auto &Val) {
				auto CountIt = std::find_if(Counts.begin(), Counts.end(),
				[&Val](auto E) { return E.first == Val; });
				if (CountIt == Counts.end()) {
				Counts.emplace_back(Val, 1);
	} else {			} else {
	CountIt->second++;			CountIt->second++;
	}			}
	}			};
				aheejin: Would using `count_if` in place of `find_if` be simpler?
				tlively: No, I need to get the iterator to the proper entry since I'm using a vector as an associative array here. If I used `count_if` and it returned 1, I would still need to find the entry to increment the count, so `find_if` is simpler because it gives me that entry directly.

				auto GetMostCommon = [](auto &Counts) {
	auto CommonIt =			auto CommonIt =
	std::max_element(ValueCounts.begin(), ValueCounts.end(),			std::max_element(Counts.begin(), Counts.end(),
	[](Entry A, Entry B) { return A.second < B.second; });			[](auto A, auto B) { return A.second < B.second; });
	assert(CommonIt != ValueCounts.end() && "Unexpected all-undef build_vector");			assert(CommonIt != Counts.end() && "Unexpected all-undef build_vector");
	SDValue SplatValue = CommonIt->first;			return *CommonIt;
	size_t NumCommon = CommonIt->second;			};

				size_t NumConstantLanes = 0;

				// Count eligible lanes for each type of vector creation op
				for (size_t I = 0; I < Lanes; ++I) {
				const SDValue &Lane = Op->getOperand(I);
				if (Lane.isUndef())
				continue;

				AddCount(SplatValueCounts, Lane);

				if (IsConstant(Lane)) {
				NumConstantLanes++;
				} else if (CanSwizzle) {
				auto SwizzleSrcs = GetSwizzleSrcs(I, Lane);
				if (SwizzleSrcs.first)
				AddCount(SwizzleCounts, SwizzleSrcs);
				}
				}

	// If v128.const is available, consider using it instead of a splat			SDValue SplatValue;
				size_t NumSplatLanes;
				std::tie(SplatValue, NumSplatLanes) = GetMostCommon(SplatValueCounts);

				SDValue SwizzleSrc;
				SDValue SwizzleIndices;
				aheejin: Nit: Variable names for the same things in `GetSwizzleSrcs` are `SrcVec` and `IndexVec`. Making the variable names the same in the two places might make reading easier.
				aheejin: In `GetSwizzleSrcs`, `IndexVec` is still `IndexVec`, while `SrcVec` was changed to `SwizzleSrc`. Was that intentional?
				tlively: Not intentional! Thanks.
				size_t NumSwizzleLanes = 0;
				if (SwizzleCounts.size())
				std::forward_as_tuple(std::tie(SwizzleSrc, SwizzleIndices),
				NumSwizzleLanes) = GetMostCommon(SwizzleCounts);
				aheejin: Is using `forward_as_tuple` any different from using `tie` again in this case, given that this is not passed as an argument to a function?
				tlively: It turns out you can't nest `std::tie`. I have no idea why, but I got this solution from https://stackoverflow.com/questions/21298732/can-we-do-deep-tie-with-a-c1y-stdtie-like-function.

				// Predicate returning true if the lane is properly initialized by the
				// original instruction
				std::function<bool(size_t, const SDValue &)> IsLaneConstructed;
				SDValue Result;
	if (Subtarget->hasUnimplementedSIMD128()) {			if (Subtarget->hasUnimplementedSIMD128()) {
	// {i32,i64,f32,f64}.const opcode, and value			// Prefer swizzles over vector consts over splats
	const size_t ConstBytes = 1 + std::max(size_t(4), 16 / Lanes);			if (NumSwizzleLanes >= NumSplatLanes &&
	// SIMD prefix and opcode			NumSwizzleLanes >= NumConstantLanes) {
	const size_t SplatBytes = 2;			Result = DAG.getNode(WebAssemblyISD::SWIZZLE, DL, VecT, SwizzleSrc,
	const size_t SplatConstBytes = SplatBytes + ConstBytes;			SwizzleIndices);
	// SIMD prefix, opcode, and lane index			auto Swizzled = std::make_pair(SwizzleSrc, SwizzleIndices);
	const size_t ReplaceBytes = 3;			IsLaneConstructed = [&, Swizzled](size_t I, const SDValue &Lane) {
	const size_t ReplaceConstBytes = ReplaceBytes + ConstBytes;			return Swizzled == GetSwizzleSrcs(I, Lane);
	// SIMD prefix, v128.const opcode, and 128-bit value			};
	const size_t VecConstBytes = 18;			} else if (NumConstantLanes >= NumSplatLanes) {
	// Initial v128.const and a replace_lane for each non-const operand
	const size_t ConstInitBytes = VecConstBytes + NumDynamic * ReplaceBytes;
	// Initial splat and all necessary replace_lanes
	const size_t SplatInitBytes =
	IsConstant(SplatValue)
	// Initial constant splat
	? (SplatConstBytes +
	// Constant replace_lanes
	(NumConst - NumCommon) * ReplaceConstBytes +
	// Dynamic replace_lanes
	(NumDynamic * ReplaceBytes))
	// Initial dynamic splat
	: (SplatBytes +
	// Constant replace_lanes
	(NumConst * ReplaceConstBytes) +
	// Dynamic replace_lanes
	(NumDynamic - NumCommon) * ReplaceBytes);
	if (ConstInitBytes < SplatInitBytes) {
	// Create build_vector that will lower to initial v128.const
	SmallVector<SDValue, 16> ConstLanes;			SmallVector<SDValue, 16> ConstLanes;
	for (const SDValue &Lane : Op->op_values()) {			for (const SDValue &Lane : Op->op_values()) {
	if (IsConstant(Lane)) {			if (IsConstant(Lane)) {
	ConstLanes.push_back(Lane);			ConstLanes.push_back(Lane);
	} else if (LaneT.isFloatingPoint()) {			} else if (LaneT.isFloatingPoint()) {
	ConstLanes.push_back(DAG.getConstantFP(0, DL, LaneT));			ConstLanes.push_back(DAG.getConstantFP(0, DL, LaneT));
	} else {			} else {
	ConstLanes.push_back(DAG.getConstant(0, DL, LaneT));			ConstLanes.push_back(DAG.getConstant(0, DL, LaneT));
	}			}
	}			}
	SDValue Result = DAG.getBuildVector(VecT, DL, ConstLanes);			Result = DAG.getBuildVector(VecT, DL, ConstLanes);
	// Add replace_lane instructions for non-const lanes			IsLaneConstructed = [&](size_t _, const SDValue &Lane) {
	for (size_t I = 0; I < Lanes; ++I) {			return IsConstant(Lane);
	const SDValue &Lane = Op->getOperand(I);			};
	if (!Lane.isUndef() && !IsConstant(Lane))
	Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VecT, Result, Lane,
	DAG.getConstant(I, DL, MVT::i32));
	}
	return Result;
	}			}
	}			}
	// Use a splat for the initial vector			if (!Result) {
	SDValue Result;			// Use a splat, but possibly a load_splat
	// Possibly a load_splat
	LoadSDNode *SplattedLoad;			LoadSDNode *SplattedLoad;
	if (Subtarget->hasUnimplementedSIMD128() &&			if (Subtarget->hasUnimplementedSIMD128() &&
	(SplattedLoad = dyn_cast<LoadSDNode>(SplatValue)) &&			(SplattedLoad = dyn_cast<LoadSDNode>(SplatValue)) &&
	SplattedLoad->getMemoryVT() == VecT.getVectorElementType()) {			SplattedLoad->getMemoryVT() == VecT.getVectorElementType()) {
				aheejin: It's not in this CL, but is there a case this condition is not satisfied?
				tlively: Yes, for example when doing a sign extending load of an i8 to an i32 then splatting that i32.
	Result = DAG.getNode(WebAssemblyISD::LOAD_SPLAT, DL, VecT, SplatValue);			Result = DAG.getNode(WebAssemblyISD::LOAD_SPLAT, DL, VecT, SplatValue);
	} else {			} else {
	Result = DAG.getSplatBuildVector(VecT, DL, SplatValue);			Result = DAG.getSplatBuildVector(VecT, DL, SplatValue);
	}			}
	// Add replace_lane instructions for other values			IsLaneConstructed = [&](size_t _, const SDValue &Lane) {
				return Lane == SplatValue;
				};
				}

				// Add replace_lane instructions for any unhandled values
	for (size_t I = 0; I < Lanes; ++I) {			for (size_t I = 0; I < Lanes; ++I) {
	const SDValue &Lane = Op->getOperand(I);			const SDValue &Lane = Op->getOperand(I);
	if (Lane != SplatValue)			if (!Lane.isUndef() && !IsLaneConstructed(I, Lane))
	Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VecT, Result, Lane,			Result = DAG.getNode(ISD::INSERT_VECTOR_ELT, DL, VecT, Result, Lane,
	DAG.getConstant(I, DL, MVT::i32));			DAG.getConstant(I, DL, MVT::i32));
	}			}

	return Result;			return Result;
	}			}

	SDValue			SDValue
	WebAssemblyTargetLowering::LowerVECTOR_SHUFFLE(SDValue Op,			WebAssemblyTargetLowering::LowerVECTOR_SHUFFLE(SDValue Op,
	SelectionDAG &DAG) const {			SelectionDAG &DAG) const {
	SDLoc DL(Op);			SDLoc DL(Op);
	ArrayRef<int> Mask = cast<ShuffleVectorSDNode>(Op.getNode())->getMask();			ArrayRef<int> Mask = cast<ShuffleVectorSDNode>(Op.getNode())->getMask();

llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td

def : Pat<(vec_t (wasm_shuffle (vec_t V128:$x), (vec_t V128:$y),
(i32 LaneIdx32:$m4), (i32 LaneIdx32:$m5),		(i32 LaneIdx32:$m4), (i32 LaneIdx32:$m5),
(i32 LaneIdx32:$m6), (i32 LaneIdx32:$m7),		(i32 LaneIdx32:$m6), (i32 LaneIdx32:$m7),
(i32 LaneIdx32:$m8), (i32 LaneIdx32:$m9),		(i32 LaneIdx32:$m8), (i32 LaneIdx32:$m9),
(i32 LaneIdx32:$mA), (i32 LaneIdx32:$mB),		(i32 LaneIdx32:$mA), (i32 LaneIdx32:$mB),
(i32 LaneIdx32:$mC), (i32 LaneIdx32:$mD),		(i32 LaneIdx32:$mC), (i32 LaneIdx32:$mD),
(i32 LaneIdx32:$mE), (i32 LaneIdx32:$mF)))>;		(i32 LaneIdx32:$mE), (i32 LaneIdx32:$mF)))>;
}		}

		// Swizzle lanes: v8x16.swizzle
		def wasm_swizzle_t : SDTypeProfile<1, 2, []>;
		def wasm_swizzle : SDNode<"WebAssemblyISD::SWIZZLE", wasm_swizzle_t>;
		defm SWIZZLE :
		SIMD_I<(outs V128:$dst), (ins V128:$src, V128:$mask), (outs), (ins),
		[(set (v16i8 V128:$dst),
		(wasm_swizzle (v16i8 V128:$src), (v16i8 V128:$mask)))],
		"v8x16.swizzle\t$dst, $src, $mask", "v8x16.swizzle", 192>;

// Create vector with identical lanes: splat		// Create vector with identical lanes: splat
def splat2 : PatFrag<(ops node:$x), (build_vector node:$x, node:$x)>;		def splat2 : PatFrag<(ops node:$x), (build_vector node:$x, node:$x)>;
def splat4 : PatFrag<(ops node:$x), (build_vector		def splat4 : PatFrag<(ops node:$x), (build_vector
node:$x, node:$x, node:$x, node:$x)>;		node:$x, node:$x, node:$x, node:$x)>;
def splat8 : PatFrag<(ops node:$x), (build_vector		def splat8 : PatFrag<(ops node:$x), (build_vector
node:$x, node:$x, node:$x, node:$x,		node:$x, node:$x, node:$x, node:$x,
node:$x, node:$x, node:$x, node:$x)>;		node:$x, node:$x, node:$x, node:$x)>;
def splat16 : PatFrag<(ops node:$x), (build_vector		def splat16 : PatFrag<(ops node:$x), (build_vector

llvm/test/CodeGen/WebAssembly/simd-build-vector.ll

	; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+unimplemented-simd128 | FileCheck %s			; RUN: llc < %s -asm-verbose=false -verify-machineinstrs -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -mattr=+unimplemented-simd128 | FileCheck %s

	; Test that the logic to choose between v128.const vector			; Test that the logic to choose between v128.const vector
	; initialization and splat vector initialization and to optimize the			; initialization and splat vector initialization and to optimize the
	; choice of splat value works correctly.			; choice of splat value works correctly.

	target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"			target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
	target triple = "wasm32-unknown-unknown"			target triple = "wasm32-unknown-unknown"

	; CHECK-LABEL: same_const_one_replaced_i8x16:			; CHECK-LABEL: same_const_one_replaced_i16x8:
	; CHECK-NEXT: .functype same_const_one_replaced_i8x16 (i32) -> (v128)			; CHECK-NEXT: .functype same_const_one_replaced_i16x8 (i32) -> (v128)
	; CHECK-NEXT: i32.const $push[[L0:[0-9]+]]=, 42			; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 42, 42, 42, 42, 42, 0, 42, 42
	; CHECK-NEXT: i16x8.splat $push[[L1:[0-9]+]]=, $pop[[L0]]			; CHECK-NEXT: i16x8.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 5, $0
	; CHECK-NEXT: i16x8.replace_lane $push[[L2:[0-9]+]]=, $pop[[L1]], 5, $0			; CHECK-NEXT: return $pop[[L1]]
	; CHECK-NEXT: return $pop[[L2]]			define <8 x i16> @same_const_one_replaced_i16x8(i16 %x) {
	define <8 x i16> @same_const_one_replaced_i8x16(i16 %x) {
	%v = insertelement			%v = insertelement
	<8 x i16> <i16 42, i16 42, i16 42, i16 42, i16 42, i16 42, i16 42, i16 42>,			<8 x i16> <i16 42, i16 42, i16 42, i16 42, i16 42, i16 42, i16 42, i16 42>,
	i16 %x,			i16 %x,
	i32 5			i32 5
	ret <8 x i16> %v			ret <8 x i16> %v
	}			}

	; CHECK-LABEL: different_const_one_replaced_i8x16:			; CHECK-LABEL: different_const_one_replaced_i16x8:
	; CHECK-NEXT: .functype different_const_one_replaced_i8x16 (i32) -> (v128)			; CHECK-NEXT: .functype different_const_one_replaced_i16x8 (i32) -> (v128)
	; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 1, -2, 3, -4, 5, 0, 7, -8			; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 1, -2, 3, -4, 5, 0, 7, -8
	; CHECK-NEXT: i16x8.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 5, $0			; CHECK-NEXT: i16x8.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 5, $0
	; CHECK-NEXT: return $pop[[L1]]			; CHECK-NEXT: return $pop[[L1]]
	define <8 x i16> @different_const_one_replaced_i8x16(i16 %x) {			define <8 x i16> @different_const_one_replaced_i16x8(i16 %x) {
	%v = insertelement			%v = insertelement
	<8 x i16> <i16 1, i16 -2, i16 3, i16 -4, i16 5, i16 -6, i16 7, i16 -8>,			<8 x i16> <i16 1, i16 -2, i16 3, i16 -4, i16 5, i16 -6, i16 7, i16 -8>,
	i16 %x,			i16 %x,
	i32 5			i32 5
	ret <8 x i16> %v			ret <8 x i16> %v
	}			}

	; CHECK-LABEL: same_const_one_replaced_f32x4:			; CHECK-LABEL: same_const_one_replaced_f32x4:
	; CHECK-NEXT: .functype same_const_one_replaced_f32x4 (f32) -> (v128)			; CHECK-NEXT: .functype same_const_one_replaced_f32x4 (f32) -> (v128)
	; CHECK-NEXT: f32.const $push[[L0:[0-9]+]]=, 0x1.5p5			; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0x1.5p5, 0x1.5p5, 0x0p0, 0x1.5p5
	; CHECK-NEXT: f32x4.splat $push[[L1:[0-9]+]]=, $pop[[L0]]			; CHECK-NEXT: f32x4.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 2, $0
	; CHECK-NEXT: f32x4.replace_lane $push[[L2:[0-9]+]]=, $pop[[L1]], 2, $0			; CHECK-NEXT: return $pop[[L1]]
	; CHECK-NEXT: return $pop[[L2]]
	define <4 x float> @same_const_one_replaced_f32x4(float %x) {			define <4 x float> @same_const_one_replaced_f32x4(float %x) {
	%v = insertelement			%v = insertelement
	<4 x float> <float 42., float 42., float 42., float 42.>,			<4 x float> <float 42., float 42., float 42., float 42.>,
	float %x,			float %x,
	i32 2			i32 2
	ret <4 x float> %v			ret <4 x float> %v
	}			}

	; CHECK-LABEL: different_const_one_replaced_f32x4:			; CHECK-LABEL: different_const_one_replaced_f32x4:
	; CHECK-NEXT: .functype different_const_one_replaced_f32x4 (f32) -> (v128)			; CHECK-NEXT: .functype different_const_one_replaced_f32x4 (f32) -> (v128)
	; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0x1p0, 0x1p1, 0x0p0, 0x1p2			; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0x1p0, 0x1p1, 0x0p0, 0x1p2
	; CHECK-NEXT: f32x4.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 2, $0			; CHECK-NEXT: f32x4.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 2, $0
	; CHECK-NEXT: return $pop[[L1]]			; CHECK-NEXT: return $pop[[L1]]
	define <4 x float> @different_const_one_replaced_f32x4(float %x) {			define <4 x float> @different_const_one_replaced_f32x4(float %x) {
	%v = insertelement			%v = insertelement
	<4 x float> <float 1., float 2., float 3., float 4.>,			<4 x float> <float 1., float 2., float 3., float 4.>,
	float %x,			float %x,
	i32 2			i32 2
	ret <4 x float> %v			ret <4 x float> %v
	}			}

	; CHECK-LABEL: splat_common_const_i32x4:			; CHECK-LABEL: splat_common_const_i32x4:
	; CHECK-NEXT: .functype splat_common_const_i32x4 () -> (v128)			; CHECK-NEXT: .functype splat_common_const_i32x4 () -> (v128)
	; CHECK-NEXT: i32.const $push[[L0:[0-9]+]]=, 3			; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0, 3, 3, 1
	; CHECK-NEXT: i32x4.splat $push[[L1:[0-9]+]]=, $pop[[L0]]			; CHECK-NEXT: return $pop[[L0]]
	; CHECK-NEXT: i32.const $push[[L2:[0-9]+]]=, 1
	; CHECK-NEXT: i32x4.replace_lane $push[[L3:[0-9]+]]=, $pop[[L1]], 3, $pop[[L2]]
	; CHECK-NEXT: return $pop[[L3]]
	define <4 x i32> @splat_common_const_i32x4() {			define <4 x i32> @splat_common_const_i32x4() {
	ret <4 x i32> <i32 undef, i32 3, i32 3, i32 1>			ret <4 x i32> <i32 undef, i32 3, i32 3, i32 1>
	}			}

	; CHECK-LABEL: splat_common_arg_i16x8:			; CHECK-LABEL: splat_common_arg_i16x8:
	; CHECK-NEXT: .functype splat_common_arg_i16x8 (i32, i32, i32) -> (v128)			; CHECK-NEXT: .functype splat_common_arg_i16x8 (i32, i32, i32) -> (v128)
	; CHECK-NEXT: i16x8.splat $push[[L0:[0-9]+]]=, $2			; CHECK-NEXT: i16x8.splat $push[[L0:[0-9]+]]=, $2
	; CHECK-NEXT: i16x8.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 0, $1			; CHECK-NEXT: i16x8.replace_lane $push[[L1:[0-9]+]]=, $pop[[L0]], 0, $1
	; CHECK-NEXT: i16x8.replace_lane $push[[L2:[0-9]+]]=, $pop[[L1]], 2, $0			; CHECK-NEXT: i16x8.replace_lane $push[[L2:[0-9]+]]=, $pop[[L1]], 2, $0
	; CHECK-NEXT: i16x8.replace_lane $push[[L3:[0-9]+]]=, $pop[[L2]], 4, $1			; CHECK-NEXT: i16x8.replace_lane $push[[L3:[0-9]+]]=, $pop[[L2]], 4, $1
	; CHECK-NEXT: i16x8.replace_lane $push[[L4:[0-9]+]]=, $pop[[L3]], 7, $1			; CHECK-NEXT: i16x8.replace_lane $push[[L4:[0-9]+]]=, $pop[[L3]], 7, $1
	; CHECK-NEXT: return $pop[[L4]]			; CHECK-NEXT: return $pop[[L4]]
	define <8 x i16> @splat_common_arg_i16x8(i16 %a, i16 %b, i16 %c) {			define <8 x i16> @splat_common_arg_i16x8(i16 %a, i16 %b, i16 %c) {
	%v0 = insertelement <8 x i16> undef, i16 %b, i32 0			%v0 = insertelement <8 x i16> undef, i16 %b, i32 0
	%v1 = insertelement <8 x i16> %v0, i16 %c, i32 1			%v1 = insertelement <8 x i16> %v0, i16 %c, i32 1
	%v2 = insertelement <8 x i16> %v1, i16 %a, i32 2			%v2 = insertelement <8 x i16> %v1, i16 %a, i32 2
	%v3 = insertelement <8 x i16> %v2, i16 %c, i32 3			%v3 = insertelement <8 x i16> %v2, i16 %c, i32 3
	%v4 = insertelement <8 x i16> %v3, i16 %b, i32 4			%v4 = insertelement <8 x i16> %v3, i16 %b, i32 4
	%v5 = insertelement <8 x i16> %v4, i16 %c, i32 5			%v5 = insertelement <8 x i16> %v4, i16 %c, i32 5
	%v6 = insertelement <8 x i16> %v5, i16 %c, i32 6			%v6 = insertelement <8 x i16> %v5, i16 %c, i32 6
	%v7 = insertelement <8 x i16> %v6, i16 %b, i32 7			%v7 = insertelement <8 x i16> %v6, i16 %b, i32 7
	ret <8 x i16> %v7			ret <8 x i16> %v7
	}			}

				; CHECK-LABEL: swizzle_one_i8x16:
				; CHECK-NEXT: .functype swizzle_one_i8x16 (v128, v128) -> (v128)
				; CHECK-NEXT: v8x16.swizzle $push[[L0:[0-9]+]]=, $0, $1
				; CHECK-NEXT: return $pop[[L0]]
				define <16 x i8> @swizzle_one_i8x16(<16 x i8> %src, <16 x i8> %mask) {
				%m0 = extractelement <16 x i8> %mask, i32 0
				%s0 = extractelement <16 x i8> %src, i8 %m0
				%v0 = insertelement <16 x i8> undef, i8 %s0, i32 0
				ret <16 x i8> %v0
				}

				; CHECK-LABEL: swizzle_all_i8x16:
				; CHECK-NEXT: .functype swizzle_all_i8x16 (v128, v128) -> (v128)
				; CHECK-NEXT: v8x16.swizzle $push[[L0:[0-9]+]]=, $0, $1
				; CHECK-NEXT: return $pop[[L0]]
				define <16 x i8> @swizzle_all_i8x16(<16 x i8> %src, <16 x i8> %mask) {
				%m0 = extractelement <16 x i8> %mask, i32 0
				%s0 = extractelement <16 x i8> %src, i8 %m0
				%v0 = insertelement <16 x i8> undef, i8 %s0, i32 0
				%m1 = extractelement <16 x i8> %mask, i32 1
				%s1 = extractelement <16 x i8> %src, i8 %m1
				%v1 = insertelement <16 x i8> %v0, i8 %s1, i32 1
				%m2 = extractelement <16 x i8> %mask, i32 2
				%s2 = extractelement <16 x i8> %src, i8 %m2
				%v2 = insertelement <16 x i8> %v1, i8 %s2, i32 2
				%m3 = extractelement <16 x i8> %mask, i32 3
				%s3 = extractelement <16 x i8> %src, i8 %m3
				%v3 = insertelement <16 x i8> %v2, i8 %s3, i32 3
				%m4 = extractelement <16 x i8> %mask, i32 4
				%s4 = extractelement <16 x i8> %src, i8 %m4
				%v4 = insertelement <16 x i8> %v3, i8 %s4, i32 4
				%m5 = extractelement <16 x i8> %mask, i32 5
				%s5 = extractelement <16 x i8> %src, i8 %m5
				%v5 = insertelement <16 x i8> %v4, i8 %s5, i32 5
				%m6 = extractelement <16 x i8> %mask, i32 6
				%s6 = extractelement <16 x i8> %src, i8 %m6
				%v6 = insertelement <16 x i8> %v5, i8 %s6, i32 6
				%m7 = extractelement <16 x i8> %mask, i32 7
				%s7 = extractelement <16 x i8> %src, i8 %m7
				%v7 = insertelement <16 x i8> %v6, i8 %s7, i32 7
				%m8 = extractelement <16 x i8> %mask, i32 8
				%s8 = extractelement <16 x i8> %src, i8 %m8
				%v8 = insertelement <16 x i8> %v7, i8 %s8, i32 8
				%m9 = extractelement <16 x i8> %mask, i32 9
				%s9 = extractelement <16 x i8> %src, i8 %m9
				%v9 = insertelement <16 x i8> %v8, i8 %s9, i32 9
				%m10 = extractelement <16 x i8> %mask, i32 10
				%s10 = extractelement <16 x i8> %src, i8 %m10
				%v10 = insertelement <16 x i8> %v9, i8 %s10, i32 10
				%m11 = extractelement <16 x i8> %mask, i32 11
				%s11 = extractelement <16 x i8> %src, i8 %m11
				%v11 = insertelement <16 x i8> %v10, i8 %s11, i32 11
				%m12 = extractelement <16 x i8> %mask, i32 12
				%s12 = extractelement <16 x i8> %src, i8 %m12
				%v12 = insertelement <16 x i8> %v11, i8 %s12, i32 12
				%m13 = extractelement <16 x i8> %mask, i32 13
				%s13 = extractelement <16 x i8> %src, i8 %m13
				%v13 = insertelement <16 x i8> %v12, i8 %s13, i32 13
				%m14 = extractelement <16 x i8> %mask, i32 14
				%s14 = extractelement <16 x i8> %src, i8 %m14
				%v14 = insertelement <16 x i8> %v13, i8 %s14, i32 14
				%m15 = extractelement <16 x i8> %mask, i32 15
				%s15 = extractelement <16 x i8> %src, i8 %m15
				%v15 = insertelement <16 x i8> %v14, i8 %s15, i32 15
				ret <16 x i8> %v15
				}

				; CHECK-LABEL: swizzle_one_i16x8:
				; CHECK-NEXT: .functype swizzle_one_i16x8 (v128, v128) -> (v128)
				; CHECK-NOT: swizzle
				; CHECK: return
				define <8 x i16> @swizzle_one_i16x8(<8 x i16> %src, <8 x i16> %mask) {
				%m0 = extractelement <8 x i16> %mask, i32 0
				%s0 = extractelement <8 x i16> %src, i16 %m0
				%v0 = insertelement <8 x i16> undef, i16 %s0, i32 0
				ret <8 x i16> %v0
				}

				; CHECK-LABEL: mashup_swizzle_i8x16:
				; CHECK-NEXT: .functype mashup_swizzle_i8x16 (v128, v128, i32) -> (v128)
				; CHECK-NEXT: v8x16.swizzle $push[[L0:[0-9]+]]=, $0, $1
				; CHECK: i8x16.replace_lane
				; CHECK: i8x16.replace_lane
				; CHECK: i8x16.replace_lane
				; CHECK: i8x16.replace_lane
				; CHECK: return
				define <16 x i8> @mashup_swizzle_i8x16(<16 x i8> %src, <16 x i8> %mask, i8 %splatted) {
				; swizzle 0
				%m0 = extractelement <16 x i8> %mask, i32 0
				%s0 = extractelement <16 x i8> %src, i8 %m0
				%v0 = insertelement <16 x i8> undef, i8 %s0, i32 0
				; swizzle 7
				%m1 = extractelement <16 x i8> %mask, i32 7
				%s1 = extractelement <16 x i8> %src, i8 %m1
				%v1 = insertelement <16 x i8> %v0, i8 %s1, i32 7
				; splat 3
				%v2 = insertelement <16 x i8> %v1, i8 %splatted, i32 3
				; splat 12
				%v3 = insertelement <16 x i8> %v2, i8 %splatted, i32 12
				; const 4
				%v4 = insertelement <16 x i8> %v3, i8 42, i32 4
				; const 14
				%v5 = insertelement <16 x i8> %v4, i8 42, i32 14
				ret <16 x i8> %v5
				}

				; CHECK-LABEL: mashup_const_i8x16:
				; CHECK-NEXT: .functype mashup_const_i8x16 (v128, v128, i32) -> (v128)
				; CHECK: v128.const $push[[L0:[0-9]+]]=, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0, 0, 0, 0, 42, 0
				; CHECK: i8x16.replace_lane
				; CHECK: i8x16.replace_lane
				; CHECK: i8x16.replace_lane
				; CHECK: return
				define <16 x i8> @mashup_const_i8x16(<16 x i8> %src, <16 x i8> %mask, i8 %splatted) {
				; swizzle 0
				%m0 = extractelement <16 x i8> %mask, i32 0
				%s0 = extractelement <16 x i8> %src, i8 %m0
				%v0 = insertelement <16 x i8> undef, i8 %s0, i32 0
				; splat 3
				%v1 = insertelement <16 x i8> %v0, i8 %splatted, i32 3
				; splat 12
				%v2 = insertelement <16 x i8> %v1, i8 %splatted, i32 12
				; const 4
				%v3 = insertelement <16 x i8> %v2, i8 42, i32 4
				; const 14
				%v4 = insertelement <16 x i8> %v3, i8 42, i32 14
				ret <16 x i8> %v4
				}

				; CHECK-LABEL: mashup_splat_i8x16:
				; CHECK-NEXT: .functype mashup_splat_i8x16 (v128, v128, i32) -> (v128)
				; CHECK: i8x16.splat $push[[L0:[0-9]+]]=, $2
				; CHECK: i8x16.replace_lane
				; CHECK: i8x16.replace_lane
				; CHECK: return
				define <16 x i8> @mashup_splat_i8x16(<16 x i8> %src, <16 x i8> %mask, i8 %splatted) {
				; swizzle 0
				%m0 = extractelement <16 x i8> %mask, i32 0
				%s0 = extractelement <16 x i8> %src, i8 %m0
				%v0 = insertelement <16 x i8> undef, i8 %s0, i32 0
				; splat 3
				%v1 = insertelement <16 x i8> %v0, i8 %splatted, i32 3
				; splat 12
				%v2 = insertelement <16 x i8> %v1, i8 %splatted, i32 12
				; const 4
				%v3 = insertelement <16 x i8> %v2, i8 42, i32 4
				ret <16 x i8> %v3
				}
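
Reviewer note: the three mashup tests above differ only in how many lanes each candidate instruction would cover. The following is a minimal Python sketch of a selection rule that is consistent with the CHECK lines in this file. It is inferred from the expected output only and is not the code in WebAssemblyISelLowering.cpp. The function name `choose_base`, the lane encoding, and the tie-breaking order (swizzle, then const, then splat) are assumptions made for illustration; as a further simplification, swizzle lanes are not counted as splat candidates.

```python
from collections import Counter

def choose_base(lanes):
    """Pick the instruction that builds the initial vector.

    lanes: list of (kind, value) tuples, one per lane, where kind is
    "undef", "const", "arg", or "swizzle" (hypothetical encoding).
    Returns (base, replace_lanes): the chosen base instruction and the
    number of replace_lane instructions still needed afterwards.
    """
    defined = [lane for lane in lanes if lane[0] != "undef"]

    # Lanes a single v128.const would cover.
    num_const = sum(1 for kind, _ in defined if kind == "const")

    # Lanes one swizzle would cover: the most common (src, mask) pair.
    swizzles = Counter(value for kind, value in defined if kind == "swizzle")
    num_swizzle = max(swizzles.values(), default=0)

    # Lanes one splat would cover: the most common scalar value.
    splats = Counter(lane for lane in defined if lane[0] != "swizzle")
    num_splat = max(splats.values(), default=0)

    # Assumed tie-breaking: swizzle first, then const, then splat.
    if num_swizzle >= num_splat and num_swizzle >= num_const:
        base, covered = "swizzle", num_swizzle
    elif num_const >= num_splat:
        base, covered = "const", num_const
    else:
        base, covered = "splat", num_splat

    # Every defined lane the base does not cover needs one replace_lane.
    return base, len(defined) - covered
```

Under these assumptions, mashup_swizzle_i8x16 (two swizzle, two splat, two const lanes) picks the swizzle and needs four replace_lanes, mashup_const_i8x16 picks v128.const with three, and mashup_splat_i8x16 picks the splat with two, which matches the number of `i8x16.replace_lane` CHECK lines in each test.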

	; CHECK-LABEL: undef_const_insert_f32x4:			; CHECK-LABEL: undef_const_insert_f32x4:
	; CHECK-NEXT: .functype undef_const_insert_f32x4 () -> (v128)			; CHECK-NEXT: .functype undef_const_insert_f32x4 () -> (v128)
	; CHECK-NEXT: f32.const $push[[L0:[0-9]+]]=, 0x1.5p5			; CHECK-NEXT: v128.const $push[[L0:[0-9]+]]=, 0x0p0, 0x1.5p5, 0x0p0, 0x0p0
	; CHECK-NEXT: f32x4.splat $push[[L1:[0-9]+]]=, $pop[[L0]]			; CHECK-NEXT: return $pop[[L0]]
	; CHECK-NEXT: return $pop[[L1]]
	define <4 x float> @undef_const_insert_f32x4() {			define <4 x float> @undef_const_insert_f32x4() {
	%v = insertelement <4 x float> undef, float 42., i32 1			%v = insertelement <4 x float> undef, float 42., i32 1
	ret <4 x float> %v			ret <4 x float> %v
	}			}

	; CHECK-LABEL: undef_arg_insert_i32x4:			; CHECK-LABEL: undef_arg_insert_i32x4:
	; CHECK-NEXT: .functype undef_arg_insert_i32x4 (i32) -> (v128)			; CHECK-NEXT: .functype undef_arg_insert_i32x4 (i32) -> (v128)
	; CHECK-NEXT: i32x4.splat $push[[L0:[0-9]+]]=, $0			; CHECK-NEXT: i32x4.splat $push[[L0:[0-9]+]]=, $0

llvm/test/MC/WebAssembly/simd-encodings.s

main:
f32x4.convert_i32x4_u		f32x4.convert_i32x4_u

# CHECK: f64x2.convert_i64x2_s # encoding: [0xfd,0xb1,0x01]		# CHECK: f64x2.convert_i64x2_s # encoding: [0xfd,0xb1,0x01]
f64x2.convert_i64x2_s		f64x2.convert_i64x2_s

# CHECK: f64x2.convert_i64x2_u # encoding: [0xfd,0xb2,0x01]		# CHECK: f64x2.convert_i64x2_u # encoding: [0xfd,0xb2,0x01]
f64x2.convert_i64x2_u		f64x2.convert_i64x2_u

		# CHECK: v8x16.swizzle # encoding: [0xfd,0xc0,0x01]
		v8x16.swizzle

# CHECK: v8x16.load_splat 48 # encoding: [0xfd,0xc2,0x01,0x00,0x30]		# CHECK: v8x16.load_splat 48 # encoding: [0xfd,0xc2,0x01,0x00,0x30]
v8x16.load_splat 48		v8x16.load_splat 48

# CHECK: v16x8.load_splat 48 # encoding: [0xfd,0xc3,0x01,0x01,0x30]		# CHECK: v16x8.load_splat 48 # encoding: [0xfd,0xc3,0x01,0x01,0x30]
v16x8.load_splat 48		v16x8.load_splat 48

# CHECK: v32x4.load_splat 48 # encoding: [0xfd,0xc4,0x01,0x02,0x30]		# CHECK: v32x4.load_splat 48 # encoding: [0xfd,0xc4,0x01,0x02,0x30]
v32x4.load_splat 48		v32x4.load_splat 48
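
Reviewer note: the encodings checked above all start with the SIMD prefix byte 0xfd followed by the opcode as an unsigned LEB128 integer, which is why the new instruction takes two opcode bytes. The sketch below reproduces the bytes in the CHECK lines. The helper names `uleb128` and `simd_encoding` are hypothetical, and the opcode numbers (192 for v8x16.swizzle, 177 for f64x2.convert_i64x2_s) are simply decoded from the byte sequences shown in this diff; they should not be read as a statement about the final opcode assignment.

```python
def uleb128(n):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    out = []
    while True:
        byte = n & 0x7F          # low seven bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte, high bit clear
            return out

def simd_encoding(opcode):
    """Prefix byte 0xfd followed by the LEB128-encoded opcode."""
    return [0xFD] + uleb128(opcode)

# v8x16.swizzle as checked above: [0xfd, 0xc0, 0x01]
print([hex(b) for b in simd_encoding(192)])
```

Memory instructions such as v8x16.load_splat append their alignment and offset immediates after these bytes, which this sketch does not model.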
