This is an archive of the discontinued LLVM Phabricator instance.

Differential D80164

[WebAssembly] Fix bug in custom shuffle combine
ClosedPublic

Authored by tlively on May 18 2020, 2:37 PM.

Details

Reviewers

aheejin

Commits

rG8a43d41a4070: [WebAssembly] Fix bug in custom shuffle combine

Summary

The code previously assumed the source of the bitcast in the combined
pattern was a vector type, but this is not always true. This patch
adds a check to avoid an assertion failure in that case.
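
The guard described in the summary can be sketched with a toy model. `ToyType` and `canHoistBitcast` below are hypothetical stand-ins written for illustration, not LLVM's `MVT` API; only the shape of the check (reject a source that is not a 128-bit vector before comparing lane counts) follows the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for a value type; this is NOT LLVM's MVT.
struct ToyType {
  bool IsVector;    // false for scalars such as i128
  unsigned Bits;    // total width in bits
  unsigned NumElts; // lane count; only meaningful when IsVector is true
};

// Illustrative types matching the cases discussed in this review.
const ToyType F32x4{true, 128, 4};
const ToyType F64x2{true, 128, 2};
const ToyType I32x4{true, 128, 4};
const ToyType I128Scalar{false, 128, 0};

// Mirrors the shape of the patched condition: the vector check comes first,
// so the lane count of a scalar source is never queried.
bool canHoistBitcast(const ToyType &Src, const ToyType &Dst) {
  if (!Src.IsVector || Src.Bits != 128)
    return false;
  return Src.NumElts == Dst.NumElts;
}
```

In this model a bitcast from `I128Scalar` to `I32x4` is rejected up front, which corresponds to the `not_a_vec` test case added by the patch.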

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tlively created this revision. May 18 2020, 2:37 PM

Herald added a project: Restricted Project. May 18 2020, 2:37 PM

Herald added subscribers: llvm-commits, sunfish, hiraditya and 3 others.

Harbormaster completed remote builds in B57127: Diff 264723. May 18 2020, 3:45 PM

LGTM.

A separate question: In wasm_simd128.h, why is the type of all return values v128_t? If we don't [[ https://github.com/llvm/llvm-project/blob/56079e1de1129837aa7569d8b3bb5e50afc0f1ea/clang/lib/Headers/wasm_simd128.h#L315 | cast return types to v128_t ]] and return __f32x4 as is, are bitcasts still gonna be generated?

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1717–1718	Nit: double spaces between `,` and `undef`
1718–1719	Nit: `t0` -> `T0` for consistency

aheejin accepted this revision. May 19 2020, 3:59 AM

This revision is now accepted and ready to land. May 19 2020, 3:59 AM

Address comments

Closed by commit rG8a43d41a4070: [WebAssembly] Fix bug in custom shuffle combine (authored by tlively). May 19 2020, 1:11 PM

This revision was automatically updated to reflect the committed changes.

Harbormaster failed remote builds in B57249: Diff 264999! May 19 2020, 1:13 PM

In D80164#2043610, @aheejin wrote:

LGTM.

A separate question: In wasm_simd128.h, why is the type of all return values v128_t? If we don't [[ https://github.com/llvm/llvm-project/blob/56079e1de1129837aa7569d8b3bb5e50afc0f1ea/clang/lib/Headers/wasm_simd128.h#L315 | cast return types to v128_t ]] and return __f32x4 as is, are bitcasts still gonna be generated?

Ping

In D80164#2044958, @aheejin wrote:

Ping

Sorry for the very delayed response. No, without those casts no bitcasts are generated. For some reason I don't understand, though, the bitcasts are generated in a different order when the input and output types have the same number of elements. This is why the problem only showed up for wasm_f32x4_splat. When I changed the definition of v128_t to be the same as __i64x2, I got the same issue with wasm_f64x2_splat instead.

Sorry, I'm not sure I understand. My question was: why do we need to cast return values to v128_t at all? (I'm not very familiar with the header file.) So for example, for f32x4_splat, can't we just do

static __inline__ __f32x4 __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (__f32x4){__a, __a, __a, __a};
}

instead of

static __inline__ v128_t __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (v128_t)(__f32x4){__a, __a, __a, __a};
}

? The same goes for all the other intrinsics. Is there a reason that every intrinsic's return type should be v128_t?

In D80164#2045666, @aheejin wrote:
Sorry, I'm not sure I understand. My question was: why do we need to cast return values to v128_t at all? (I'm not very familiar with the header file.) So for example, for f32x4_splat, can't we just do
static __inline__ __f32x4 __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (__f32x4){__a, __a, __a, __a};
}
instead of
static __inline__ v128_t __DEFAULT_FN_ATTRS wasm_f32x4_splat(float __a) {
  return (v128_t)(__f32x4){__a, __a, __a, __a};
}
? The same goes for all the other intrinsics. Is there a reason that every intrinsic's return type should be v128_t?

Oh gotcha. This is just a design decision. The header originally had different user-facing types for the different lane interpretations, but after some feedback from some early users we decided to just expose a single v128_t type for users to worry about. Internally we still need to use all the different types just to get the compiler to emit the correct code.
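
The design choice described above can be sketched in portable C++. This is a minimal illustration, not the contents of wasm_simd128.h: `toy_v128`, `toy_f32x4_splat`, `toy_f32x4_extract_lane`, and `toy_i32x4_extract_lane` are hypothetical names, byte copies stand in for the vector casts the real header uses, and the sketch assumes a 32-bit IEEE 754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Hypothetical single user-facing 128-bit type (a stand-in for v128_t).
struct toy_v128 {
  alignas(16) unsigned char bytes[16];
};

// Internally the value is built as four float lanes, then stored into the
// opaque user-facing type, so callers never see a lane-typed vector.
inline toy_v128 toy_f32x4_splat(float a) {
  float lanes[4] = {a, a, a, a};
  toy_v128 v;
  std::memcpy(v.bytes, lanes, sizeof(lanes));
  return v;
}

// Reinterpret lane i of the opaque value as a float.
inline float toy_f32x4_extract_lane(const toy_v128 &v, unsigned i) {
  float x;
  std::memcpy(&x, v.bytes + i * sizeof(float), sizeof(float));
  return x;
}

// Reinterpret lane i of the same opaque value as a 32-bit integer.
inline std::uint32_t toy_i32x4_extract_lane(const toy_v128 &v, unsigned i) {
  std::uint32_t x;
  std::memcpy(&x, v.bytes + i * sizeof(x), sizeof(x));
  return x;
}
```

A caller only ever holds a `toy_v128`; the lane interpretation is chosen by the function name, which is the trade-off the comment above describes.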

Revision Contents

Path                                                       Size
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp    7 lines
llvm/test/CodeGen/WebAssembly/simd-shuffle-bitcast.ll      11 lines

Diff 264999

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

 // Custom DAG combine hooks
 //===----------------------------------------------------------------------===//
 static SDValue
 performVECTOR_SHUFFLECombine(SDNode *N, TargetLowering::DAGCombinerInfo &DCI) {
   auto &DAG = DCI.DAG;
   auto Shuffle = cast<ShuffleVectorSDNode>(N);

   // Hoist vector bitcasts that don't change the number of lanes out of unary
   // shuffles, where they are less likely to get in the way of other combines.
-  // (shuffle (vNxT1 (bitcast (vNxT0 x))),  undef, mask) ->
+  // (shuffle (vNxT1 (bitcast (vNxT0 x))), undef, mask) ->
-  // (vNxT1 (bitcast (vNxt0 (shuffle x, undef, mask))))
+  // (vNxT1 (bitcast (vNxT0 (shuffle x, undef, mask))))
   SDValue Bitcast = N->getOperand(0);
   if (Bitcast.getOpcode() != ISD::BITCAST)
     return SDValue();
   if (!N->getOperand(1).isUndef())
     return SDValue();
   SDValue CastOp = Bitcast.getOperand(0);
   MVT SrcType = CastOp.getSimpleValueType();
   MVT DstType = Bitcast.getSimpleValueType();
-  if (SrcType.getVectorNumElements() != DstType.getVectorNumElements())
+  if (!SrcType.is128BitVector() ||
+      SrcType.getVectorNumElements() != DstType.getVectorNumElements())
     return SDValue();
   SDValue NewShuffle = DAG.getVectorShuffle(
       SrcType, SDLoc(N), CastOp, DAG.getUNDEF(SrcType), Shuffle->getMask());
   return DAG.getBitcast(DstType, NewShuffle);
 }

 SDValue
 WebAssemblyTargetLowering::PerformDAGCombine(SDNode *N,
                                              DAGCombinerInfo &DCI) const {
   switch (N->getOpcode()) {
   default:
     return SDValue();
   case ISD::VECTOR_SHUFFLE:
     return performVECTOR_SHUFFLECombine(N, DCI);
   }
 }

llvm/test/CodeGen/WebAssembly/simd-shuffle-bitcast.ll

 ; CHECK-NEXT: f32x4.splat $push[[R:[0-9]+]]=, $0{{$}}
 ; CHECK-NEXT: return $pop[[R]]{{$}}
 define <4 x i32> @f32x4_splat(float %x) {
   %vecinit = insertelement <4 x float> undef, float %x, i32 0
   %a = bitcast <4 x float> %vecinit to <4 x i32>
   %b = shufflevector <4 x i32> %a, <4 x i32> undef, <4 x i32> zeroinitializer
   ret <4 x i32> %b
 }
+
+; CHECK-LABEL: not_a_vec:
+; CHECK-NEXT: .functype not_a_vec (i64, i64) -> (v128){{$}}
+; CHECK-NEXT: i64x2.splat $push[[L1:[0-9]+]]=, $0{{$}}
+; CHECK-NEXT: v8x16.shuffle $push[[R:[0-9]+]]=, $pop[[L1]], $2, 0, 1, 2, 3
+; CHECK-NEXT: return $pop[[R]]
+define <4 x i32> @not_a_vec(i128 %x) {
+  %a = bitcast i128 %x to <4 x i32>
+  %b = shufflevector <4 x i32> %a, <4 x i32> undef, <4 x i32> zeroinitializer
+  ret <4 x i32> %b
+}