This is an archive of the discontinued LLVM Phabricator instance.

Upstream currently asserts with: "Assertion `(!EVT.isVector() || EVT.getVectorElementCount() == VT.getVectorElementCount()) && "Vector element counts must match in SIGN_EXTEND_INREG"' failed."
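To make the failure mode concrete, here is a minimal standalone sketch of why rebuilding a vector type from a plain element number trips the quoted check for scalable vectors. `ToyElementCount`, `ToyVectorVT` and the helper functions are hypothetical stand-ins written for this illustration; they are not LLVM's `ElementCount` or `EVT` classes, and the real implementation may differ in detail.

```cpp
#include <cassert>

// Hypothetical stand-in for an element count: a minimum number of elements
// plus a flag saying whether the real count is a runtime multiple (vscale).
struct ToyElementCount {
  unsigned MinElements;
  bool Scalable;
  bool operator==(const ToyElementCount &O) const {
    return MinElements == O.MinElements && Scalable == O.Scalable;
  }
};

// Hypothetical stand-in for a vector value type.
struct ToyVectorVT {
  unsigned ElementBits;
  ToyElementCount Count;
};

// Models the pre-patch pattern: only the minimum element number is carried
// over, so the rebuilt type is always a fixed-length vector.
inline ToyVectorVT rebuildFromNumElements(const ToyVectorVT &VT,
                                          unsigned NewBits) {
  return {NewBits, {VT.Count.MinElements, /*Scalable=*/false}};
}

// Models the post-patch pattern: the whole element count, including the
// scalable flag, is preserved.
inline ToyVectorVT rebuildFromElementCount(const ToyVectorVT &VT,
                                           unsigned NewBits) {
  return {NewBits, VT.Count};
}

// Mirrors the condition in the quoted assertion: element counts must match.
inline bool elementCountsMatch(const ToyVectorVT &A, const ToyVectorVT &B) {
  return A.Count == B.Count;
}
```

In this model, a `<vscale x 4 x i32>` input rebuilt through the first helper no longer matches its source type, while the second helper keeps the two in agreement; fixed-length vectors are unaffected either way.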

huihuiz added a reviewer: ctetreau. Feb 3 2021, 12:44 PM

Harbormaster completed remote builds in B87756: Diff 321185. Feb 3 2021, 1:51 PM

Hi, this patch looks like it has some good fixes, thanks! I left one comment asking whether a test could be added.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8419	Is it possible to add a test for this too?

paulwalker-arm added inline comments. Feb 4 2021, 3:34 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8364–8365	Just a thought but in general it seems like a good way to mitigate such issues is to remove any explicit vector length handling and thus I'm wondering if it's worth changing this and the other instances to use changeVectorElementType (i.e. `VT.changeVectorElementType(ExtVT)` in this instance). It might even reduce line wrapping and thus read better.

Thanks David and Paul for the reviews!

I have included test coverage for the second change of getVectorElementCount.

But when trying to use changeVectorElementType(), I ran into a case where VT is v3i16 and the computed ExtVT is v3i1, which is not a valid simple integer vector type. We might end up needing extra special handling for i1 and other types.
Please refer to the detailed test case in the inline comments.

Let me know if it's OK to keep the usage of EVT::getVectorVT(..., ElementCount).

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
8364–8365	It's definitely nicer to use changeVectorElementType(). I tried to change it as suggested, but ran into a case I couldn't get past: VT is v3i16, a valid simple value type, but ExtVT would be v3i1, which is not a valid simple integer vector type.

Please refer to test.ll below. With this patch plus the changeVectorElementType() change applied, run:

    llc -mtriple aarch64-none-linux-gnu < test.ll

It then asserts with "Simple vector VT not representable by simple integer vector VT".

    define <3 x i16> @sext_in_reg_v3i1_to_v3i16(<3 x i16> %a, <3 x i16> %b) {
      %c = add <3 x i16> %a, %b ;
      %shl = shl <3 x i16> %a, <i16 15, i16 15, i16 15>
      %ashr = ashr <3 x i16> %shl, <i16 15, i16 15, i16 15>
      ret <3 x i16> %ashr
    }

(I found this test in test/CodeGen/AMDGPU/sext-in-reg.ll.)

Also, we would need to restrict ExtVT/TruncVT to simple types. In most cases VT is a simple type, but with ashr (shl ...) TruncVT could be an extended type (e.g., i7), and then we would hit a null pointer access to LLVMTy when calling changeExtendedVectorElementType().
8419	Tests @ashr_shl and @ashr_shl_illegal_trunc_vec_ty hit this line, although I couldn't construct a test that makes the fold actually happen: the subtraction of two splat values is not a ConstantSDNode.
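The v3i16 case described above can be sketched with a small standalone model. The width formula follows the `LowBits = OpSizeInBits - shift amount` computation in the patched code; the table of "simple" vector types is an assumption made for this illustration (a tiny hand-picked subset, not LLVM's actual type list), and both function names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Width of the integer element type the sext_inreg fold asks for:
// LowBits = OpSizeInBits - shift amount.
inline unsigned lowBits(unsigned OpSizeInBits, unsigned ShiftAmt) {
  return OpSizeInBits - ShiftAmt;
}

// ASSUMPTION: a tiny hand-picked table standing in for the list of "simple"
// fixed-length integer vector types, keyed by (element count, element bits).
// Only the two entries discussed in the thread matter here: v3i16 is
// present, v3i1 is not.
inline bool isSimpleIntVector(unsigned NumElts, unsigned EltBits) {
  static const std::set<std::pair<unsigned, unsigned>> Simple = {
      {2, 1}, {4, 1}, {8, 1}, {3, 16}, {4, 16}, {4, 32}};
  return Simple.count({NumElts, EltBits}) != 0;
}
```

For the `<3 x i16>` test shifted by 15, `lowBits(16, 15)` is 1, so the element type becomes i1 and the resulting three-element vector falls outside the table, which is the situation the assertion message describes.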

Harbormaster completed remote builds in B88012: Diff 321629. Feb 4 2021, 7:44 PM

paulwalker-arm accepted this revision. Feb 5 2021, 3:24 AM

This revision is now accepted and ready to land. Feb 5 2021, 3:24 AM

This revision was landed with ongoing or failed builds. Feb 5 2021, 9:57 AM

Closed by commit rG1b81117f88e4: [DAGCombiner][SVE] Fix invalid use of getVectorNumElements() in visitSRA. (authored by huihuiz).

This revision was automatically updated to reflect the committed changes.

huihuiz added a commit: rG1b81117f88e4: [DAGCombiner][SVE] Fix invalid use of getVectorNumElements() in visitSRA..

Revision Contents

Path                                              Size
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp     8 lines
llvm/test/CodeGen/AArch64/DAGCombine_vscale.ll    70 lines

Diff 321815

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

@@ (8,355 lines hidden) @@ if (SDValue NewSel = foldBinOpIntoSelect(N))
     return NewSel;

   // fold (sra (shl x, c1), c1) -> sext_inreg for some c1 and target supports
   // sext_inreg.
   if (N1C && N0.getOpcode() == ISD::SHL && N1 == N0.getOperand(1)) {
     unsigned LowBits = OpSizeInBits - (unsigned)N1C->getZExtValue();
     EVT ExtVT = EVT::getIntegerVT(*DAG.getContext(), LowBits);
     if (VT.isVector())
-      ExtVT = EVT::getVectorVT(*DAG.getContext(),
-                               ExtVT, VT.getVectorNumElements());
+      ExtVT = EVT::getVectorVT(*DAG.getContext(), ExtVT,
+                               VT.getVectorElementCount());
     if (!LegalOperations ||
         TLI.getOperationAction(ISD::SIGN_EXTEND_INREG, ExtVT) ==
         TargetLowering::Legal)
       return DAG.getNode(ISD::SIGN_EXTEND_INREG, SDLoc(N), VT,
                          N0.getOperand(0), DAG.getValueType(ExtVT));
     // Even if we can't convert to sext_inreg, we might be able to remove
     // this shift pair if the input is already sign extended.
     if (DAG.ComputeNumSignBits(N0.getOperand(0)) > N1C->getZExtValue())
@@ (37 lines hidden) @@ if (N0.getOpcode() == ISD::SHL && N1C) {
     // Get the two constanst of the shifts, CN0 = m, CN = n.
     const ConstantSDNode *N01C = isConstOrConstSplat(N0.getOperand(1));
     if (N01C) {
       LLVMContext &Ctx = *DAG.getContext();
       // Determine what the truncate's result bitsize and type would be.
       EVT TruncVT = EVT::getIntegerVT(Ctx, OpSizeInBits - N1C->getZExtValue());

       if (VT.isVector())
-        TruncVT = EVT::getVectorVT(Ctx, TruncVT, VT.getVectorNumElements());
+        TruncVT = EVT::getVectorVT(Ctx, TruncVT, VT.getVectorElementCount());

       // Determine the residual right-shift amount.
       int ShiftAmt = N1C->getZExtValue() - N01C->getZExtValue();

       // If the shift is not a no-op (in which case this should be just a sign
       // extend already), the truncated to type is legal, sign_extend is legal
       // on that type, and the truncate to that type is both legal and free,
       // perform the transform.
@@ (23 lines hidden) @@ if (N0.getOpcode() == ISD::ADD && N0.hasOneUse() && N1C &&
     if (ConstantSDNode *AddC = isConstOrConstSplat(N0.getOperand(1))) {
       SDValue Shl = N0.getOperand(0);
       // Determine what the truncate's type would be and ask the target if that
       // is a free operation.
       LLVMContext &Ctx = *DAG.getContext();
       unsigned ShiftAmt = N1C->getZExtValue();
       EVT TruncVT = EVT::getIntegerVT(Ctx, OpSizeInBits - ShiftAmt);
       if (VT.isVector())
-        TruncVT = EVT::getVectorVT(Ctx, TruncVT, VT.getVectorNumElements());
+        TruncVT = EVT::getVectorVT(Ctx, TruncVT, VT.getVectorElementCount());

       // TODO: The simple type check probably belongs in the default hook
       // implementation and/or target-specific overrides (because
       // non-simple types likely require masking when legalized), but that
       // restriction may conflict with other transforms.
       if (TruncVT.isSimple() && isTypeLegal(TruncVT) &&
           TLI.isTruncateFree(VT, TruncVT)) {
         SDLoc DL(N);
@@ (14,276 lines hidden) @@
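For readers less familiar with the first fold in this file, the identity behind `(sra (shl x, c1), c1) -> sext_inreg` can be checked on a single 32-bit lane. The sketch below is a scalar illustration only, with hypothetical helper names; it says nothing about how SelectionDAG represents or lowers the node.

```cpp
#include <cassert>
#include <cstdint>

// sra (shl x, c), c on one 32-bit lane. The left shift is done on an
// unsigned value to avoid signed overflow. ASSUMPTION: the conversion back
// to int32_t wraps and >> on a negative value is an arithmetic shift; this
// is guaranteed since C++20 and is what mainstream compilers do before that.
inline int32_t shlThenSra(int32_t X, unsigned C) {
  uint32_t Shifted = static_cast<uint32_t>(X) << C;
  return static_cast<int32_t>(Shifted) >> C;
}

// Sign-extend the low 16 bits of X in place (the scalar meaning of a
// sext_inreg from i16), written with the portable xor-and-subtract idiom.
inline int32_t signExtendInReg16(int32_t X) {
  uint32_t Low = static_cast<uint32_t>(X) & 0xFFFFu;
  return static_cast<int32_t>(Low ^ 0x8000u) - 0x8000;
}
```

With a shift amount of 16 on a 32-bit lane the two helpers agree, which matches the i32 case where 16 low bits are kept.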

llvm/test/CodeGen/AArch64/DAGCombine_vscale.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
				; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t

				; WARN-NOT: warning

				; Check that DAGCombiner is not asserting with mis-matched vector element count, "Vector element counts must match in SIGN_EXTEND_INREG".
				; Also no warning message of "warning: Possible incorrect use of EVT::getVectorNumElements() for scalable vector.".

				define <vscale x 4 x i32> @sext_inreg(<vscale x 4 x i32> %a) {
				; CHECK-LABEL: sext_inreg:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: sxth z0.s, p0/m, z0.s
				; CHECK-NEXT: ret
				%in = insertelement <vscale x 4 x i32> undef, i32 16, i32 0
				%splat = shufflevector <vscale x 4 x i32> %in, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
				%sext = shl <vscale x 4 x i32> %a, %splat
				%conv = ashr <vscale x 4 x i32> %sext, %splat
				ret <vscale x 4 x i32> %conv
				}

				define <vscale x 4 x i32> @ashr_shl(<vscale x 4 x i32> %a) {
				; CHECK-LABEL: ashr_shl:
				; CHECK: // %bb.0:
				; CHECK-NEXT: lsl z0.s, z0.s, #8
				; CHECK-NEXT: asr z0.s, z0.s, #16
				; CHECK-NEXT: ret
				%in1 = insertelement <vscale x 4 x i32> undef, i32 8, i32 0
				%splat1 = shufflevector <vscale x 4 x i32> %in1, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
				%in2 = insertelement <vscale x 4 x i32> undef, i32 16, i32 0
				%splat2 = shufflevector <vscale x 4 x i32> %in2, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
				%shl = shl <vscale x 4 x i32> %a, %splat1
				%r = ashr <vscale x 4 x i32> %shl, %splat2
				ret <vscale x 4 x i32> %r
				}

				define <vscale x 4 x i32> @ashr_shl_illegal_trunc_vec_ty(<vscale x 4 x i32> %a) {
				; CHECK-LABEL: ashr_shl_illegal_trunc_vec_ty:
				; CHECK: // %bb.0:
				; CHECK-NEXT: lsl z0.s, z0.s, #8
				; CHECK-NEXT: asr z0.s, z0.s, #11
				; CHECK-NEXT: ret
				%in1 = insertelement <vscale x 4 x i32> undef, i32 8, i32 0
				%splat1 = shufflevector <vscale x 4 x i32> %in1, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
				%in2 = insertelement <vscale x 4 x i32> undef, i32 11, i32 0
				%splat2 = shufflevector <vscale x 4 x i32> %in2, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
				%shl = shl <vscale x 4 x i32> %a, %splat1
				%r = ashr <vscale x 4 x i32> %shl, %splat2
				ret <vscale x 4 x i32> %r
				}

				define <vscale x 4 x i32> @ashr_add_shl_nxv4i8(<vscale x 4 x i32> %a) {
				; CHECK-LABEL: ashr_add_shl_nxv4i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mov w8, #16777216
				; CHECK-NEXT: mov z1.s, w8
				; CHECK-NEXT: lsl z0.s, z0.s, #24
				; CHECK-NEXT: add z0.s, z0.s, z1.s
				; CHECK-NEXT: asr z0.s, z0.s, #24
				; CHECK-NEXT: ret
				%in1 = insertelement <vscale x 4 x i32> undef, i32 24, i32 0
				%splat1 = shufflevector <vscale x 4 x i32> %in1, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
				%in2 = insertelement <vscale x 4 x i32> undef, i32 16777216, i32 0
				%splat2 = shufflevector <vscale x 4 x i32> %in2, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
				%conv = shl <vscale x 4 x i32> %a, %splat1
				%sext = add <vscale x 4 x i32> %conv, %splat2
				%conv1 = ashr <vscale x 4 x i32> %sext, %splat1
				ret <vscale x 4 x i32> %conv1
				}
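As a rough aid for reading @ashr_shl and @ashr_shl_illegal_trunc_vec_ty, the arithmetic that the second visitSRA fold performs on the two shift amounts can be reproduced in isolation. The struct and function below are hypothetical names for this sketch; only the two formulas (truncated width is the operand size minus the ashr amount, residual shift is the ashr amount minus the shl amount) come from the code in this revision.

```cpp
#include <cassert>

// Values the fold derives from an (ashr (shl x, m), n) pair.
struct ShiftPairInfo {
  unsigned TruncBits; // OpSizeInBits - ashr amount
  int ResidualShift;  // ashr amount - shl amount
};

// Hypothetical helper mirroring the two computations in the diff.
inline ShiftPairInfo analyzeShiftPair(unsigned OpSizeInBits, unsigned ShlAmt,
                                      unsigned AshrAmt) {
  return {OpSizeInBits - AshrAmt,
          static_cast<int>(AshrAmt) - static_cast<int>(ShlAmt)};
}
```

Under this reading, @ashr_shl (shl by 8, ashr by 16 on i32 lanes) yields a 16-bit truncated element with a residual shift of 8, while @ashr_shl_illegal_trunc_vec_ty (shl by 8, ashr by 11) yields a 21-bit element, which is presumably why the test name calls the truncated vector type illegal.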

[DAGCombiner][SVE] Fix invalid use of getVectorNumElements() in visitSRA. (Closed, Public)