
Details

Reviewers

sdesmalen
CarolineConcatto
david-arm
MattDevereau
hassnaa-arm

Summary

The compiler hits an infinite loop in DAGCombine. In force-streaming-compatible-sve mode we have custom lowering for 128-bit vector splats. DAGCombiner::SimplifyVCastOp() then scalarizes the SPLAT because the operation has custom lowering for SME, and performMulCombine() later restores the SPLAT operation, so the two combines keep undoing each other.
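The ping-pong described in the summary can be illustrated with a toy rewrite system. This is only a sketch: expressions are plain strings, and `scalarizeSplat`, `restoreSplat` and `countRewrites` are hypothetical names standing in for the two combines, not LLVM functions.

```cpp
#include <cassert>
#include <string>

// Toy model only: neither rewrite below is real LLVM code.

// Stand-in for the scalarizing fold: zext(splat(x)) -> splat(zext(x)).
std::string scalarizeSplat(const std::string &E) {
  return E == "zext(splat(x))" ? "splat(zext(x))" : E;
}

// Stand-in for the mul combine that re-forms the extend of a splat.
std::string restoreSplat(const std::string &E) {
  return E == "splat(zext(x))" ? "zext(splat(x))" : E;
}

// Applies one rewrite per step until no rule fires or MaxSteps is reached.
// Returns the number of rewrites performed.
int countRewrites(std::string E, bool AllowScalarize, int MaxSteps) {
  assert(MaxSteps >= 0 && "step budget must be non-negative");
  int Steps = 0;
  while (Steps < MaxSteps) {
    std::string Next = AllowScalarize ? scalarizeSplat(E) : E;
    if (Next == E)
      Next = restoreSplat(E);
    if (Next == E)
      break; // Fixpoint: no rule applies any more.
    E = Next;
    ++Steps;
  }
  return Steps;
}
```

With both rewrites enabled the step budget is always exhausted, which is the toy analogue of the hang. Once the scalarizing rewrite is disabled (the role a target hook returning false would play), the system settles after at most one step.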

Diff Detail

Event Timeline

dtemirbulatov created this revision. · Mar 8 2023, 5:35 AM

Herald added a project: Restricted Project. · Mar 8 2023, 5:35 AM

Herald added subscribers: ctetreau, steven.zhang, hiraditya and 2 others.

dtemirbulatov requested review of this revision. · Mar 8 2023, 5:35 AM

Harbormaster completed remote builds in B218078: Diff 503331. · Mar 8 2023, 6:45 AM

Matt added a subscriber: Matt. · Mar 8 2023, 11:05 AM

sdesmalen added inline comments. · Mar 10 2023, 3:58 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
24422–24424	The only reason this currently works for NEON is that `DAGCombiner::SimplifyVCastOp`, for no good reason, singles out `ISD::SPLAT_VECTOR`, which is only used for scalable vectors and SVE. When I add `|| N0.getOpcode() == ISD::BUILD_VECTOR` to the condition in `SimplifyVCastOp`, various AArch64 tests fail for the same reason as the test you've added here. I think the better solution would be to pass the whole `Node*` to `preferScalarizeSplat` and then look at its uses. If the opcode is a zero/sign extend and any of the uses is a mul, then we don't want to do this transformation, because that's when a umull/smull instruction can be used. I looked into whether it was possible to make `LowerMUL` smarter and have it recognise the `dup(ext(x))` pattern so that we don't need to do the combine for `dup(ext(x)) -> ext(dup(x))` in the first place, but this seems rather non-trivial.
llvm/test/CodeGen/AArch64/aarch64-force-streaming-compatible-sve.ll
4 ↗	(On Diff #503331)	Just a general comment (and not a suggestion for this patch): It would be nice if we could use SVE's `[us]mull[bt]` instructions for this.
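The suggestion above (pass the whole node to `preferScalarizeSplat` and inspect its uses) can be sketched outside LLVM with a toy node type. `ToyNode` and `Opcode` are hypothetical stand-ins for `SDNode` and the `ISD` opcodes; the condition also covers any-extend, mirroring the final diff rather than the original comment.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for ISD opcodes and SDNode; not the LLVM types.
enum class Opcode { ZeroExtend, SignExtend, AnyExtend, FpExtend, Mul, SetCC, Store };

struct ToyNode {
  Opcode Opc;
  std::vector<const ToyNode *> Users; // Nodes that consume this node's result.
};

// Returns false when N is an integer extend feeding a multiply, because
// scalarizing it would hide the pattern a widening multiply could match.
bool preferScalarizeSplat(const ToyNode &N) {
  const bool IsIntExtend = N.Opc == Opcode::ZeroExtend ||
                           N.Opc == Opcode::SignExtend ||
                           N.Opc == Opcode::AnyExtend;
  if (!IsIntExtend)
    return true;
  return std::none_of(N.Users.begin(), N.Users.end(), [](const ToyNode *U) {
    return U->Opc == Opcode::Mul;
  });
}
```

An extend whose only user is a store is still scalarized in this model, while one with a `Mul` among its users is left alone.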

david-arm added inline comments. · Mar 15 2023, 9:08 AM

llvm/test/CodeGen/AArch64/aarch64-force-streaming-compatible-sve.ll
1 ↗	(On Diff #503331)	I think it makes sense to move this test into one of the existing files, perhaps something like CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll?
4 ↗	(On Diff #503331)	I think it would be good to rename this function to something that is more meaningful, perhaps extend_cmp_of_mul? Also, I think it would be nice to add a comment that describes what we're testing here, i.e. that the DAG combiner does not get stuck in an infinite loop.

Adding @hassnaa-arm as a reviewer too as she has been doing a lot of work on the streaming compatible code generation.

Addressed remarks.

Herald added subscribers: luke, pcwang-thead, frasercrmck and 24 others. · Mar 16 2023, 8:09 AM

Harbormaster completed remote builds in B219870: Diff 505810. · Mar 16 2023, 9:16 AM

Removed double not in AArch64TargetLowering::preferScalarizeSplat() function.

Harbormaster completed remote builds in B219900: Diff 505846. · Mar 16 2023, 11:18 AM

Ping!

sdesmalen added inline comments. · Mar 21 2023, 6:52 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
24423	I guess this condition can be removed now?
24424	Should this code also check for ISD::ANY_EXTEND ?
llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
894	Can you also add a test where the result of the extend is not used by a `mul`?

Rebased, Resolved comments.

Harbormaster completed remote builds in B221201: Diff 507587. · Mar 22 2023, 9:55 PM

sdesmalen added inline comments. · Mar 23 2023, 7:27 AM

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
930–932	This is testing the wrong thing, because it is testing `splat(sext(scalar))` rather than `sext(splat(scalar))` which is what `SimplifyVCastOp` tries to change.
931	It's better to use `poison` here and below.
955	No vectors are used in this loop, so the code you've modified does not apply to this test at all. I'm expecting a similar test to the above, but then using a different operator instead of a `mul`.
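On the `splat(sext(scalar))` versus `sext(splat(scalar))` distinction raised above: the two forms compute the same vector, which is why the combiner may rewrite one into the other, but only the second starts from an extend of a splat. A small sketch with fixed two-element vectors follows; all names are hypothetical helpers, not LLVM APIs.

```cpp
#include <array>
#include <cstdint>

using V2i32 = std::array<int32_t, 2>;
using V2i64 = std::array<int64_t, 2>;

// Broadcast a scalar into every lane.
V2i32 splat32(int32_t X) { return {X, X}; }
V2i64 splat64(int64_t X) { return {X, X}; }

// Lane-wise sign extension from 32 to 64 bits.
V2i64 sextVec(const V2i32 &V) { return {int64_t{V[0]}, int64_t{V[1]}}; }

// sext(splat(x)): the shape SimplifyVCastOp starts from.
V2i64 extendOfSplat(int32_t X) { return sextVec(splat32(X)); }

// splat(sext(x)): the shape it produces after scalarizing.
V2i64 splatOfExtend(int32_t X) { return splat64(int64_t{X}); }
```

Because the values agree, a test that wants to exercise the fold has to be written in the `extendOfSplat` shape; starting from the other shape leaves nothing for the fold to do.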

Changed the second test case with suggestions from comment.

dtemirbulatov marked 2 inline comments as done. · Mar 27 2023, 1:58 AM

dtemirbulatov added inline comments.

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
894	ok, I added function with the setcc node instead of mul.

Harbormaster completed remote builds in B221919: Diff 508531. · Mar 27 2023, 2:40 AM

sdesmalen added inline comments. · Mar 27 2023, 3:51 AM

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
932	This could be `poison` as well.
934–935	I'm not sure why the `icmp` and `zext` are relevant to this test.
953	Can you just store `%1` directly without the add?

dtemirbulatov marked 2 inline comments as done. · Mar 27 2023, 5:34 AM

Changing tests.

Harbormaster completed remote builds in B221972: Diff 508599. · Mar 27 2023, 5:39 AM

sdesmalen added inline comments. · Mar 27 2023, 5:43 AM

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
919	This test is different from the one above in ways other than the `mul`. Can you make the tests otherwise identical? Can you also verify that it still exercises the code you've added to your patch?

sdesmalen added inline comments. · Mar 27 2023, 5:45 AM

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
911	nit: If you replace the `8` here by a smaller number like `2`, then we don't need to observe the effect of legalisation and we'd only get `1` mul instead of `4`.
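The "1 mul instead of 4" arithmetic in the nit above can be checked with a one-line calculation. This is a sketch that assumes type legalisation splits a fixed-length vector evenly into 128-bit pieces; `numLegalParts` is a hypothetical helper, not an LLVM function.

```cpp
// Number of RegBits-wide pieces needed to hold NumElts elements of EltBits
// each, assuming an even split (rounded up).
unsigned numLegalParts(unsigned NumElts, unsigned EltBits,
                       unsigned RegBits = 128) {
  const unsigned TotalBits = NumElts * EltBits;
  return (TotalBits + RegBits - 1) / RegBits;
}
```

Under that assumption `<8 x i64>` occupies four 128-bit pieces, so the multiply is repeated four times, while `<2 x i64>` fits in one.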

Updated tests.

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
911	I could not reproduce the issue with <2 x > type.
919	I can reproduce the error with updated test.

Harbormaster completed remote builds in B221988: Diff 508616. · Mar 27 2023, 6:05 AM

Moved tests to <2 x > types.

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
911	sorry, I was wrong. I can reproduce with <2 x > types.

Harbormaster completed remote builds in B221993: Diff 508624. · Mar 27 2023, 6:32 AM

Replaced zeroinitializer with poison in the second operand of shufflevector.

dtemirbulatov marked an inline comment as done. · Mar 27 2023, 10:11 AM

dtemirbulatov marked an inline comment as done.

Harbormaster completed remote builds in B222066: Diff 508723. · Mar 27 2023, 12:29 PM

LGTM! Thanks @dtemirbulatov.

This revision is now accepted and ready to land. · Apr 4 2023, 7:12 AM

Committed with 7f05bdf4ee1c

Diff 508723

llvm/include/llvm/CodeGen/TargetLowering.h

@@ ... @@ public:

    // By default prefer folding (abs (sub nsw x, y)) -> abds(x, y). Some targets
    // may want to avoid this to prevent loss of sub_nsw pattern.
    virtual bool preferABDSToABSWithNSW(EVT VT) const {
      return true;
    }

    // Return true if the target wants to transform Op(Splat(X)) -> Splat(Op(X))
-   virtual bool preferScalarizeSplat(unsigned Opc) const { return true; }
+   virtual bool preferScalarizeSplat(SDNode *N) const { return true; }

    /// Return true if the target wants to use the optimization that
    /// turns ext(promotableInst1(...(promotableInstN(load)))) into
    /// promotedInst1(...(promotedInstN(ext(load)))).
    bool enableExtLdPromotion() const { return EnableExtLdPromotion; }

    /// Return true if the target can combine store(extractelement VectorTy,
    /// Idx).

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

@@ ... @@ SDValue DAGCombiner::SimplifyVCastOp(SDNode *N, const SDLoc &DL) {

    // TODO: promote operation might be also good here?
    int Index0;
    SDValue Src0 = DAG.getSplatSourceVector(N0, Index0);
    if (Src0 &&
        (N0.getOpcode() == ISD::SPLAT_VECTOR ||
         TLI.isExtractVecEltCheap(VT, Index0)) &&
        TLI.isOperationLegalOrCustom(Opcode, EltVT) &&
-       TLI.preferScalarizeSplat(Opcode)) {
+       TLI.preferScalarizeSplat(N)) {
      SDValue IndexC = DAG.getVectorIdxConstant(Index0, DL);
      SDValue Elt =
          DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, SrcEltVT, Src0, IndexC);
      SDValue ScalarBO = DAG.getNode(Opcode, DL, EltVT, Elt, N->getFlags());
      if (VT.isScalableVector())
        return DAG.getSplatVector(VT, DL, ScalarBO);
      SmallVector<SDValue, 8> Ops(VT.getVectorNumElements(), ScalarBO);
      return DAG.getBuildVector(VT, DL, Ops);

llvm/lib/Target/AArch64/AArch64ISelLowering.h

@@ ... @@ private:

    // Returns the runtime value for PSTATE.SM. When the function is streaming-
    // compatible, this generates a call to __arm_sme_state.
    SDValue getPStateSM(SelectionDAG &DAG, SDValue Chain, SMEAttrs Attrs,
                        SDLoc DL, EVT VT) const;

    bool isConstantUnsignedBitfieldExtractLegal(unsigned Opc, LLT Ty1,
                                                LLT Ty2) const override;

+   bool preferScalarizeSplat(SDNode *N) const override;
  };

  namespace AArch64 {
  FastISel *createFastISel(FunctionLoweringInfo &funcInfo,
                           const TargetLibraryInfo *libInfo);
  } // end namespace AArch64

  } // end namespace llvm

  #endif

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

@@ ... @@ if (OperationType == ComplexDeinterleavingOperation::CAdd) {
      if (IntId == Intrinsic::not_intrinsic)
        return nullptr;

      return B.CreateIntrinsic(IntId, Ty, {InputA, InputB});
    }

    return nullptr;
  }

+ bool AArch64TargetLowering::preferScalarizeSplat(SDNode *N) const {
+   unsigned Opc = N->getOpcode();
+   if (Opc == ISD::ZERO_EXTEND || Opc == ISD::SIGN_EXTEND ||
+       Opc == ISD::ANY_EXTEND) {
+     if (any_of(N->uses(),
+                [&](SDNode *Use) { return Use->getOpcode() == ISD::MUL; }))
+       return false;
+   }
+   return true;
+ }

llvm/lib/Target/RISCV/RISCVISelLowering.h

@@ ... @@ public:
    int getLegalZfaFPImm(const APFloat &Imm, EVT VT) const;
    bool isFPImmLegal(const APFloat &Imm, EVT VT,
                      bool ForCodeSize) const override;
    bool isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
                                 unsigned Index) const override;

    bool isIntDivCheap(EVT VT, AttributeList Attr) const override;

-   bool preferScalarizeSplat(unsigned Opc) const override;
+   bool preferScalarizeSplat(SDNode *N) const override;

    bool softPromoteHalfType() const override { return true; }

    /// Return the register type for a given MVT, ensuring vectors are treated
    /// as a series of gpr sized integers.
    MVT getRegisterTypeForCallingConv(LLVMContext &Context, CallingConv::ID CC,
                                      EVT VT) const override;

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

@@ ... @@
  bool RISCVTargetLowering::isIntDivCheap(EVT VT, AttributeList Attr) const {
    // When aggressively optimizing for code size, we prefer to use a div
    // instruction, as it is usually smaller than the alternative sequence.
    // TODO: Add vector division?
    bool OptSize = Attr.hasFnAttr(Attribute::MinSize);
    return OptSize && !VT.isVector();
  }

- bool RISCVTargetLowering::preferScalarizeSplat(unsigned Opc) const {
+ bool RISCVTargetLowering::preferScalarizeSplat(SDNode *N) const {
    // Scalarize zero_ext and sign_ext might stop match to widening instruction in
    // some situation.
+   unsigned Opc = N->getOpcode();
    if (Opc == ISD::ZERO_EXTEND || Opc == ISD::SIGN_EXTEND)
      return false;
    return true;
  }

  static Value *useTpOffset(IRBuilderBase &IRB, unsigned Offset) {
    Module *M = IRB.GetInsertBlock()->getParent()->getParent();
    Function *ThreadPointerFunc =

llvm/lib/Target/X86/X86ISelLowering.h

@@ ... @@ public:

    bool hasBitTest(SDValue X, SDValue Y) const override;

    bool shouldProduceAndByConstByHoistingConstFromShiftsLHSOfAnd(
        SDValue X, ConstantSDNode *XC, ConstantSDNode *CC, SDValue Y,
        unsigned OldShiftOpcode, unsigned NewShiftOpcode,
        SelectionDAG &DAG) const override;

-   bool preferScalarizeSplat(unsigned Opc) const override;
+   bool preferScalarizeSplat(SDNode *N) const override;

    bool shouldFoldConstantShiftPairToMask(const SDNode *N,
                                           CombineLevel Level) const override;

    bool shouldFoldMaskToVariableShiftPair(SDValue Y) const override;

    bool
    shouldTransformSignedTruncationCheck(EVT XVT,

llvm/lib/Target/X86/X86ISelLowering.cpp

@@ ... @@ if (DAG.isSplatValue(Y, /*AllowUndefs=*/true))
      return true;
    // If we have AVX2 with it's powerful shift operations, then it's also good.
    if (Subtarget.hasAVX2())
      return true;
    // Pre-AVX2 vector codegen for this pattern is best for variant with 'shl'.
    return NewShiftOpcode == ISD::SHL;
  }

- bool X86TargetLowering::preferScalarizeSplat(unsigned Opc) const {
-   return Opc != ISD::FP_EXTEND;
+ bool X86TargetLowering::preferScalarizeSplat(SDNode *N) const {
+   return N->getOpcode() != ISD::FP_EXTEND;
  }

  bool X86TargetLowering::shouldFoldConstantShiftPairToMask(
      const SDNode *N, CombineLevel Level) const {
    assert(((N->getOpcode() == ISD::SHL &&
             N->getOperand(0).getOpcode() == ISD::SRL) ||
            (N->getOpcode() == ISD::SRL &&
             N->getOperand(0).getOpcode() == ISD::SHL)) &&

llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll

@@ ... @@
  ; CHECK-NEXT: ret
    %a = load <8 x i32>, ptr %in
    %b = add <8 x i32> %a, %a
    %c = zext <8 x i32> %b to <8 x i64>
    store <8 x i64> %c, ptr %out
    ret void
  }

+ define void @extend_and_mul(i32 %0, <2 x i64> %1, ptr %2) #0 {
+ ; CHECK-LABEL: extend_and_mul:
+ ; CHECK: // %bb.0:
+ ; CHECK-NEXT: mov z1.s, w0
+ ; CHECK-NEXT: // kill: def $q0 killed $q0 def $z0
+ ; CHECK-NEXT: ptrue p0.d, vl2
+ ; CHECK-NEXT: uunpklo z1.d, z1.s
+ ; CHECK-NEXT: mul z0.d, p0/m, z0.d, z1.d
+ ; CHECK-NEXT: str q0, [x1]
+ ; CHECK-NEXT: ret
+   %broadcast.splatinsert2 = insertelement <2 x i32> poison, i32 %0, i64 0
+   %broadcast.splat3 = shufflevector <2 x i32> %broadcast.splatinsert2, <2 x i32> poison, <2 x i32> zeroinitializer
+   %4 = zext <2 x i32> %broadcast.splat3 to <2 x i64>
+   %5 = mul <2 x i64> %4, %1
+   store <2 x i64> %5, ptr %2, align 2
+   ret void
+ }
+
+ define void @extend_no_mul(i32 %0, <2 x i64> %1, ptr %2) #0 {
+ ; CHECK-LABEL: extend_no_mul:
+ ; CHECK: // %bb.0: // %entry
+ ; CHECK-NEXT: mov w8, w0
+ ; CHECK-NEXT: mov z0.d, x8
+ ; CHECK-NEXT: str q0, [x1]
+ ; CHECK-NEXT: ret
+ entry:
+   %broadcast.splatinsert2 = insertelement <2 x i32> poison, i32 %0, i64 0
+   %broadcast.splat3 = shufflevector <2 x i32> %broadcast.splatinsert2, <2 x i32> poison, <2 x i32> zeroinitializer
+   %3 = zext <2 x i32> %broadcast.splat3 to <2 x i64>
+   store <2 x i64> %3, ptr %2, align 2
+   ret void
+ }
+
  attributes #0 = { nounwind "target-features"="+sve" }

This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME] Fix an infinite loop in DAGCombine related to adding -force-streaming-compatible-sve flag.
ClosedPublic

