Details

Reviewers

RKSimon
MatzeB
foad

Commits

rG83cb9632a13d: [DAGCombiner] Add support for mulhi const folding in DAGCombiner

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dstuttard created this revision. May 28 2021, 8:42 AM

Herald added subscribers: ecnelises, hiraditya. May 28 2021, 8:42 AM

dstuttard requested review of this revision. May 28 2021, 8:42 AM

Herald added a project: Restricted Project. May 28 2021, 8:42 AM

Herald added a subscriber: llvm-commits.

dstuttard added reviewers: RKSimon, MatzeB, foad. May 28 2021, 8:43 AM

RKSimon added inline comments. May 28 2021, 8:50 AM

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
5088	APInt::extractBits ?
5094	APInt::extractBits ?

Test cases would be useful as well of course

Harbormaster completed remote builds in B106723: Diff 348540. May 28 2021, 9:02 AM

foad added inline comments. May 28 2021, 9:03 AM

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
5086	You don't need "OrTrunc" here and below.

Thanks for reviews.
Made suggested changes.

I'll add some test cases as well - but I'm off for a few days. I'll do it on return.

dstuttard marked 3 inline comments as done. May 28 2021, 9:14 AM

Harbormaster completed remote builds in B106731: Diff 348550. May 28 2021, 9:59 AM

craig.topper added a subscriber: craig.topper. May 28 2021, 10:18 AM

craig.topper added inline comments.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4456	Use DL variable that already exists
4508	Use DL variable that already exists

Updating for review comments.

Tests still to be added...

dstuttard marked 2 inline comments as done. Jun 17 2021, 1:14 AM

Harbormaster completed remote builds in B109658: Diff 352637. Jun 17 2021, 9:37 AM

In D103323#2823907, @dstuttard wrote:

Updating for review comments.

Tests still to be added...

Do you have any in mind? I can add SSE2 vector examples if that would help, but scalars are trickier.

Add test

Herald added subscribers: kerbowa, nhaehnle, jvesely. Jun 18 2021, 3:22 AM

You can simplify the test case to:

define amdgpu_cs i64 @main(i64 %arg) {
entry:
  %d = udiv i64 %arg, 100000
  ret i64 %d
}

and it still shows the effect. Surely there are already some tests for i64 divide-by-constant that you could tweak, rather than adding a whole new file.
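
For context, a udiv by a constant is commonly lowered to a multiply-high by a precomputed reciprocal, which is why a divide-by-constant test ends up producing mulhu nodes that the new fold can see. The sketch below is only an illustration in plain C++, not LLVM's actual lowering. It assumes 32-bit operands rather than the i64 used in the test, it uses the reciprocal ceil(2^64 / d) rather than whatever magic constant the backend selects, and the names mulhu32, reciprocal and udivByConst are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// High 32 bits of the 64-bit product of two 32-bit values
// (the value an i32 mulhu produces).
static uint32_t mulhu32(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// Hypothetical helper: M = ceil(2^64 / D), assumed valid for 1 < D < 2^32.
static uint64_t reciprocal(uint32_t D) {
  return UINT64_MAX / D + 1;
}

// floor(X / D) computed as the bits of M * X at position 64 and above.
// The 64x32-bit product is assembled from 32-bit pieces, so the low half
// of M only contributes through a mulhu.
static uint32_t udivByConst(uint32_t X, uint64_t M) {
  uint32_t MLo = static_cast<uint32_t>(M);
  uint32_t MHi = static_cast<uint32_t>(M >> 32);
  // M * X = (MHi * X) * 2^32 + MLo * X; this sum cannot overflow 64 bits.
  uint64_t Mid = static_cast<uint64_t>(MHi) * X + mulhu32(MLo, X);
  return static_cast<uint32_t>(Mid >> 32);
}
```

With these definitions, udivByConst(123456789u, reciprocal(100000)) yields 1234, the same as the plain division.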

llvm/test/CodeGen/AMDGPU/dagcombine-mulhs-const.ll
5	Obviously folding the mul_hi is good, but the s_add that you check for looks like this:

  s_mov_b32 s0, 0x346d900
  ...
  s_add_u32 s0, 0x4237, s0

so it should also be folded to a constant!

Harbormaster completed remote builds in B109894: Diff 352960. Jun 18 2021, 7:20 PM

RKSimon mentioned this in rGcc38f8939da4: [X86][SSE] Add mulhu/mulhs constant folding tests. Jul 3 2021, 9:02 AM

@dstuttard Please can you rebase? rGcc38f8939da4aec85e7d0ef4de412e30d4de5a14 should give you vector coverage

Updated existing test based on feedback

Rebase and adjust X86 test that now folds

Herald added a subscriber: pengfei. Jul 5 2021, 1:47 AM

In D103323#2856828, @RKSimon wrote:

@dstuttard Please can you rebase? rGcc38f8939da4aec85e7d0ef4de412e30d4de5a14 should give you vector coverage

Thanks - have rebased and updated the test

dstuttard added inline comments. Jul 5 2021, 1:50 AM

llvm/test/CodeGen/AMDGPU/dagcombine-mulhs-const.ll
5	Yes - that could be another one to do - then fix up this test (or not worry about it at all given that there's now an X86 test that tests this, thanks to Simon)

Cheers!

llvm/test/CodeGen/AMDGPU/udiv.ll
206 ↗	(On Diff #356448)	Possibly pre-commit this to show current codegen? Do we need a GFX1030-NOT v_mul_hi_u32 check of some kind?
210 ↗	(On Diff #356448)	Fix missing newline

Harbormaster completed remote builds in B112412: Diff 356448. Jul 5 2021, 2:31 AM

dstuttard mentioned this in D105424: [DAGCombiner] Pre-commit test to demonstrate mulhi const folding. Jul 5 2021, 3:24 AM

Pre-committed test and rebased on top

See D105424

Harbormaster completed remote builds in B112422: Diff 356464. Jul 5 2021, 3:27 AM

dstuttard marked 2 inline comments as done. Jul 5 2021, 3:28 AM

dstuttard added inline comments.

llvm/test/CodeGen/AMDGPU/udiv.ll
206 ↗	(On Diff #356448)	Good idea. See latest change.

dstuttard marked an inline comment as done. Jul 5 2021, 3:28 AM

dstuttard mentioned this in rG4b125b23ba95: [DAGCombiner] Pre-commit test to demonstrate mulhi const folding. Jul 5 2021, 3:35 AM

LGTM - cheers!

This revision is now accepted and ready to land. Jul 5 2021, 3:58 AM

This revision was landed with ongoing or failed builds. Jul 5 2021, 4:08 AM

Closed by commit rG83cb9632a13d: [DAGCombiner] Add support for mulhi const folding in DAGCombiner (authored by dstuttard).

This revision was automatically updated to reflect the committed changes.

dstuttard added a commit: rG83cb9632a13d: [DAGCombiner] Add support for mulhi const folding in DAGCombiner.

Diff 352960

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

 SDValue DAGCombiner::visitMULHS(SDNode *N) {
 ...
   if (VT.isVector()) {
     // fold (mulhs x, 0) -> 0
     // do not return N0/N1, because undef node may exist.
     if (ISD::isConstantSplatVectorAllZeros(N0.getNode()) ||
         ISD::isConstantSplatVectorAllZeros(N1.getNode()))
       return DAG.getConstant(0, DL, VT);
   }
 
+  // fold (mulhs c1, c2)
+  if (SDValue C = DAG.FoldConstantArithmetic(ISD::MULHS, DL, VT, {N0, N1}))
+    return C;
+

craig.topper (Done): Use DL variable that already exists

   // fold (mulhs x, 0) -> 0
   if (isNullConstant(N1))
     return N1;
   // fold (mulhs x, 1) -> (sra x, size(x)-1)
   if (isOneConstant(N1))
     return DAG.getNode(ISD::SRA, DL, N0.getValueType(), N0,
                        DAG.getConstant(N0.getScalarValueSizeInBits() - 1, DL,
                                        getShiftAmountTy(N0.getValueType())));
 ...
 SDValue DAGCombiner::visitMULHU(SDNode *N) {
 ...
   if (VT.isVector()) {
     // fold (mulhu x, 0) -> 0
     // do not return N0/N1, because undef node may exist.
     if (ISD::isConstantSplatVectorAllZeros(N0.getNode()) ||
         ISD::isConstantSplatVectorAllZeros(N1.getNode()))
       return DAG.getConstant(0, DL, VT);
   }
 
+  // fold (mulhu c1, c2)
+  if (SDValue C = DAG.FoldConstantArithmetic(ISD::MULHU, DL, VT, {N0, N1}))
+    return C;
+

craig.topper (Done): Use DL variable that already exists

   // fold (mulhu x, 0) -> 0
   if (isNullConstant(N1))
     return N1;
   // fold (mulhu x, 1) -> 0
   if (isOneConstant(N1))
     return DAG.getConstant(0, DL, N0.getValueType());
   // fold (mulhu x, undef) -> 0
   if (N0.isUndef() || N1.isUndef())
 ...

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

 static llvm::Optional<APInt> FoldValue(unsigned Opcode, const APInt &C1,
 ...
   case ISD::SDIV:
     if (!C2.getBoolValue())
       break;
     return C1.sdiv(C2);
   case ISD::SREM:
     if (!C2.getBoolValue())
       break;
     return C1.srem(C2);
+  case ISD::MULHS: {
+    unsigned FullWidth = C1.getBitWidth() * 2;
+    APInt C1Ext = C1.sext(FullWidth);

foad (Done): You don't need "OrTrunc" here and below.

+    APInt C2Ext = C2.sext(FullWidth);
+    return (C1Ext * C2Ext).extractBits(C1.getBitWidth(), C1.getBitWidth());

RKSimon (Done): APInt::extractBits ?

+  }
+  case ISD::MULHU: {
+    unsigned FullWidth = C1.getBitWidth() * 2;
+    APInt C1Ext = C1.zext(FullWidth);
+    APInt C2Ext = C2.zext(FullWidth);
+    return (C1Ext * C2Ext).extractBits(C1.getBitWidth(), C1.getBitWidth());

RKSimon (Done): APInt::extractBits ?

+  }
   }
   return llvm::None;
 }
 
 SDValue SelectionDAG::FoldSymbolOffset(unsigned Opcode, EVT VT,
                                        const GlobalAddressSDNode *GA,
                                        const SDNode *N2) {
   if (GA->getOpcode() != ISD::GlobalAddress)
 ...
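
As a rough illustration of what the two new FoldValue cases compute, here is a standalone sketch for 32-bit operands that uses plain C++ integers in place of APInt: widen both operands to twice the width (sign-extend for MULHS, zero-extend for MULHU), multiply, and keep the upper half. The function names mulhs32 and mulhu32 are hypothetical and the fixed 32-bit width is an assumption made for the example; the patch itself works on arbitrary bit widths.

```cpp
#include <cassert>
#include <cstdint>

// Signed multiply-high: sign-extend to 64 bits, multiply, keep bits [63:32].
static int32_t mulhs32(int32_t A, int32_t B) {
  int64_t Wide = static_cast<int64_t>(A) * static_cast<int64_t>(B);
  // Shift as unsigned so that the extraction is a pure bit operation.
  uint32_t Hi = static_cast<uint32_t>(static_cast<uint64_t>(Wide) >> 32);
  return static_cast<int32_t>(Hi);
}

// Unsigned multiply-high: zero-extend to 64 bits, multiply, keep bits [63:32].
static uint32_t mulhu32(uint32_t A, uint32_t B) {
  uint64_t Wide = static_cast<uint64_t>(A) * static_cast<uint64_t>(B);
  return static_cast<uint32_t>(Wide >> 32);
}
```

These results agree with the existing special-case folds in DAGCombiner: mulhu32(x, 1) is always 0, and mulhs32(x, 1) equals x arithmetically shifted right by 31.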

llvm/test/CodeGen/AMDGPU/dagcombine-mulhs-const.ll

This file was added.

; RUN: llc -march=amdgcn -verify-machineinstrs -mcpu=gfx1030 < %s | FileCheck -check-prefix=GCN %s

; GCN-LABEL: {{^}}main:
; MULHS C1, C2 replacement results in 0x4237 in the following add
; GCN: s_add_u32 s0, 0x4237, s0

foad (Not Done): Obviously folding the mul_hi is good, but the s_add that you check for looks like this: s_mov_b32 s0, 0x346d900 ... s_add_u32 s0, 0x4237, s0 so it should also be folded to a constant!
dstuttard (Author, Done): Yes - that could be another one to do - then fix up this test (or not worry about it at all given that there's now an X86 test that tests this, thanks to Simon)

define amdgpu_cs void @main(<4 x i32> %0, <4 x i32> %1) {
main_body:
  %2 = call nsz arcp <2 x float> @llvm.amdgcn.raw.buffer.load.v2f32(<4 x i32> %0, i32 0, i32 0, i32 0)
  %3 = bitcast <2 x float> %2 to <2 x i32>
  %4 = extractelement <2 x i32> %3, i32 0
  %5 = insertelement <2 x i32> undef, i32 %4, i32 0
  %6 = insertelement <2 x i32> %5, i32 undef, i32 1
  %7 = bitcast <2 x i32> %6 to i64
  %8 = mul i64 %7, 1000000
  %9 = udiv i64 %8, 100000
  %10 = bitcast i64 %9 to <2 x i32>
  %11 = extractelement <2 x i32> %10, i32 1
  %12 = select i1 false, i32 undef, i32 %11
  %.not33 = icmp eq i32 0, 0
  %13 = select i1 %.not33, i32 %12, i32 0
  %14 = insertelement <2 x i32> undef, i32 %13, i32 1
  %15 = bitcast <2 x i32> %14 to <2 x float>
  call void @llvm.amdgcn.raw.buffer.store.v2f32(<2 x float> %15, <4 x i32> %1, i32 0, i32 0, i32 0)
  ret void
}

declare <2 x float> @llvm.amdgcn.raw.buffer.load.v2f32(<4 x i32>, i32, i32, i32 immarg) #0
declare void @llvm.amdgcn.raw.buffer.store.v2f32(<2 x float>, <4 x i32>, i32, i32, i32 immarg) #1
attributes #0 = { nounwind readonly willreturn }
attributes #1 = { nounwind willreturn writeonly }

This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Add support for mulhi const folding in DAGCombiner
ClosedPublic
