This is an archive of the discontinued LLVM Phabricator instance.

[ExpandReductions] Don't push all intrinsics to the worklist. Just push reductions.
ClosedPublic

Authored by craig.topper on Oct 26 2019, 6:12 PM.


Details

Reviewers

RKSimon
spatel
aemerson

Commits

rG17bb2d7c803d: [ExpandReductions] Don't push all intrinsics to the worklist. Just push…

Summary

We were previously pushing all intrinsics used in a function to the
worklist. This is wasteful for memory in a function with a lot of
intrinsics.

We also ask TTI if we should expand every intrinsic, but we only
have expansion support for the reduction intrinsics. This just
wastes time for the non-reduction intrinsics.

This patch only pushes reduction intrinsics into the worklist and
skips other intrinsics.
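
The effect described in this summary can be sketched outside LLVM with a small self-contained model. Everything below (ToyIntrinsic, isReduction, collectAll, collectReductions) is a hypothetical stand-in invented for illustration and is not part of LLVM's API. It only shows why filtering while collecting keeps the worklist small.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Intrinsic::ID; this is NOT LLVM's real enum.
enum class ToyIntrinsic { Memcpy, DbgValue, ReduceAdd, ReduceFMax, Sqrt };

// Plays the role of the patch's switch: only reduction kinds qualify.
bool isReduction(ToyIntrinsic ID) {
  switch (ID) {
  case ToyIntrinsic::ReduceAdd:
  case ToyIntrinsic::ReduceFMax:
    return true;
  default:
    return false;
  }
}

// Old behaviour (modelled): every intrinsic is queued and filtered later.
std::vector<ToyIntrinsic> collectAll(const std::vector<ToyIntrinsic> &Body) {
  return Body;
}

// New behaviour (modelled): filter while collecting, so the worklist only
// ever holds entries that can actually be expanded.
std::vector<ToyIntrinsic>
collectReductions(const std::vector<ToyIntrinsic> &Body) {
  std::vector<ToyIntrinsic> Worklist;
  for (ToyIntrinsic ID : Body)
    if (isReduction(ID))
      Worklist.push_back(ID);
  return Worklist;
}
```

In this model, a function body with five intrinsics of which two are reductions yields a worklist of two entries instead of five.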

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 40093
Build 40175: arc lint + arc unit

Event Timeline

craig.topper created this revision. Oct 26 2019, 6:12 PM

Herald added a project: Restricted Project. Oct 26 2019, 6:12 PM

Herald added a subscriber: hiraditya.

Harbormaster completed remote builds in B40093: Diff 226558. Oct 26 2019, 6:14 PM

RKSimon added inline comments. Oct 28 2019, 2:50 AM

llvm/lib/CodeGen/ExpandReductions.cpp
98	Do we gain anything by checking TTI->shouldExpandReduction(II) here to decide whether to add to the Worklist?

simoll added a subscriber: simoll. Oct 28 2019, 7:33 AM

spatel added inline comments. Oct 28 2019, 7:47 AM

llvm/lib/CodeGen/ExpandReductions.cpp
81	Could reduce with something like:
for (auto &I : instructions(F)) {
Note - either way, this pass may be in danger of dying if used on unreachable blocks that contain weird IR. See for example: D67766

DoktorC added a subscriber: DoktorC. Oct 28 2019, 7:58 AM

craig.topper marked an inline comment as done. Nov 13 2019, 8:10 PM

craig.topper added inline comments.

llvm/lib/CodeGen/ExpandReductions.cpp
81	Does the UnreachableBlockEliminationPass that we run earlier in the codegen pipeline help prevent that?

Call shouldExpandReduction before putting in the worklist
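
As a rough illustration of this update, the sketch below models asking a target hook before an entry is pushed, so that non-reductions never reach the hook and rejected reductions never occupy the worklist. All names here (ToyKind, ToyCall, buildWorklist, the ShouldExpand callback, the HookQueries counter) are hypothetical. ShouldExpand merely stands in for TTI->shouldExpandReduction, and none of this is the committed code.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-ins; these are not LLVM types.
enum class ToyKind { Other, ReduceAdd, ReduceMul };

struct ToyCall {
  ToyKind Kind;
  int Id;
};

bool isReductionKind(ToyKind K) { return K != ToyKind::Other; }

// Builds the worklist in one pass. ShouldExpand models the target query;
// HookQueries counts how often it is consulted.
std::vector<ToyCall>
buildWorklist(const std::vector<ToyCall> &Body,
              const std::function<bool(const ToyCall &)> &ShouldExpand,
              int &HookQueries) {
  std::vector<ToyCall> Worklist;
  for (const ToyCall &C : Body) {
    if (!isReductionKind(C.Kind))
      continue; // non-reductions never reach the target hook
    ++HookQueries;
    if (ShouldExpand(C))
      Worklist.push_back(C); // only entries that will be expanded are kept
  }
  return Worklist;
}
```

With this shape, the later processing loop can assume every worklist entry is an expandable reduction, which matches the spirit of turning the processing switch's default case into an unreachable.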

Harbormaster completed remote builds in B40934: Diff 229224. Nov 13 2019, 9:05 PM

LGTM.

This revision is now accepted and ready to land. Nov 13 2019, 10:43 PM

spatel accepted this revision. Nov 14 2019, 5:17 AM

spatel added inline comments.

llvm/lib/CodeGen/ExpandReductions.cpp
81	Nice - I didn't know that existed. It's just a wrapper around a util function: llvm::EliminateUnreachableBlocks(). So yes, that should make it pretty safe. Still a chance that fuzzers will target this pass in a different pipeline or that something between that running and this running would create a dead block with bogus code, but the odds are low.

Closed by commit rG17bb2d7c803d: [ExpandReductions] Don't push all intrinsics to the worklist. Just push… (authored by craig.topper). Nov 14 2019, 10:34 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path: llvm/lib/CodeGen/ExpandReductions.cpp
Size: 34 lines

Diff 226558

llvm/lib/CodeGen/ExpandReductions.cpp

@@ (72 lines collapsed) @@ RecurrenceDescriptor::MinMaxRecurrenceKind getMRK(Intrinsic::ID ID) {
   default:
     return RecurrenceDescriptor::MRK_Invalid;
   }
 }

 bool expandReductions(Function &F, const TargetTransformInfo *TTI) {
   bool Changed = false;
   SmallVector<IntrinsicInst *, 4> Worklist;
-  for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
-    if (auto II = dyn_cast<IntrinsicInst>(&*I))
-      Worklist.push_back(II);
+  for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) {
+    if (auto II = dyn_cast<IntrinsicInst>(&*I)) {
+      switch (II->getIntrinsicID()) {
+      default: break;
+      case Intrinsic::experimental_vector_reduce_v2_fadd:
+      case Intrinsic::experimental_vector_reduce_v2_fmul:
+      case Intrinsic::experimental_vector_reduce_add:
+      case Intrinsic::experimental_vector_reduce_mul:
+      case Intrinsic::experimental_vector_reduce_and:
+      case Intrinsic::experimental_vector_reduce_or:
+      case Intrinsic::experimental_vector_reduce_xor:
+      case Intrinsic::experimental_vector_reduce_smax:
+      case Intrinsic::experimental_vector_reduce_smin:
+      case Intrinsic::experimental_vector_reduce_umax:
+      case Intrinsic::experimental_vector_reduce_umin:
+      case Intrinsic::experimental_vector_reduce_fmax:
+      case Intrinsic::experimental_vector_reduce_fmin:
+        Worklist.push_back(II);
+        break;
+      }
+    }
+  }

   for (auto *II : Worklist) {
     if (!TTI->shouldExpandReduction(II))
       continue;

     FastMathFlags FMF =
         isa<FPMathOperator>(II) ? II->getFastMathFlags() : FastMathFlags{};
     Intrinsic::ID ID = II->getIntrinsicID();
     RecurrenceDescriptor::MinMaxRecurrenceKind MRK = getMRK(ID);

     Value *Rdx = nullptr;
     IRBuilder<> Builder(II);
     IRBuilder<>::FastMathFlagGuard FMFGuard(Builder);
     Builder.setFastMathFlags(FMF);
     switch (ID) {
+    default: llvm_unreachable("Unexpected intrinsic!");
     case Intrinsic::experimental_vector_reduce_v2_fadd:
     case Intrinsic::experimental_vector_reduce_v2_fmul: {
       // FMFs must be attached to the call, otherwise it's an ordered reduction
       // and it can't be handled by generating a shuffle sequence.
       Value *Acc = II->getArgOperand(0);
       Value *Vec = II->getArgOperand(1);
       if (!FMF.allowReassoc())
         Rdx = getOrderedReduction(Builder, Acc, Vec, getOpcode(ID), MRK);
       else {
         Rdx = getShuffleReduction(Builder, Vec, getOpcode(ID), MRK);
         Rdx = Builder.CreateBinOp((Instruction::BinaryOps)getOpcode(ID),
                                   Acc, Rdx, "bin.rdx");
       }
-    } break;
+      break;
+    }
     case Intrinsic::experimental_vector_reduce_add:
     case Intrinsic::experimental_vector_reduce_mul:
     case Intrinsic::experimental_vector_reduce_and:
     case Intrinsic::experimental_vector_reduce_or:
     case Intrinsic::experimental_vector_reduce_xor:
     case Intrinsic::experimental_vector_reduce_smax:
     case Intrinsic::experimental_vector_reduce_smin:
     case Intrinsic::experimental_vector_reduce_umax:
     case Intrinsic::experimental_vector_reduce_umin:
     case Intrinsic::experimental_vector_reduce_fmax:
     case Intrinsic::experimental_vector_reduce_fmin: {
       Value *Vec = II->getArgOperand(0);
       Rdx = getShuffleReduction(Builder, Vec, getOpcode(ID), MRK);
-    } break;
-    default:
-      continue;
+      break;
+    }
     }
     II->replaceAllUsesWith(Rdx);
     II->eraseFromParent();
     Changed = true;
   }
   return Changed;
 }

(39 more lines not shown)