This is an archive of the discontinued LLVM Phabricator instance.

Differential D99759

[LoopUnroll] avoid assumption clone explosion
Abandoned · Public

Authored by spatel on Apr 1 2021, 11:59 AM.

Details

Reviewers

lebedev.ri
nikic
xbolva00
jdoerfert

Summary

The PhaseOrdering test example is based on:
https://llvm.org/PR49785

The test goes over the complexity cliff somewhere between -O2 and -O3 (more unrolling is done at -O3).
The initial assume is created by SimplifyCFG, multiplied by the LoopVectorizer, and then copied across numerous cloned blocks by LoopUnroll. LoopUnroll then calls SimplifyInstruction(), which calls simplifyICmpWithDominatingAssume(), and we seem to spend all of our time in there computing known bits, etc.

It seems unlikely that we have much to gain by cloning assumes late in the opt pipeline, so I'm proposing to stub that out when called during LoopUnroll.
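
To illustrate the proposal above, here is a minimal standalone sketch of the idea, not the patch itself. `ToyInst`, `ToyBlock`, `cloneToyBlock` and `countAssumesAfterUnroll` are hypothetical stand-ins invented for this example; only the `ShouldCloneAssumes` flag name comes from the diff. It assumes a loop body made of a single block.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for LLVM IR; these are NOT the real llvm::Instruction or
// llvm::BasicBlock classes.
struct ToyInst {
  std::string Name;
  bool IsAssume; // true models a call to llvm.assume
};
using ToyBlock = std::vector<ToyInst>;

// Copy a block and append Suffix to every name. When ShouldCloneAssumes is
// false, assumes are skipped, in the spirit of the flag proposed in the diff.
ToyBlock cloneToyBlock(const ToyBlock &BB, const std::string &Suffix,
                       bool ShouldCloneAssumes = true) {
  ToyBlock NewBB;
  for (const ToyInst &I : BB) {
    if (!ShouldCloneAssumes && I.IsAssume)
      continue;
    NewBB.push_back({I.Name + Suffix, I.IsAssume});
  }
  return NewBB;
}

// Count the assumes present after unrolling a single-block body Count times:
// the original body plus Count - 1 clones.
std::size_t countAssumesAfterUnroll(const ToyBlock &Body, unsigned Count,
                                    bool ShouldCloneAssumes) {
  assert(Count >= 1 && "unroll count must be at least 1");
  std::size_t NumAssumes = 0;
  for (const ToyInst &I : Body)
    if (I.IsAssume)
      ++NumAssumes;
  for (unsigned It = 1; It != Count; ++It) {
    ToyBlock Clone =
        cloneToyBlock(Body, "." + std::to_string(It), ShouldCloneAssumes);
    for (const ToyInst &I : Clone)
      if (I.IsAssume)
        ++NumAssumes;
  }
  return NumAssumes;
}
```

In this toy model, a body with one assume unrolled by 8 ends up with 8 assumes when they are cloned and 1 when they are not; the original block always keeps its own assume.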

Diff Detail

Event Timeline

spatel created this revision. Apr 1 2021, 11:59 AM

Herald added subscribers: zzheng, hiraditya, mcrosier. Apr 1 2021, 11:59 AM

spatel requested review of this revision. Apr 1 2021, 11:59 AM

Herald added a project: Restricted Project. Apr 1 2021, 11:59 AM

Harbormaster completed remote builds in B96782: Diff 334790. Apr 1 2021, 11:59 AM

Ping.

Taking your example from https://bugs.llvm.org/show_bug.cgi?id=49785#c2, compile time seems to scale at about O(n^5.5) in the iteration count, which is beyond all reasonable bounds. I think this needs to be mitigated at a lower level than unrolling.
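
As a side note on how an exponent like this can be read off from timings: if compile time behaves like t ≈ c·n^k, two measurements give k = log(t2/t1) / log(n2/n1). The helper below is a small sketch of that arithmetic; `growthExponent` is a hypothetical name, and no measurements from the bug report are reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Estimate k in t ~ c * n^k from two (n, t) measurements by taking the slope
// between them on a log-log scale.
double growthExponent(double N1, double T1, double N2, double T2) {
  assert(N1 > 0 && N2 > 0 && N1 != N2 && "need two distinct positive sizes");
  assert(T1 > 0 && T2 > 0 && "timings must be positive");
  return std::log(T2 / T1) / std::log(N2 / N1);
}
```

For scale, with k = 5.5 doubling the iteration count multiplies the time by 2^5.5, which is roughly 45.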

For reference, this is the IR after unrolling but before simplification for the PhaseOrdering test: https://gist.github.com/nikic/efd3aae8b9282902a418f99e01ff5134 (note that the mass of assumes on extractvalues is unproblematic; the issue is the assumes on %.pre.pre).

Limiting the number of assumes we look at for one value helps (and is something we should probably do as well), but I think there's a more fundamental problem with the recursion in computeKnownBitsForAssume. Possibly the exclusion mechanism there needs to be strengthened.
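
A rough sketch of what "limiting the number of assumes we look at for one value" could look like, written as a standalone toy rather than against ValueTracking. `ToyAssume`, `knownZeroFromAssumes` and the `MaxAssumes` cap are all hypothetical names for this example; the real analysis tracks far more than a known-zero mask.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: each assumption about a value contributes a mask of bits that
// are known to be zero. This is NOT LLVM's KnownBits representation.
struct ToyAssume {
  unsigned KnownZeroMask;
};

// Merge known-zero bits from at most MaxAssumes assumptions. Anything past
// the cap is ignored, trading some precision for bounded work per query.
unsigned knownZeroFromAssumes(const std::vector<ToyAssume> &Assumes,
                              std::size_t MaxAssumes) {
  unsigned KnownZero = 0;
  std::size_t Visited = 0;
  for (const ToyAssume &A : Assumes) {
    if (Visited++ == MaxAssumes)
      break; // hypothetical cap reached; stop scanning
    KnownZero |= A.KnownZeroMask;
  }
  return KnownZero;
}
```

Ignoring assumptions is always conservative in this model: the result can only have fewer known bits, never wrong ones.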

spatel mentioned this in D100408: [ValueTracking][InstSimplify] improve efficiency for detecting non-zero value. Apr 13 2021, 12:49 PM

spatel mentioned this in rG7ef2c68a3d24: [InstSimplify] improve efficiency for detecting non-zero value. Apr 14 2021, 6:12 AM

spatel mentioned this in D100573: [ValueTracking] don't recursively compute known bits using multiple llvm.assumes. Apr 15 2021, 8:56 AM

spatel mentioned this in rGbb907b26e2bf: [ValueTracking] don't recursively compute known bits using multiple llvm.assumes. Apr 16 2021, 5:46 AM

Abandoning.
We decided that this really is a ValueTracking problem.
I committed D100573 / bb907b26e2bf as an alternate fix for the bug.
Let's see if there are any regressions -- and potentially compile-time improvements -- from the loss of assumption analysis power.
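
To give a feel for why recursing through multiple assumes is costly, here is a toy cost model. It is an assumption-laden sketch, not a description of what D100573 changes line by line: it supposes every query consults `NumAssumes` assumptions and, in the recursive variant, each of those triggers another full query one level deeper. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy cost model, NOT LLVM's ValueTracking: count the queries issued when
// each query consults NumAssumes assumptions and each assumption recurses
// into another query, down to MaxDepth levels.
std::uint64_t queriesWithRecursion(unsigned NumAssumes, unsigned MaxDepth) {
  assert(NumAssumes <= 64 && MaxDepth <= 8 &&
         "toy model: keep totals within 64 bits");
  std::uint64_t Total = 1; // deepest level: a single query, no recursion
  for (unsigned D = 0; D != MaxDepth; ++D)
    Total = 1 + NumAssumes * Total; // this query plus one subtree per assume
  return Total;
}

// Non-recursive variant: the top-level query consults each assumption once
// and does not use other assumptions while doing so.
std::uint64_t queriesWithoutRecursion(unsigned NumAssumes) {
  return 1 + static_cast<std::uint64_t>(NumAssumes);
}
```

Under these assumptions, 16 assumes with a hypothetical depth limit of 6 give 17,895,697 queries with recursion versus 17 without, which is the kind of gap that makes giving up some analysis power attractive.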

Also, I think it's still worth including the test here as a regression test, so we will know quickly if we fall into this problem again.
But it will be 6K+ lines of IR with all of the current -O3 expansion, so this may be one of the rare times where it's better to not auto-generate the full CHECK lines!

spatel mentioned this in rG437fb4281787: [PhaseOrdering] add test to track PR49785; NFC. Apr 16 2021, 6:42 AM

Revision Contents

Path                                                          Size
llvm/include/llvm/Transforms/Utils/Cloning.h                  14 lines
llvm/lib/Transforms/Utils/CloneFunction.cpp                   10 lines
llvm/lib/Transforms/Utils/LoopUnroll.cpp                      4 lines
llvm/test/Transforms/GVNSink/assumption.ll                    4 lines
llvm/test/Transforms/PhaseOrdering/X86/assume-explosion.ll    206 lines

Diff 334790

llvm/include/llvm/Transforms/Utils/Cloning.h

struct ClonedCodeInfo {
/// originally inserted callsites were DCE'ed after they were cloned.		/// originally inserted callsites were DCE'ed after they were cloned.
std::vector<WeakTrackingVH> OperandBundleCallSites;		std::vector<WeakTrackingVH> OperandBundleCallSites;

ClonedCodeInfo() = default;		ClonedCodeInfo() = default;
};		};

/// Return a copy of the specified basic block, but without		/// Return a copy of the specified basic block, but without
/// embedding the block into a particular function. The block returned is an		/// embedding the block into a particular function. The block returned is an
/// exact copy of the specified basic block, without any remapping having been		/// exact copy (see possible exception below) of the specified basic block,
/// performed. Because of this, this is only suitable for applications where		/// without any remapping having been performed. Because of this, this is only
/// the basic block will be inserted into the same function that it was cloned		/// suitable for applications where the basic block will be inserted into the
/// from (loop unrolling would use this, for example).		/// same function that it was cloned from (loop unrolling would use this, for
		/// example).
///		///
/// Also, note that this function makes a direct copy of the basic block, and		/// Also, note that this function makes a direct copy of the basic block, and
/// can thus produce illegal LLVM code. In particular, it will copy any PHI		/// can thus produce illegal LLVM code. In particular, it will copy any PHI
/// nodes from the original block, even though there are no predecessors for the		/// nodes from the original block, even though there are no predecessors for the
/// newly cloned block (thus, phi nodes will have to be updated). Also, this		/// newly cloned block (thus, phi nodes will have to be updated). Also, this
/// block will branch to the old successors of the original block: these		/// block will branch to the old successors of the original block: these
/// successors will have to have any PHI nodes updated to account for the new		/// successors will have to have any PHI nodes updated to account for the new
/// incoming edges.		/// incoming edges.
///		///
/// The correlation between instructions in the source and result basic blocks		/// The correlation between instructions in the source and result basic blocks
/// is recorded in the VMap map.		/// is recorded in the VMap map.
///		///
/// If you have a particular suffix you'd like to use to add to any cloned		/// If you have a particular suffix you'd like to use to add to any cloned
/// names, specify it as the optional third parameter.		/// names, specify it as the optional third parameter.
///		///
/// If you would like the basic block to be auto-inserted into the end of a		/// If you would like the basic block to be auto-inserted into the end of a
/// function, you can specify it as the optional fourth parameter.		/// function, you can specify it as the optional fourth parameter.
///		///
/// If you would like to collect additional information about the cloned		/// If you would like to collect additional information about the cloned
/// function, you can specify a ClonedCodeInfo object with the optional fifth		/// function, you can specify a ClonedCodeInfo object with the optional fifth
/// parameter.		/// parameter.
		///
		/// To exclude cloning of assumptions, set the optional 7th parameter to false.
BasicBlock *CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,		BasicBlock *CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,
const Twine &NameSuffix = "", Function *F = nullptr,		const Twine &NameSuffix = "", Function *F = nullptr,
ClonedCodeInfo *CodeInfo = nullptr,		ClonedCodeInfo *CodeInfo = nullptr,
DebugInfoFinder *DIFinder = nullptr);		DebugInfoFinder *DIFinder = nullptr,
		bool ShouldCloneAssumes = true);

/// Return a copy of the specified function and add it to that		/// Return a copy of the specified function and add it to that
/// function's module. Also, any references specified in the VMap are changed		/// function's module. Also, any references specified in the VMap are changed
/// to refer to their mapped value instead of the original one. If any of the		/// to refer to their mapped value instead of the original one. If any of the
/// arguments to the function are in the VMap, the arguments are deleted from		/// arguments to the function are in the VMap, the arguments are deleted from
/// the resultant function. The VMap is updated to include mappings from all of		/// the resultant function. The VMap is updated to include mappings from all of
/// the instructions and basicblocks in the function from their old to new		/// the instructions and basicblocks in the function from their old to new
/// values. The final argument captures information about the cloned code if		/// values. The final argument captures information about the cloned code if

llvm/lib/Transforms/Utils/CloneFunction.cpp

	using namespace llvm;			using namespace llvm;

	#define DEBUG_TYPE "clone-function"			#define DEBUG_TYPE "clone-function"

	/// See comments in Cloning.h.			/// See comments in Cloning.h.
	BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,			BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,
	const Twine &NameSuffix, Function *F,			const Twine &NameSuffix, Function *F,
	ClonedCodeInfo *CodeInfo,			ClonedCodeInfo *CodeInfo,
	DebugInfoFinder *DIFinder) {			DebugInfoFinder *DIFinder,
				bool ShouldCloneAssumes) {
	DenseMap<const MDNode *, MDNode *> Cache;			DenseMap<const MDNode *, MDNode *> Cache;
	BasicBlock *NewBB = BasicBlock::Create(BB->getContext(), "", F);			BasicBlock *NewBB = BasicBlock::Create(BB->getContext(), "", F);
	if (BB->hasName())			if (BB->hasName())
	NewBB->setName(BB->getName() + NameSuffix);			NewBB->setName(BB->getName() + NameSuffix);

	bool hasCalls = false, hasDynamicAllocas = false;			bool hasCalls = false, hasDynamicAllocas = false;
	Module *TheModule = F ? F->getParent() : nullptr;			Module *TheModule = F ? F->getParent() : nullptr;

	// Loop over all instructions, and copy them over.			// Loop over all instructions, and copy them over.
	for (const Instruction &I : *BB) {			for (const Instruction &I : *BB) {
				// A caller (for example LoopUnroll) may want to avoid cloning assumptions
				// because they are not useful and potentially expensive to analyze.
				if (!ShouldCloneAssumes)
				if (auto *II = dyn_cast<IntrinsicInst>(&I))
				if (II->getIntrinsicID() == Intrinsic::assume)
				continue;

	if (DIFinder && TheModule)			if (DIFinder && TheModule)
	DIFinder->processInstruction(*TheModule, I);			DIFinder->processInstruction(*TheModule, I);

	Instruction *NewInst = I.clone();			Instruction *NewInst = I.clone();
	if (I.hasName())			if (I.hasName())
	NewInst->setName(I.getName() + NameSuffix);			NewInst->setName(I.getName() + NameSuffix);
	NewBB->getInstList().push_back(NewInst);			NewBB->getInstList().push_back(NewInst);
	VMap[&I] = NewInst; // Add instruction map to value.			VMap[&I] = NewInst; // Add instruction map to value.

llvm/lib/Transforms/Utils/LoopUnroll.cpp

LoopUnrollResult llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,

for (unsigned It = 1; It != ULO.Count; ++It) {		for (unsigned It = 1; It != ULO.Count; ++It) {
SmallVector<BasicBlock *, 8> NewBlocks;		SmallVector<BasicBlock *, 8> NewBlocks;
SmallDenseMap<const Loop *, Loop *, 4> NewLoops;		SmallDenseMap<const Loop *, Loop *, 4> NewLoops;
NewLoops[L] = L;		NewLoops[L] = L;

for (LoopBlocksDFS::RPOIterator BB = BlockBegin; BB != BlockEnd; ++BB) {		for (LoopBlocksDFS::RPOIterator BB = BlockBegin; BB != BlockEnd; ++BB) {
ValueToValueMapTy VMap;		ValueToValueMapTy VMap;
BasicBlock *New = CloneBasicBlock(*BB, VMap, "." + Twine(It));		BasicBlock *New =
		CloneBasicBlock(*BB, VMap, "." + Twine(It), nullptr, nullptr, nullptr,
		/* ShouldCloneAssumes */ false);
Header->getParent()->getBasicBlockList().push_back(New);		Header->getParent()->getBasicBlockList().push_back(New);

assert((*BB != Header || LI->getLoopFor(*BB) == L) &&		assert((*BB != Header || LI->getLoopFor(*BB) == L) &&
"Header should not be in a sub-loop");		"Header should not be in a sub-loop");
// Tell LI about New.		// Tell LI about New.
const Loop *OldLoop = addClonedBlockToLoopInfo(*BB, New, LI, NewLoops);		const Loop *OldLoop = addClonedBlockToLoopInfo(*BB, New, LI, NewLoops);
if (OldLoop)		if (OldLoop)
LoopsToSimplify.insert(NewLoops[OldLoop]);		LoopsToSimplify.insert(NewLoops[OldLoop]);

llvm/test/Transforms/GVNSink/assumption.ll

	; CHECK-LABEL: @main(			; CHECK-LABEL: @main(
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK-NEXT: br label [[BB4_I:%.*]]			; CHECK-NEXT: br label [[BB4_I:%.*]]
	; CHECK: bb4.i:			; CHECK: bb4.i:
	; CHECK-NEXT: [[I1_I:%.*]] = load volatile i32, i32* @g, align 4			; CHECK-NEXT: [[I1_I:%.*]] = load volatile i32, i32* @g, align 4
	; CHECK-NEXT: [[I32_I:%.*]] = icmp eq i32 [[I1_I]], 0			; CHECK-NEXT: [[I32_I:%.*]] = icmp eq i32 [[I1_I]], 0
	; CHECK-NEXT: call void @llvm.assume(i1 [[I32_I]])			; CHECK-NEXT: call void @llvm.assume(i1 [[I32_I]])
	; CHECK-NEXT: [[I1_I_1:%.*]] = load volatile i32, i32* @g, align 4			; CHECK-NEXT: [[I1_I_1:%.*]] = load volatile i32, i32* @g, align 4
	; CHECK-NEXT: [[I32_I_1:%.*]] = icmp eq i32 [[I1_I_1]], 0
	; CHECK-NEXT: call void @llvm.assume(i1 [[I32_I_1]])
	; CHECK-NEXT: [[I1_I_2:%.*]] = load volatile i32, i32* @g, align 4			; CHECK-NEXT: [[I1_I_2:%.*]] = load volatile i32, i32* @g, align 4
	; CHECK-NEXT: [[I32_I_2:%.*]] = icmp eq i32 [[I1_I_2]], 0
	; CHECK-NEXT: call void @llvm.assume(i1 [[I32_I_2]])
	; CHECK-NEXT: br label [[BB4_I]], !llvm.loop [[LOOP0:![0-9]+]]			; CHECK-NEXT: br label [[BB4_I]], !llvm.loop [[LOOP0:![0-9]+]]
	; CHECK: func_1.exit:			; CHECK: func_1.exit:
	; CHECK-NEXT: unreachable			; CHECK-NEXT: unreachable
	;			;
	bb:			bb:
	%i1.i = load volatile i32, i32* @g			%i1.i = load volatile i32, i32* @g
	%i32.i = icmp eq i32 %i1.i, 0			%i32.i = icmp eq i32 %i1.i, 0
	call void @llvm.assume(i1 %i32.i) #3			call void @llvm.assume(i1 %i32.i) #3

llvm/test/Transforms/PhaseOrdering/X86/assume-explosion.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -O3 -S < %s | FileCheck %s

				; This test confirms that we do not create assumes,
				; clone them excessively, and then cause a compile-time
				; explosion trying to simplify them all.

				target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx11.0.0"

				@e = global i16 0, align 2
				@a = global i32 0, align 4
				@c = global i32 0, align 4
				@b = global i32 0, align 4
				@d = global i32 0, align 4

				define void @f() #0 {
				; CHECK-LABEL: @f(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: store i32 5, i32* @c, align 4, !tbaa [[TBAA3:![0-9]+]]
				; CHECK-NEXT: [[DOTPRE_PRE:%.*]] = load i32, i32* @b, align 4, !tbaa [[TBAA3]]
				; CHECK-NEXT: [[DOTPRE7_PRE:%.*]] = load i32, i32* @d, align 4, !tbaa [[TBAA3]]
				; CHECK-NEXT: [[XOR:%.*]] = xor i32 [[DOTPRE_PRE]], 57
				; CHECK-NEXT: [[CMP6_NOT:%.*]] = icmp eq i32 [[XOR]], [[DOTPRE7_PRE]]
				; CHECK-NEXT: [[BROADCAST_SPLATINSERT9:%.*]] = insertelement <16 x i32> poison, i32 [[DOTPRE7_PRE]], i32 0
				; CHECK-NEXT: [[BROADCAST_SPLAT10:%.*]] = shufflevector <16 x i32> [[BROADCAST_SPLATINSERT9]], <16 x i32> poison, <16 x i32> zeroinitializer
				; CHECK-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <16 x i32> poison, i32 [[DOTPRE_PRE]], i32 0
				; CHECK-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <16 x i32> [[BROADCAST_SPLATINSERT]], <16 x i32> poison, <16 x i32> zeroinitializer
				; CHECK-NEXT: [[TMP0:%.*]] = xor <16 x i32> [[BROADCAST_SPLAT]], <i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24>
				; CHECK-NEXT: [[TMP1:%.*]] = icmp eq <16 x i32> [[TMP0]], [[BROADCAST_SPLAT10]]
				; CHECK-NEXT: [[TMP2:%.*]] = extractelement <16 x i1> [[TMP1]], i32 0
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP2]])
				; CHECK-NEXT: [[TMP3:%.*]] = extractelement <16 x i1> [[TMP1]], i32 1
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP3]])
				; CHECK-NEXT: [[TMP4:%.*]] = extractelement <16 x i1> [[TMP1]], i32 2
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP4]])
				; CHECK-NEXT: [[TMP5:%.*]] = extractelement <16 x i1> [[TMP1]], i32 3
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP5]])
				; CHECK-NEXT: [[TMP6:%.*]] = extractelement <16 x i1> [[TMP1]], i32 4
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP6]])
				; CHECK-NEXT: [[TMP7:%.*]] = extractelement <16 x i1> [[TMP1]], i32 5
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP7]])
				; CHECK-NEXT: [[TMP8:%.*]] = extractelement <16 x i1> [[TMP1]], i32 6
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP8]])
				; CHECK-NEXT: [[TMP9:%.*]] = extractelement <16 x i1> [[TMP1]], i32 7
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP9]])
				; CHECK-NEXT: [[TMP10:%.*]] = extractelement <16 x i1> [[TMP1]], i32 8
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP10]])
				; CHECK-NEXT: [[TMP11:%.*]] = extractelement <16 x i1> [[TMP1]], i32 9
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP11]])
				; CHECK-NEXT: [[TMP12:%.*]] = extractelement <16 x i1> [[TMP1]], i32 10
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP12]])
				; CHECK-NEXT: [[TMP13:%.*]] = extractelement <16 x i1> [[TMP1]], i32 11
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP13]])
				; CHECK-NEXT: [[TMP14:%.*]] = extractelement <16 x i1> [[TMP1]], i32 12
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP14]])
				; CHECK-NEXT: [[TMP15:%.*]] = extractelement <16 x i1> [[TMP1]], i32 13
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP15]])
				; CHECK-NEXT: [[TMP16:%.*]] = extractelement <16 x i1> [[TMP1]], i32 14
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP16]])
				; CHECK-NEXT: [[TMP17:%.*]] = extractelement <16 x i1> [[TMP1]], i32 15
				; CHECK-NEXT: call void @llvm.assume(i1 [[TMP17]])
				; CHECK-NEXT: call void @llvm.assume(i1 [[CMP6_NOT]])
				; CHECK-NEXT: br label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[TMP18:%.*]] = phi i32 [ 5, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[VECTOR_PH]] ]
				; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[TMP18]], 1
				; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[TMP18]], 63
				; CHECK-NEXT: br i1 [[CMP]], label [[VECTOR_PH]], label [[FOR_END34:%.*]], !llvm.loop [[LOOP7:![0-9]+]]
				; CHECK: for.end34:
				; CHECK-NEXT: store i32 [[INC]], i32* @c, align 4, !tbaa [[TBAA3]]
				; CHECK-NEXT: store i16 61, i16* @e, align 2, !tbaa [[TBAA9:![0-9]+]]
				; CHECK-NEXT: ret void
				;
				entry:
				store i32 5, i32* @c, align 4, !tbaa !3
				br label %for.cond

				for.cond:
				%0 = load i32, i32* @c, align 4, !tbaa !3
				%cmp = icmp sle i32 %0, 63
				br i1 %cmp, label %for.body, label %for.end34

				for.body:
				store i16 9, i16* @e, align 2, !tbaa !7
				br label %for.cond1

				for.cond1:
				%1 = load i16, i16* @e, align 2, !tbaa !7
				%conv = zext i16 %1 to i32
				%cmp2 = icmp sle i32 %conv, 60
				br i1 %cmp2, label %for.body4, label %for.end32

				for.body4:
				%2 = load i16, i16* @e, align 2, !tbaa !7
				%conv5 = zext i16 %2 to i32
				%3 = load i32, i32* @b, align 4, !tbaa !3
				%xor = xor i32 %conv5, %3
				%4 = load i32, i32* @d, align 4, !tbaa !3
				%cmp6 = icmp ne i32 %xor, %4
				br i1 %cmp6, label %if.then, label %if.end27

				if.then:
				%5 = load i32, i32* @a, align 4, !tbaa !3
				%conv8 = sext i32 %5 to i64
				%6 = inttoptr i64 %conv8 to i8*
				store i8 3, i8* %6, align 1, !tbaa !9
				br label %for.cond9

				for.cond9:
				%7 = load i8, i8* %6, align 1, !tbaa !9
				%conv10 = sext i8 %7 to i32
				%cmp11 = icmp sle i32 %conv10, 32
				br i1 %cmp11, label %for.body13, label %for.end26

				for.body13:
				%8 = load i8, i8* %6, align 1, !tbaa !9
				%tobool = icmp ne i8 %8, 0
				br i1 %tobool, label %if.then14, label %if.end

				if.then14:
				store i8 1, i8* bitcast (i32* @a to i8*), align 1, !tbaa !9
				br label %for.cond15

				for.cond15:
				%9 = load i8, i8* bitcast (i32* @a to i8*), align 1, !tbaa !9
				%conv16 = sext i8 %9 to i32
				%cmp17 = icmp sle i32 %conv16, 30
				br i1 %cmp17, label %for.body19, label %for.end

				for.body19:
				%10 = load i32, i32* @c, align 4, !tbaa !3
				%cmp20 = icmp eq i32 0, %10
				%conv21 = zext i1 %cmp20 to i32
				%11 = load i8, i8* bitcast (i32* @a to i8*), align 1, !tbaa !9
				%conv22 = sext i8 %11 to i32
				%and = and i32 %conv22, %conv21
				%conv23 = trunc i32 %and to i8
				store i8 %conv23, i8* bitcast (i32* @a to i8*), align 1, !tbaa !9
				br label %for.cond15, !llvm.loop !10

				for.end:
				br label %if.end

				if.end:
				br label %for.inc

				for.inc:
				%12 = load i8, i8* %6, align 1, !tbaa !9
				%conv24 = sext i8 %12 to i32
				%add = add nsw i32 %conv24, 1
				%conv25 = trunc i32 %add to i8
				store i8 %conv25, i8* %6, align 1, !tbaa !9
				br label %for.cond9, !llvm.loop !12

				for.end26:
				br label %if.end27

				if.end27:
				br label %for.inc28

				for.inc28:
				%13 = load i16, i16* @e, align 2, !tbaa !7
				%conv29 = zext i16 %13 to i32
				%add30 = add nsw i32 %conv29, 1
				%conv31 = trunc i32 %add30 to i16
				store i16 %conv31, i16* @e, align 2, !tbaa !7
				br label %for.cond1, !llvm.loop !13

				for.end32:
				br label %for.inc33

				for.inc33:
				%14 = load i32, i32* @c, align 4, !tbaa !3
				%inc = add nsw i32 %14, 1
				store i32 %inc, i32* @c, align 4, !tbaa !3
				br label %for.cond, !llvm.loop !14

				for.end34:
				ret void
				}

				declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1
				declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #1

				attributes #0 = { nounwind ssp uwtable "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "tune-cpu"="generic" }
				attributes #1 = { argmemonly nofree nosync nounwind willreturn }

				!llvm.module.flags = !{!0, !1}
				!llvm.ident = !{!2}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{i32 7, !"PIC Level", i32 2}
				!2 = !{!"clang version 13.0.0 (https://github.com/llvm/llvm-project.git 7a4abc07dd8f1d8217e482ebbf438197c1aea7f0)"}
				!3 = !{!4, !4, i64 0}
				!4 = !{!"int", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
				!7 = !{!8, !8, i64 0}
				!8 = !{!"short", !5, i64 0}
				!9 = !{!5, !5, i64 0}
				!10 = distinct !{!10, !11}
				!11 = !{!"llvm.loop.mustprogress"}
				!12 = distinct !{!12, !11}
				!13 = distinct !{!13, !11}
				!14 = distinct !{!14, !11}
