This is an archive of the discontinued LLVM Phabricator instance.


Differential D90104

[LoopUnroll] Duplicate noalias metadata
AbandonedPublic

Authored by nikic on Oct 24 2020, 1:58 PM.


Details

Reviewers: None

Summary

Fix for https://bugs.llvm.org/show_bug.cgi?id=39282. Scoped noalias metadata inside a loop might only be applicable within an iteration, so we should duplicate it into distinct domains when unrolling a loop.
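To make the problem concrete, here is a minimal, self-contained C++ sketch of how scoped noalias metadata answers alias queries, and why sharing one scope across unrolled iterations can be unsound. It is a simplified model, not LLVM's implementation: domains are ignored, scopes are plain integers, and `Access`, `coveredBy` and `provenNoAlias` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <set>

// Hypothetical toy model: a scope is just an integer id.
using ScopeSet = std::set<int>;

struct Access {
  ScopeSet AliasScope; // stands in for !alias.scope
  ScopeSet NoAlias;    // stands in for !noalias
};

// True if Scopes is non-empty and every scope in it is listed in NoAlias.
// Simplification: real scoped-noalias reasoning works per domain.
static bool coveredBy(const ScopeSet &Scopes, const ScopeSet &NoAlias) {
  if (Scopes.empty())
    return false;
  for (int S : Scopes)
    if (!NoAlias.count(S))
      return false;
  return true;
}

// Two accesses are proven not to alias if one access's scopes are fully
// covered by the other access's noalias list.
bool provenNoAlias(const Access &A, const Access &B) {
  return coveredBy(A.AliasScope, B.NoAlias) ||
         coveredBy(B.AliasScope, A.NoAlias);
}
```

In this model, a load tagged with scope 1 in one iteration and a store tagged `!noalias` on scope 1 in the next iteration are reported as non-aliasing, even though that guarantee may only have been established within a single iteration. Giving the second iteration its own scope (say 2) makes the cross-iteration query fall back to "may alias".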

Diff Detail

Event Timeline

nikic created this revision. Oct 24 2020, 1:58 PM

Herald added a project: Restricted Project. Oct 24 2020, 1:58 PM

Herald added subscribers: llvm-commits, zzheng, hiraditya.

nikic requested review of this revision. Oct 24 2020, 1:58 PM

Harbormaster completed remote builds in B76296: Diff 300501. Oct 24 2020, 2:34 PM

I think this makes sense, though it will probably only mitigate the problem.

llvm/lib/Transforms/Utils/LoopUnroll.cpp
262	Nit: Don't we have a range based version of this?
273	Do you pass MD by value on purpose? Also consider passing the return value by reference, though I doubt it makes much of a difference.
285	Nit: hoist and use clear, according to the programmers manual (or some other document we have).
286	Nit: range again?
304	Could you add some documentation here and above please.
307	Nit: range & auto?

It would make sense to add comments in the code to explain _why_ this is needed and why it is safe to do.
note: the transformation should be correct, as the unrolled version might now 'alias' with usages of the original scopes that are outside the loop.

aka, this will be a pessimization for some use cases (those where the cloning should not be needed)
For reference, D68510 is the patch in the 'full restrict patchset' (D68484) that adds similar, but more complete, support.

Meinersbur added a subscriber: Meinersbur. Oct 26 2020, 10:05 AM

Meinersbur added inline comments.

llvm/test/Transforms/LoopUnroll/noalias.ll
3	For robustness, the test case should specify the unroll factor, e.g. `-unroll-count=4`
48–59	Use regex to match MDNodes?

Extract ScopedAliasMetadataCloner utility to share between inlining and loop unrolling. Tidy up the code with range based loops etc.

I've now updated this to introduce a common utility (ScopedAliasMetadataCloner) that can handle essentially the same task in inlining and loop unrolling. If that looks reasonable, I can commit the utility separately from the new usage in LoopUnroll.

@jeroen.dobbelaere: Yeah, I'm aware of that. However, the full restrict patches are a large change that has stalled for a long time, so I think we need to address the miscompiles in a simple way in the meantime. I was also wondering if it might not make sense to split up the full restrict patches in a way that addresses individual problems without having to land the full machinery in one go. For example, by first introducing the noalias.decl function only, which I believe is the only part that's relevant for this particular problem.

In D90104#2366355, @nikic wrote:

@jeroen.dobbelaere: Yeah, I'm aware of that. However, the full restrict patches are a large change that has stalled for a long time, so I think we need to address the miscompiles in a simple way in the meantime. I was also wondering if it might not make sense to split up the full restrict patches in a way that addresses individual problems without having to land the full machinery in one go. For example, by first introducing the noalias.decl function only, which I believe is the only part that's relevant for this particular problem.

I was hoping to be further in the review process by now.. but getting somebody to review/help reviewing the documentation patch already seems to be hard :(

Next week, I can put some effort into preparing a patch with the 'llvm.noalias.decl', together with the necessary cloning for loop unroll.
That should help you, by fixing this problem. It should also help the full restrict patches, by already incorporating part of it in the main llvm.

Maybe this is something we can discuss tomorrow in the llvm alias analysis technical call (Nov 3, noon central time) ?
( http://lists.llvm.org/pipermail/llvm-dev/2020-November/146311.html )

In D90104#2368046, @jeroen.dobbelaere wrote:

[..]
Note: the correct time of the LLVM Alias Analysis Technical call is 12pm (noon). See http://lists.llvm.org/pipermail/llvm-dev/2020-November/146311.html

uenoku added a subscriber: uenoku. Nov 3 2020, 9:44 AM

As discussed during the LLVM Alias Analysis technical call, I will work this week on a fix using `llvm.noalias.decl`.

jrmuizel added a subscriber: jrmuizel. Nov 17 2020, 9:26 AM

MSxDOS added a subscriber: MSxDOS. Dec 4 2020, 2:24 PM

lzutao added a subscriber: lzutao. Dec 6 2020, 4:47 PM

Just an update: I am finalizing an alternative version using the @llvm.noalias.decl intrinsic. Should be there later today.

jeroen.dobbelaere mentioned this in D92887: [LoopUnroll] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed. Dec 8 2020, 2:46 PM

jeroen.dobbelaere mentioned this in D93040: [InlineFunction] Use llvm.experimental.noalias.scope.decl for noalias arguments.. Dec 10 2020, 8:30 AM

PaulGrandperrin added a subscriber: PaulGrandperrin. Jan 4 2021, 6:26 AM

penzn added a subscriber: penzn. Jan 5 2021, 10:09 AM

jeroen.dobbelaere mentioned this in rG2b9a834c43cb: [InlineFunction] Use llvm.experimental.noalias.scope.decl for noalias arguments.. Jan 23 2021, 3:12 AM

Abandoning this in favor of D92887.

jeroen.dobbelaere mentioned this in rG774629641bf3: [LoopUnroll] Use llvm.experimental.noalias.scope.decl for duplicating noalias…. Jan 24 2021, 4:49 AM

nikic abandoned this revision. Jan 31 2021, 9:10 AM

Revision Contents

Path

Size

llvm/

include/

llvm/

Transforms/

Utils/

Cloning.h

24 lines

lib/

Transforms/

Utils/

CloneFunction.cpp

87 lines

InlineFunction.cpp

78 lines

LoopUnroll.cpp

12 lines

test/

Transforms/

LoopUnroll/

noalias.ll

58 lines

PhaseOrdering/

X86/

vdiv.ll

62 lines

pr39282.ll

12 lines

Diff 302079

llvm/include/llvm/Transforms/Utils/Cloning.h

	Show All 11 Lines
	// functions, to copying basic blocks to support loop unrolling or superblock			// functions, to copying basic blocks to support loop unrolling or superblock
	// formation, etc.			// formation, etc.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#ifndef LLVM_TRANSFORMS_UTILS_CLONING_H			#ifndef LLVM_TRANSFORMS_UTILS_CLONING_H
	#define LLVM_TRANSFORMS_UTILS_CLONING_H			#define LLVM_TRANSFORMS_UTILS_CLONING_H

				#include "llvm/ADT/SetVector.h"
	#include "llvm/ADT/SmallVector.h"			#include "llvm/ADT/SmallVector.h"
	#include "llvm/ADT/Twine.h"			#include "llvm/ADT/Twine.h"
	#include "llvm/Analysis/AssumptionCache.h"			#include "llvm/Analysis/AssumptionCache.h"
	#include "llvm/Analysis/InlineCost.h"			#include "llvm/Analysis/InlineCost.h"
	#include "llvm/IR/ValueHandle.h"			#include "llvm/IR/ValueHandle.h"
	#include "llvm/Transforms/Utils/ValueMapper.h"			#include "llvm/Transforms/Utils/ValueMapper.h"
	#include <functional>			#include <functional>
	#include <memory>			#include <memory>
	▲ Show 20 Lines • Show All 235 Lines • ▼ Show 20 Lines

	/// Updates profile information by adjusting the entry count by adding			/// Updates profile information by adjusting the entry count by adding
	/// entryDelta then scaling callsite information by the new count divided by the			/// entryDelta then scaling callsite information by the new count divided by the
	/// old count. VMap is used during inlinng to also update the new clone			/// old count. VMap is used during inlinng to also update the new clone
	void updateProfileCallee(			void updateProfileCallee(
	Function *Callee, int64_t entryDelta,			Function *Callee, int64_t entryDelta,
	const ValueMap<const Value *, WeakTrackingVH> *VMap = nullptr);			const ValueMap<const Value *, WeakTrackingVH> *VMap = nullptr);

				/// Utility for cloning !noalias and !alias.scope metadata. When a code region
				/// using scoped alias metadata is cloned, the aliasing relationships may not
				/// hold between the two clones, in which case it is necessary to clone the
				/// metadata using this utility. This comes up with inlining and unrolling.
				class ScopedAliasMetadataCloner {
				  using MetadataMap = DenseMap<const MDNode *, TrackingMDNodeRef>;
				  SetVector<const MDNode *> MD;
				  MetadataMap Map;
				  void addRecursiveMetadataUses();

				public:
				  ScopedAliasMetadataCloner(ArrayRef<BasicBlock *> Blocks);
				  ScopedAliasMetadataCloner(const Function *F);

				  /// Create a new clone of the scoped alias metadata, which will be used by
				  /// subsequent remap() calls.
				  void clone();

				  /// Remap instructions in the given VMap from the original to the cloned
				  /// metadata.
				  void remap(ValueToValueMapTy &VMap);
				};

	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_UTILS_CLONING_H			#endif // LLVM_TRANSFORMS_UTILS_CLONING_H

llvm/lib/Transforms/Utils/CloneFunction.cpp

Show First 20 Lines • Show All 878 Lines • ▼ Show 20 Lines	for (unsigned i = 0, e = New->getNumOperands(); i != e; ++i)
auto I = ValueMapping.find(Inst);		auto I = ValueMapping.find(Inst);
if (I != ValueMapping.end())		if (I != ValueMapping.end())
New->setOperand(i, I->second);		New->setOperand(i, I->second);
}		}
}		}

return NewBB;		return NewBB;
}		}

		ScopedAliasMetadataCloner::ScopedAliasMetadataCloner(
		    ArrayRef<BasicBlock *> Blocks) {
		  for (BasicBlock *BB : Blocks) {
		    for (const Instruction &I : *BB) {
		      if (const MDNode *M = I.getMetadata(LLVMContext::MD_alias_scope))
		        MD.insert(M);
		      if (const MDNode *M = I.getMetadata(LLVMContext::MD_noalias))
		        MD.insert(M);
		    }
		  }
		  addRecursiveMetadataUses();
		}

		ScopedAliasMetadataCloner::ScopedAliasMetadataCloner(const Function *F) {
		  for (const BasicBlock &BB : *F) {
		    for (const Instruction &I : BB) {
		      if (const MDNode *M = I.getMetadata(LLVMContext::MD_alias_scope))
		        MD.insert(M);
		      if (const MDNode *M = I.getMetadata(LLVMContext::MD_noalias))
		        MD.insert(M);
		    }
		  }
		  addRecursiveMetadataUses();
		}

		void ScopedAliasMetadataCloner::addRecursiveMetadataUses() {
		  SmallVector<const Metadata *, 16> Queue(MD.begin(), MD.end());
		  while (!Queue.empty()) {
		    const MDNode *M = cast<MDNode>(Queue.pop_back_val());
		    for (const Metadata *Op : M->operands())
		      if (const MDNode *OpMD = dyn_cast<MDNode>(Op))
		        if (MD.insert(OpMD))
		          Queue.push_back(OpMD);
		  }
		}

		void ScopedAliasMetadataCloner::clone() {
		  // Discard a previous clone that may exist.
		  Map.clear();

		  SmallVector<TempMDTuple, 16> DummyNodes;
		  for (const MDNode *I : MD) {
		    DummyNodes.push_back(MDTuple::getTemporary(I->getContext(), None));
		    Map[I].reset(DummyNodes.back().get());
		  }

		  // Create new metadata nodes to replace the dummy nodes, replacing old
		  // metadata references with either a dummy node or an already-created new
		  // node.
		  SmallVector<Metadata *, 4> NewOps;
		  for (const MDNode *I : MD) {
		    for (const Metadata *Op : I->operands()) {
		      if (const MDNode *M = dyn_cast<MDNode>(Op))
		        NewOps.push_back(Map[M]);
		      else
		        NewOps.push_back(const_cast<Metadata *>(Op));
		    }

		    MDNode *NewM = MDNode::get(I->getContext(), NewOps);
		    MDTuple *TempM = cast<MDTuple>(Map[I]);
		    assert(TempM->isTemporary() && "Expected temporary node");

		    TempM->replaceAllUsesWith(NewM);
		    NewOps.clear();
		  }
		}

		void ScopedAliasMetadataCloner::remap(ValueToValueMapTy &VMap) {
		  if (Map.empty())
		    return; // Nothing to do.

		  for (auto Entry : VMap) {
		    if (!Entry->second)
		      continue;

		    Instruction *I = dyn_cast<Instruction>(Entry->second);
		    if (!I)
		      continue;

		    if (MDNode *M = I->getMetadata(LLVMContext::MD_alias_scope))
		      I->setMetadata(LLVMContext::MD_alias_scope, Map[M]);

		    if (MDNode *M = I->getMetadata(LLVMContext::MD_noalias))
		      I->setMetadata(LLVMContext::MD_noalias, Map[M]);
		  }
		}
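
The two steps of the utility above (collect the transitive closure of metadata operands with a worklist, then build a structurally identical copy through placeholder nodes so that cyclic and self-referential nodes work) can be sketched without any LLVM dependency. This is only an illustrative toy under stated assumptions: `Node` and `GraphCloner` are hypothetical names, nodes hold only node operands, and ownership is handled with `std::unique_ptr` instead of LLVM's uniquing and temporary-node machinery.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an MDNode: just a list of operand nodes.
struct Node {
  std::vector<Node *> Ops;
};

class GraphCloner {
  std::vector<const Node *> Nodes;              // closure, in insertion order
  std::unordered_set<const Node *> Seen;        // membership test for Nodes
  std::unordered_map<const Node *, Node *> Map; // original -> current clone
  std::vector<std::unique_ptr<Node>> Storage;   // owns every clone ever made

  bool insert(const Node *N) {
    if (!Seen.insert(N).second)
      return false;
    Nodes.push_back(N);
    return true;
  }

public:
  // Collect everything reachable from Roots (worklist traversal).
  explicit GraphCloner(const std::vector<const Node *> &Roots) {
    std::vector<const Node *> Queue;
    for (const Node *R : Roots)
      if (insert(R))
        Queue.push_back(R);
    while (!Queue.empty()) {
      const Node *N = Queue.back();
      Queue.pop_back();
      for (const Node *Op : N->Ops)
        if (insert(Op))
          Queue.push_back(Op);
    }
  }

  // Make a fresh copy of the whole closure, discarding the previous mapping.
  void clone() {
    Map.clear();
    // Pass 1: one empty placeholder per original node.
    for (const Node *N : Nodes) {
      Storage.push_back(std::make_unique<Node>());
      Map[N] = Storage.back().get();
    }
    // Pass 2: fill in operands, redirected to the placeholders.
    for (const Node *N : Nodes)
      for (const Node *Op : N->Ops)
        Map[N]->Ops.push_back(Map.at(Op));
  }

  // Clone of N from the most recent clone() call, or nullptr if unknown.
  Node *lookup(const Node *N) const {
    auto It = Map.find(N);
    return It == Map.end() ? nullptr : It->second;
  }
};
```

Calling `clone()` repeatedly yields a new, independent copy each time, which mirrors how the utility is meant to be reused once per inlined call site or per unrolled iteration.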

llvm/lib/Transforms/Utils/InlineFunction.cpp

	Show First 20 Lines • Show All 822 Lines • ▼ Show 20 Lines
	/// this metadata needs to be cloned so that the inlined blocks			/// this metadata needs to be cloned so that the inlined blocks
	/// have different "unique scopes" at every call site. Were this not done, then			/// have different "unique scopes" at every call site. Were this not done, then
	/// aliasing scopes from a function inlined into a caller multiple times could			/// aliasing scopes from a function inlined into a caller multiple times could
	/// not be differentiated (and this would lead to miscompiles because the			/// not be differentiated (and this would lead to miscompiles because the
	/// non-aliasing property communicated by the metadata could have			/// non-aliasing property communicated by the metadata could have
	/// call-site-specific control dependencies).			/// call-site-specific control dependencies).
	static void CloneAliasScopeMetadata(CallBase &CB, ValueToValueMapTy &VMap) {			static void CloneAliasScopeMetadata(CallBase &CB, ValueToValueMapTy &VMap) {
	const Function *CalledFunc = CB.getCalledFunction();			const Function *CalledFunc = CB.getCalledFunction();
	SetVector<const MDNode *> MD;			ScopedAliasMetadataCloner Cloner(CalledFunc);
				Cloner.clone();
	// Note: We could only clone the metadata if it is already used in the			Cloner.remap(VMap);
	// caller. I'm omitting that check here because it might confuse
	// inter-procedural alias analysis passes. We can revisit this if it becomes
	// an efficiency or overhead problem.

	for (const BasicBlock &I : *CalledFunc)
	for (const Instruction &J : I) {
	if (const MDNode *M = J.getMetadata(LLVMContext::MD_alias_scope))
	MD.insert(M);
	if (const MDNode *M = J.getMetadata(LLVMContext::MD_noalias))
	MD.insert(M);
	}

	if (MD.empty())
	return;

	// Walk the existing metadata, adding the complete (perhaps cyclic) chain to
	// the set.
	SmallVector<const Metadata *, 16> Queue(MD.begin(), MD.end());
	while (!Queue.empty()) {
	const MDNode *M = cast<MDNode>(Queue.pop_back_val());
	for (unsigned i = 0, ie = M->getNumOperands(); i != ie; ++i)
	if (const MDNode *M1 = dyn_cast<MDNode>(M->getOperand(i)))
	if (MD.insert(M1))
	Queue.push_back(M1);
	}

	// Now we have a complete set of all metadata in the chains used to specify
	// the noalias scopes and the lists of those scopes.
	SmallVector<TempMDTuple, 16> DummyNodes;
	DenseMap<const MDNode *, TrackingMDNodeRef> MDMap;
	for (const MDNode *I : MD) {
	DummyNodes.push_back(MDTuple::getTemporary(CalledFunc->getContext(), None));
	MDMap[I].reset(DummyNodes.back().get());
	}

	// Create new metadata nodes to replace the dummy nodes, replacing old
	// metadata references with either a dummy node or an already-created new
	// node.
	for (const MDNode *I : MD) {
	SmallVector<Metadata *, 4> NewOps;
	for (unsigned i = 0, ie = I->getNumOperands(); i != ie; ++i) {
	const Metadata *V = I->getOperand(i);
	if (const MDNode *M = dyn_cast<MDNode>(V))
	NewOps.push_back(MDMap[M]);
	else
	NewOps.push_back(const_cast<Metadata *>(V));
	}

	MDNode *NewM = MDNode::get(CalledFunc->getContext(), NewOps);
	MDTuple *TempM = cast<MDTuple>(MDMap[I]);
	assert(TempM->isTemporary() && "Expected temporary node");

	TempM->replaceAllUsesWith(NewM);
	}

	// Now replace the metadata in the new inlined instructions with the
	// repacements from the map.
	for (ValueToValueMapTy::iterator VMI = VMap.begin(), VMIE = VMap.end();
	VMI != VMIE; ++VMI) {
	if (!VMI->second)
	continue;

	Instruction *NI = dyn_cast<Instruction>(VMI->second);
	if (!NI)
	continue;

	if (MDNode *M = NI->getMetadata(LLVMContext::MD_alias_scope))
	NI->setMetadata(LLVMContext::MD_alias_scope, MDMap[M]);

	if (MDNode *M = NI->getMetadata(LLVMContext::MD_noalias))
	NI->setMetadata(LLVMContext::MD_noalias, MDMap[M]);
	}
	}			}

	/// If the inlined function has noalias arguments,			/// If the inlined function has noalias arguments,
	/// then add new alias scopes for each noalias argument, tag the mapped noalias			/// then add new alias scopes for each noalias argument, tag the mapped noalias
	/// parameters with noalias metadata specifying the new scope, and tag all			/// parameters with noalias metadata specifying the new scope, and tag all
	/// non-derived loads, stores and memory intrinsics with the new alias scopes.			/// non-derived loads, stores and memory intrinsics with the new alias scopes.
	static void AddAliasScopeMetadata(CallBase &CB, ValueToValueMapTy &VMap,			static void AddAliasScopeMetadata(CallBase &CB, ValueToValueMapTy &VMap,
	const DataLayout &DL, AAResults *CalleeAAR) {			const DataLayout &DL, AAResults *CalleeAAR) {
	▲ Show 20 Lines • Show All 1,594 Lines • Show Last 20 Lines

llvm/lib/Transforms/Utils/LoopUnroll.cpp

Show First 20 Lines • Show All 253 Lines • ▼ Show 20 Lines
/// times the loop header executes. Note that UnrollLoop assumes that the loop		/// times the loop header executes. Note that UnrollLoop assumes that the loop
/// counter test is in LatchBlock in order to remove unnecesssary instances of		/// counter test is in LatchBlock in order to remove unnecesssary instances of
/// the test. If control can exit the loop from the LatchBlock's terminator		/// the test. If control can exit the loop from the LatchBlock's terminator
/// prior to TripCount iterations, flag PreserveCondBr needs to be set.		/// prior to TripCount iterations, flag PreserveCondBr needs to be set.
///		///
/// PreserveCondBr indicates whether the conditional branch of the LatchBlock		/// PreserveCondBr indicates whether the conditional branch of the LatchBlock
/// needs to be preserved. It is needed when we use trip count upper bound to		/// needs to be preserved. It is needed when we use trip count upper bound to
/// fully unroll the loop. If PreserveOnlyFirst is also set then only the first		/// fully unroll the loop. If PreserveOnlyFirst is also set then only the first
/// conditional branch needs to be preserved.		/// conditional branch needs to be preserved.
		jdoerfert: Nit: Don't we have a range based version of this?
///		///
/// Similarly, TripMultiple divides the number of times that the LatchBlock may		/// Similarly, TripMultiple divides the number of times that the LatchBlock may
/// execute without exiting the loop.		/// execute without exiting the loop.
///		///
/// If AllowRuntime is true then UnrollLoop will consider unrolling loops that		/// If AllowRuntime is true then UnrollLoop will consider unrolling loops that
/// have a runtime (i.e. not compile time constant) trip count. Unrolling these		/// have a runtime (i.e. not compile time constant) trip count. Unrolling these
/// loops require a unroll "prologue" that runs "RuntimeTripCount % Count"		/// loops require a unroll "prologue" that runs "RuntimeTripCount % Count"
/// iterations before branching into the unrolled loop. UnrollLoop will not		/// iterations before branching into the unrolled loop. UnrollLoop will not
/// runtime-unroll the loop if computing RuntimeTripCount will be expensive and		/// runtime-unroll the loop if computing RuntimeTripCount will be expensive and
/// AllowExpensiveTripCount is false.		/// AllowExpensiveTripCount is false.
///		///
		jdoerfert: Do you pass MD by value on purpose? Also consider passing the return value by reference, though I doubt it makes much of a difference.
/// If we want to perform PGO-based loop peeling, PeelCount is set to the		/// If we want to perform PGO-based loop peeling, PeelCount is set to the
/// number of iterations we want to peel off.		/// number of iterations we want to peel off.
///		///
/// The LoopInfo Analysis that is passed will be kept consistent.		/// The LoopInfo Analysis that is passed will be kept consistent.
///		///
/// This utility preserves LoopInfo. It will also preserve ScalarEvolution and		/// This utility preserves LoopInfo. It will also preserve ScalarEvolution and
/// DominatorTree if they are non-null.		/// DominatorTree if they are non-null.
///		///
/// If RemainderLoop is non-null, it will receive the remainder loop (if		/// If RemainderLoop is non-null, it will receive the remainder loop (if
/// required and not fully unrolled).		/// required and not fully unrolled).
LoopUnrollResult llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,		LoopUnrollResult llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,
ScalarEvolution *SE, DominatorTree *DT,		ScalarEvolution *SE, DominatorTree *DT,
		jdoerfert: Nit: hoist and use clear, according to the programmers manual (or some other document we have).
AssumptionCache *AC,		AssumptionCache *AC,
		jdoerfert: Nit: range again?
const TargetTransformInfo *TTI,		const TargetTransformInfo *TTI,
OptimizationRemarkEmitter *ORE,		OptimizationRemarkEmitter *ORE,
bool PreserveLCSSA, Loop **RemainderLoop) {		bool PreserveLCSSA, Loop **RemainderLoop) {

BasicBlock *Preheader = L->getLoopPreheader();		BasicBlock *Preheader = L->getLoopPreheader();
if (!Preheader) {		if (!Preheader) {
LLVM_DEBUG(dbgs() << " Can't unroll; loop preheader-insertion failed.\n");		LLVM_DEBUG(dbgs() << " Can't unroll; loop preheader-insertion failed.\n");
return LoopUnrollResult::Unmodified;		return LoopUnrollResult::Unmodified;
}		}

BasicBlock *LatchBlock = L->getLoopLatch();		BasicBlock *LatchBlock = L->getLoopLatch();
if (!LatchBlock) {		if (!LatchBlock) {
LLVM_DEBUG(dbgs() << " Can't unroll; loop exit-block-insertion failed.\n");		LLVM_DEBUG(dbgs() << " Can't unroll; loop exit-block-insertion failed.\n");
return LoopUnrollResult::Unmodified;		return LoopUnrollResult::Unmodified;
}		}

// Loops with indirectbr cannot be cloned.		// Loops with indirectbr cannot be cloned.
if (!L->isSafeToClone()) {		if (!L->isSafeToClone()) {
		jdoerfert: Could you add some documentation here and above please.
LLVM_DEBUG(dbgs() << " Can't unroll; Loop body cannot be cloned.\n");		LLVM_DEBUG(dbgs() << " Can't unroll; Loop body cannot be cloned.\n");
return LoopUnrollResult::Unmodified;		return LoopUnrollResult::Unmodified;
}		}
		jdoerfert: Nit: range & auto?

// The current loop unroll pass can unroll loops that have		// The current loop unroll pass can unroll loops that have
// (1) single latch; and		// (1) single latch; and
// (2a) latch is unconditional; or		// (2a) latch is unconditional; or
// (2b) latch is conditional and is an exiting block		// (2b) latch is conditional and is an exiting block
// FIXME: The implementation can be extended to work with more complicated		// FIXME: The implementation can be extended to work with more complicated
// cases, e.g. loops with multiple latches.		// cases, e.g. loops with multiple latches.
BasicBlock *Header = L->getHeader();		BasicBlock *Header = L->getHeader();
▲ Show 20 Lines • Show All 242 Lines • ▼ Show 20 Lines	LoopUnrollResult llvm::UnrollLoop(Loop L, UnrollLoopOptions ULO, LoopInfo LI,
DFS.perform(LI);		DFS.perform(LI);

// Stash the DFS iterators before adding blocks to the loop.		// Stash the DFS iterators before adding blocks to the loop.
LoopBlocksDFS::RPOIterator BlockBegin = DFS.beginRPO();		LoopBlocksDFS::RPOIterator BlockBegin = DFS.beginRPO();
LoopBlocksDFS::RPOIterator BlockEnd = DFS.endRPO();		LoopBlocksDFS::RPOIterator BlockEnd = DFS.endRPO();

std::vector<BasicBlock*> UnrolledLoopBlocks = L->getBlocks();		std::vector<BasicBlock*> UnrolledLoopBlocks = L->getBlocks();

		// Scoped noalias metadata in the loop might only apply within one iteration,
		// in which case we need to use independent noalias metadata for each
		// unrolled iteration. We currently have no way to distinguish where the
		// metadata applies only within one iteration, or across them, so we have to
		// make the conservative assumption here.
		ScopedAliasMetadataCloner AliasMetadataCloner(UnrolledLoopBlocks);

// Loop Unrolling might create new loops. While we do preserve LoopInfo, we		// Loop Unrolling might create new loops. While we do preserve LoopInfo, we
// might break loop-simplified form for these loops (as they, e.g., would		// might break loop-simplified form for these loops (as they, e.g., would
// share the same exit blocks). We'll keep track of loops for which we can		// share the same exit blocks). We'll keep track of loops for which we can
// break this so that later we can re-simplify them.		// break this so that later we can re-simplify them.
SmallSetVector<Loop *, 4> LoopsToSimplify;		SmallSetVector<Loop *, 4> LoopsToSimplify;
for (Loop *SubLoop : *L)		for (Loop *SubLoop : *L)
LoopsToSimplify.insert(SubLoop);		LoopsToSimplify.insert(SubLoop);

Show All 11 Lines	for (BasicBlock *BB : L->getBlocks())
<< DIL->getFilename() << " Line: " << DIL->getLine());		<< DIL->getFilename() << " Line: " << DIL->getLine());
}		}

for (unsigned It = 1; It != ULO.Count; ++It) {		for (unsigned It = 1; It != ULO.Count; ++It) {
SmallVector<BasicBlock *, 8> NewBlocks;		SmallVector<BasicBlock *, 8> NewBlocks;
SmallDenseMap<const Loop *, Loop *, 4> NewLoops;		SmallDenseMap<const Loop *, Loop *, 4> NewLoops;
NewLoops[L] = L;		NewLoops[L] = L;

		// Use new scoped noalias metadata for each iteration.
		AliasMetadataCloner.clone();

for (LoopBlocksDFS::RPOIterator BB = BlockBegin; BB != BlockEnd; ++BB) {		for (LoopBlocksDFS::RPOIterator BB = BlockBegin; BB != BlockEnd; ++BB) {
ValueToValueMapTy VMap;		ValueToValueMapTy VMap;
BasicBlock *New = CloneBasicBlock(*BB, VMap, "." + Twine(It));		BasicBlock *New = CloneBasicBlock(*BB, VMap, "." + Twine(It));
Header->getParent()->getBasicBlockList().push_back(New);		Header->getParent()->getBasicBlockList().push_back(New);

assert((*BB != Header || LI->getLoopFor(*BB) == L) &&		assert((*BB != Header || LI->getLoopFor(*BB) == L) &&
"Header should not be in a sub-loop");		"Header should not be in a sub-loop");
// Tell LI about New.		// Tell LI about New.
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	for (LoopBlocksDFS::RPOIterator BB = BlockBegin; BB != BlockEnd; ++BB) {
else {		else {
auto BBDomNode = DT->getNode(*BB);		auto BBDomNode = DT->getNode(*BB);
auto BBIDom = BBDomNode->getIDom();		auto BBIDom = BBDomNode->getIDom();
BasicBlock *OriginalBBIDom = BBIDom->getBlock();		BasicBlock *OriginalBBIDom = BBIDom->getBlock();
DT->addNewBlock(		DT->addNewBlock(
New, cast<BasicBlock>(LastValueMap[cast<Value>(OriginalBBIDom)]));		New, cast<BasicBlock>(LastValueMap[cast<Value>(OriginalBBIDom)]));
}		}
}		}

		AliasMetadataCloner.remap(VMap);
}		}

// Remap all instructions in the most recent iteration		// Remap all instructions in the most recent iteration
remapInstructionsInBlocks(NewBlocks, LastValueMap);		remapInstructionsInBlocks(NewBlocks, LastValueMap);
for (BasicBlock *NewBlock : NewBlocks) {		for (BasicBlock *NewBlock : NewBlocks) {
for (Instruction &I : *NewBlock) {		for (Instruction &I : *NewBlock) {
if (auto *II = dyn_cast<IntrinsicInst>(&I))		if (auto *II = dyn_cast<IntrinsicInst>(&I))
if (II->getIntrinsicID() == Intrinsic::assume)		if (II->getIntrinsicID() == Intrinsic::assume)
▲ Show 20 Lines • Show All 284 Lines • Show Last 20 Lines
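
The call pattern in the hunks above (one cloner built from the original loop blocks, `clone()` once per unrolled iteration, `remap()` once per cloned block) can be illustrated with a small standalone model. This is a sketch under simplifying assumptions, not the patch itself: scopes are integers, `Inst`, `Block`, `ScopeRenamer` and `unroll` are hypothetical names, and fresh scope ids are handed out from a caller-supplied counter.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

struct Inst {
  int Scope; // stands in for the !alias.scope / !noalias operand
};
using Block = std::vector<Inst>;

class ScopeRenamer {
  std::set<int> Originals; // scopes used anywhere in the original loop
  std::map<int, int> Map;  // original scope -> scope for current iteration
  int NextFresh;

public:
  ScopeRenamer(const std::vector<Block> &Loop, int FirstFresh)
      : NextFresh(FirstFresh) {
    for (const Block &B : Loop)
      for (const Inst &I : B)
        Originals.insert(I.Scope);
  }

  // Start a new set of scopes; later remap() calls all use this set.
  void clone() {
    Map.clear();
    for (int S : Originals)
      Map[S] = NextFresh++;
  }

  // Rewrite the scopes of one cloned block using the current set.
  void remap(Block &B) const {
    for (Inst &I : B) {
      auto It = Map.find(I.Scope);
      if (It != Map.end())
        I.Scope = It->second;
    }
  }
};

// Returns Count copies of the loop body; copy 0 keeps the original scopes.
std::vector<std::vector<Block>> unroll(const std::vector<Block> &Loop,
                                       unsigned Count, int FirstFresh) {
  std::vector<std::vector<Block>> Iterations;
  Iterations.push_back(Loop);
  ScopeRenamer Renamer(Loop, FirstFresh);
  for (unsigned It = 1; It < Count; ++It) {
    Renamer.clone(); // once per iteration, outside the block loop
    std::vector<Block> Copy;
    for (const Block &B : Loop) {
      Block New = B;
      Renamer.remap(New); // once per cloned block
      Copy.push_back(New);
    }
    Iterations.push_back(Copy);
  }
  return Iterations;
}
```

Because `clone()` sits outside the per-block loop, all blocks of one unrolled iteration share the same fresh scopes, while different iterations never share a scope.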

llvm/test/Transforms/LoopUnroll/noalias.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -S -loop-unroll -unroll-count=4 < %s | FileCheck %s

				Meinersbur: For robustness, the test case should specify the unroll factor, e.g. `-unroll-count=4`
				define void @pr39282(i32* %addr1, i32* %addr2) {
				; CHECK-LABEL: @pr39282(
				; CHECK-NEXT: start:
				; CHECK-NEXT: br label [[BODY:%.*]]
				; CHECK: body:
				; CHECK-NEXT: [[X:%.*]] = load i32, i32* [[ADDR1:%.*]], align 4, !alias.scope !0
				; CHECK-NEXT: store i32 [[X]], i32* [[ADDR2:%.*]], align 4, !noalias !0
				; CHECK-NEXT: [[ADDR1I_1:%.*]] = getelementptr inbounds i32, i32* [[ADDR1]], i32 1
				; CHECK-NEXT: [[ADDR2I_1:%.*]] = getelementptr inbounds i32, i32* [[ADDR2]], i32 1
				; CHECK-NEXT: [[X_1:%.*]] = load i32, i32* [[ADDR1I_1]], align 4, !alias.scope !3
				; CHECK-NEXT: store i32 [[X_1]], i32* [[ADDR2I_1]], align 4, !noalias !3
				; CHECK-NEXT: [[X_2:%.*]] = load i32, i32* [[ADDR1]], align 4, !alias.scope !6
				; CHECK-NEXT: store i32 [[X_2]], i32* [[ADDR2]], align 4, !noalias !6
				; CHECK-NEXT: [[ADDR1I_3:%.*]] = getelementptr inbounds i32, i32* [[ADDR1]], i32 1
				; CHECK-NEXT: [[ADDR2I_3:%.*]] = getelementptr inbounds i32, i32* [[ADDR2]], i32 1
				; CHECK-NEXT: [[X_3:%.*]] = load i32, i32* [[ADDR1I_3]], align 4, !alias.scope !9
				; CHECK-NEXT: store i32 [[X_3]], i32* [[ADDR2I_3]], align 4, !noalias !9
				; CHECK-NEXT: ret void
				;
				start:
				br label %body

				body:
				%i = phi i32 [ 0, %start ], [ %i2, %body ]
				%j = and i32 %i, 1
				%addr1i = getelementptr inbounds i32, i32* %addr1, i32 %j
				%addr2i = getelementptr inbounds i32, i32* %addr2, i32 %j

				%x = load i32, i32* %addr1i, !alias.scope !2
				store i32 %x, i32* %addr2i, !noalias !2

				%i2 = add i32 %i, 1
				%cmp = icmp slt i32 %i2, 4
				br i1 %cmp, label %body, label %end

				end:
				ret void
				}

				!0 = distinct !{!0}
				!1 = distinct !{!1, !0}
				!2 = !{!1}

				; CHECK: !0 = !{!1}
				; CHECK: !1 = distinct !{!1, !2}
				; CHECK: !2 = distinct !{!2}
				; CHECK: !3 = !{!4}
				; CHECK: !4 = distinct !{!4, !5}
				; CHECK: !5 = distinct !{!5}
				; CHECK: !6 = !{!7}
				; CHECK: !7 = distinct !{!7, !8}
				; CHECK: !8 = distinct !{!8}
				; CHECK: !9 = !{!10}
				; CHECK: !10 = distinct !{!10, !11}
				; CHECK: !11 = distinct !{!11}

llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll

	Show First 20 Lines • Show All 43 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: [[TMP6:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]			; CHECK-NEXT: [[TMP6:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]
	; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]			; CHECK-NEXT: [[TMP7:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[INDEX_NEXT_3:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH_NEW]] ], [ [[INDEX_NEXT_3:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[NITER:%.*]] = phi i64 [ [[UNROLL_ITER]], [[VECTOR_PH_NEW]] ], [ [[NITER_NSUB_3:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[NITER:%.*]] = phi i64 [ [[UNROLL_ITER]], [[VECTOR_PH_NEW]] ], [ [[NITER_NSUB_3:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP8:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX]]
	; CHECK-NEXT: [[TMP9:%.*]] = bitcast double* [[TMP8]] to <4 x double>*			; CHECK-NEXT: [[TMP9:%.*]] = bitcast double* [[TMP8]] to <4 x double>*
	; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x double>, <4 x double>* [[TMP9]], align 8, !tbaa !3, !alias.scope !7			; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x double>, <4 x double>* [[TMP9]], align 8, [[TBAA3:!tbaa !.*]], !alias.scope !7
	; CHECK-NEXT: [[TMP10:%.*]] = fmul fast <4 x double> [[WIDE_LOAD]], [[TMP4]]			; CHECK-NEXT: [[TMP10:%.*]] = fmul fast <4 x double> [[WIDE_LOAD]], [[TMP4]]
	; CHECK-NEXT: [[TMP11:%.]] = getelementptr inbounds double, double [[X]], i64 [[INDEX]]			; CHECK-NEXT: [[TMP11:%.]] = getelementptr inbounds double, double [[X]], i64 [[INDEX]]
	; CHECK-NEXT: [[TMP12:%.]] = bitcast double [[TMP11]] to <4 x double>*			; CHECK-NEXT: [[TMP12:%.]] = bitcast double [[TMP11]] to <4 x double>*
	; CHECK-NEXT: store <4 x double> [[TMP10]], <4 x double>* [[TMP12]], align 8, !tbaa !3, !alias.scope !10, !noalias !7			; CHECK-NEXT: store <4 x double> [[TMP10]], <4 x double>* [[TMP12]], align 8, [[TBAA3]], !alias.scope !10, !noalias !7
	; CHECK-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 4			; CHECK-NEXT: [[INDEX_NEXT:%.*]] = or i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT]]			; CHECK-NEXT: [[TMP13:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT]]
	; CHECK-NEXT: [[TMP14:%.*]] = bitcast double* [[TMP13]] to <4 x double>*			; CHECK-NEXT: [[TMP14:%.*]] = bitcast double* [[TMP13]] to <4 x double>*
	; CHECK-NEXT: [[WIDE_LOAD_1:%.*]] = load <4 x double>, <4 x double>* [[TMP14]], align 8, !tbaa !3, !alias.scope !7			; CHECK-NEXT: [[WIDE_LOAD_1:%.*]] = load <4 x double>, <4 x double>* [[TMP14]], align 8, [[TBAA3]], !alias.scope !12
	; CHECK-NEXT: [[TMP15:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_1]], [[TMP5]]			; CHECK-NEXT: [[TMP15:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_1]], [[TMP5]]
	; CHECK-NEXT: [[TMP16:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT]]			; CHECK-NEXT: [[TMP16:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT]]
	; CHECK-NEXT: [[TMP17:%.*]] = bitcast double* [[TMP16]] to <4 x double>*			; CHECK-NEXT: [[TMP17:%.*]] = bitcast double* [[TMP16]] to <4 x double>*
	; CHECK-NEXT: store <4 x double> [[TMP15]], <4 x double>* [[TMP17]], align 8, !tbaa !3, !alias.scope !10, !noalias !7			; CHECK-NEXT: store <4 x double> [[TMP15]], <4 x double>* [[TMP17]], align 8, [[TBAA3]], !alias.scope !15, !noalias !12
	; CHECK-NEXT: [[INDEX_NEXT_1:%.*]] = or i64 [[INDEX]], 8			; CHECK-NEXT: [[INDEX_NEXT_1:%.*]] = or i64 [[INDEX]], 8
	; CHECK-NEXT: [[TMP18:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT_1]]			; CHECK-NEXT: [[TMP18:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT_1]]
	; CHECK-NEXT: [[TMP19:%.*]] = bitcast double* [[TMP18]] to <4 x double>*			; CHECK-NEXT: [[TMP19:%.*]] = bitcast double* [[TMP18]] to <4 x double>*
	; CHECK-NEXT: [[WIDE_LOAD_2:%.*]] = load <4 x double>, <4 x double>* [[TMP19]], align 8, !tbaa !3, !alias.scope !7			; CHECK-NEXT: [[WIDE_LOAD_2:%.*]] = load <4 x double>, <4 x double>* [[TMP19]], align 8, [[TBAA3]], !alias.scope !17
	; CHECK-NEXT: [[TMP20:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_2]], [[TMP6]]			; CHECK-NEXT: [[TMP20:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_2]], [[TMP6]]
	; CHECK-NEXT: [[TMP21:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT_1]]			; CHECK-NEXT: [[TMP21:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT_1]]
	; CHECK-NEXT: [[TMP22:%.*]] = bitcast double* [[TMP21]] to <4 x double>*			; CHECK-NEXT: [[TMP22:%.*]] = bitcast double* [[TMP21]] to <4 x double>*
	; CHECK-NEXT: store <4 x double> [[TMP20]], <4 x double>* [[TMP22]], align 8, !tbaa !3, !alias.scope !10, !noalias !7			; CHECK-NEXT: store <4 x double> [[TMP20]], <4 x double>* [[TMP22]], align 8, [[TBAA3]], !alias.scope !20, !noalias !17
	; CHECK-NEXT: [[INDEX_NEXT_2:%.*]] = or i64 [[INDEX]], 12			; CHECK-NEXT: [[INDEX_NEXT_2:%.*]] = or i64 [[INDEX]], 12
	; CHECK-NEXT: [[TMP23:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT_2]]			; CHECK-NEXT: [[TMP23:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_NEXT_2]]
	; CHECK-NEXT: [[TMP24:%.*]] = bitcast double* [[TMP23]] to <4 x double>*			; CHECK-NEXT: [[TMP24:%.*]] = bitcast double* [[TMP23]] to <4 x double>*
	; CHECK-NEXT: [[WIDE_LOAD_3:%.*]] = load <4 x double>, <4 x double>* [[TMP24]], align 8, !tbaa !3, !alias.scope !7			; CHECK-NEXT: [[WIDE_LOAD_3:%.*]] = load <4 x double>, <4 x double>* [[TMP24]], align 8, [[TBAA3]], !alias.scope !22
	; CHECK-NEXT: [[TMP25:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_3]], [[TMP7]]			; CHECK-NEXT: [[TMP25:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_3]], [[TMP7]]
	; CHECK-NEXT: [[TMP26:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT_2]]			; CHECK-NEXT: [[TMP26:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_NEXT_2]]
	; CHECK-NEXT: [[TMP27:%.*]] = bitcast double* [[TMP26]] to <4 x double>*			; CHECK-NEXT: [[TMP27:%.*]] = bitcast double* [[TMP26]] to <4 x double>*
	; CHECK-NEXT: store <4 x double> [[TMP25]], <4 x double>* [[TMP27]], align 8, !tbaa !3, !alias.scope !10, !noalias !7			; CHECK-NEXT: store <4 x double> [[TMP25]], <4 x double>* [[TMP27]], align 8, [[TBAA3]], !alias.scope !25, !noalias !22
	; CHECK-NEXT: [[INDEX_NEXT_3]] = add i64 [[INDEX]], 16			; CHECK-NEXT: [[INDEX_NEXT_3]] = add i64 [[INDEX]], 16
	; CHECK-NEXT: [[NITER_NSUB_3]] = add i64 [[NITER]], -4			; CHECK-NEXT: [[NITER_NSUB_3]] = add i64 [[NITER]], -4
	; CHECK-NEXT: [[NITER_NCMP_3:%.*]] = icmp eq i64 [[NITER_NSUB_3]], 0			; CHECK-NEXT: [[NITER_NCMP_3:%.*]] = icmp eq i64 [[NITER_NSUB_3]], 0
	; CHECK-NEXT: br i1 [[NITER_NCMP_3]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], !llvm.loop !12			; CHECK-NEXT: br i1 [[NITER_NCMP_3]], label [[MIDDLE_BLOCK_UNR_LCSSA]], label [[VECTOR_BODY]], [[LOOP27:!llvm.loop !.*]]
	; CHECK: middle.block.unr-lcssa:			; CHECK: middle.block.unr-lcssa:
	; CHECK-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_3]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX_UNR:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT_3]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[LCMP_MOD:%.*]] = icmp eq i64 [[XTRAITER]], 0			; CHECK-NEXT: [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[XTRAITER]], 0
	; CHECK-NEXT: br i1 [[LCMP_MOD]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY_EPIL_PREHEADER:%.*]]			; CHECK-NEXT: br i1 [[LCMP_MOD_NOT]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY_EPIL_PREHEADER:%.*]]
	; CHECK: vector.body.epil.preheader:			; CHECK: vector.body.epil.preheader:
	; CHECK-NEXT: [[TMP28:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]			; CHECK-NEXT: [[TMP28:%.*]] = fdiv fast <4 x double> <double 1.000000e+00, double 1.000000e+00, double 1.000000e+00, double 1.000000e+00>, [[BROADCAST_SPLAT]]
	; CHECK-NEXT: br label [[VECTOR_BODY_EPIL:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY_EPIL:%.*]]
	; CHECK: vector.body.epil:			; CHECK: vector.body.epil:
	; CHECK-NEXT: [[INDEX_EPIL:%.*]] = phi i64 [ [[INDEX_UNR]], [[VECTOR_BODY_EPIL_PREHEADER]] ], [ [[INDEX_NEXT_EPIL:%.*]], [[VECTOR_BODY_EPIL]] ]			; CHECK-NEXT: [[INDEX_EPIL:%.*]] = phi i64 [ [[INDEX_UNR]], [[VECTOR_BODY_EPIL_PREHEADER]] ], [ [[INDEX_NEXT_EPIL:%.*]], [[VECTOR_BODY_EPIL]] ]
	; CHECK-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[XTRAITER]], [[VECTOR_BODY_EPIL_PREHEADER]] ], [ [[EPIL_ITER_SUB:%.*]], [[VECTOR_BODY_EPIL]] ]			; CHECK-NEXT: [[EPIL_ITER:%.*]] = phi i64 [ [[XTRAITER]], [[VECTOR_BODY_EPIL_PREHEADER]] ], [ [[EPIL_ITER_SUB:%.*]], [[VECTOR_BODY_EPIL]] ]
	; CHECK-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_EPIL]]			; CHECK-NEXT: [[TMP29:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDEX_EPIL]]
	; CHECK-NEXT: [[TMP30:%.*]] = bitcast double* [[TMP29]] to <4 x double>*			; CHECK-NEXT: [[TMP30:%.*]] = bitcast double* [[TMP29]] to <4 x double>*
	; CHECK-NEXT: [[WIDE_LOAD_EPIL:%.*]] = load <4 x double>, <4 x double>* [[TMP30]], align 8, !tbaa !3, !alias.scope !7			; CHECK-NEXT: [[WIDE_LOAD_EPIL:%.*]] = load <4 x double>, <4 x double>* [[TMP30]], align 8, [[TBAA3]], !alias.scope !7
	; CHECK-NEXT: [[TMP31:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_EPIL]], [[TMP28]]			; CHECK-NEXT: [[TMP31:%.*]] = fmul fast <4 x double> [[WIDE_LOAD_EPIL]], [[TMP28]]
	; CHECK-NEXT: [[TMP32:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_EPIL]]			; CHECK-NEXT: [[TMP32:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDEX_EPIL]]
	; CHECK-NEXT: [[TMP33:%.*]] = bitcast double* [[TMP32]] to <4 x double>*			; CHECK-NEXT: [[TMP33:%.*]] = bitcast double* [[TMP32]] to <4 x double>*
	; CHECK-NEXT: store <4 x double> [[TMP31]], <4 x double>* [[TMP33]], align 8, !tbaa !3, !alias.scope !10, !noalias !7			; CHECK-NEXT: store <4 x double> [[TMP31]], <4 x double>* [[TMP33]], align 8, [[TBAA3]], !alias.scope !10, !noalias !7
	; CHECK-NEXT: [[INDEX_NEXT_EPIL]] = add i64 [[INDEX_EPIL]], 4			; CHECK-NEXT: [[INDEX_NEXT_EPIL]] = add i64 [[INDEX_EPIL]], 4
	; CHECK-NEXT: [[EPIL_ITER_SUB]] = add i64 [[EPIL_ITER]], -1			; CHECK-NEXT: [[EPIL_ITER_SUB]] = add i64 [[EPIL_ITER]], -1
	; CHECK-NEXT: [[EPIL_ITER_CMP:%.*]] = icmp eq i64 [[EPIL_ITER_SUB]], 0			; CHECK-NEXT: [[EPIL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[EPIL_ITER_SUB]], 0
	; CHECK-NEXT: br i1 [[EPIL_ITER_CMP]], label [[MIDDLE_BLOCK]], label [[VECTOR_BODY_EPIL]], !llvm.loop !14			; CHECK-NEXT: br i1 [[EPIL_ITER_CMP_NOT]], label [[MIDDLE_BLOCK]], label [[VECTOR_BODY_EPIL]], [[LOOP29:!llvm.loop !.*]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY_PREHEADER]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY_PREHEADER]]
	; CHECK: for.body.preheader:			; CHECK: for.body.preheader:
	; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[FOR_BODY_LR_PH]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[INDVARS_IV_PH:%.*]] = phi i64 [ 0, [[VECTOR_MEMCHECK]] ], [ 0, [[FOR_BODY_LR_PH]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: [[TMP34:%.*]] = xor i64 [[INDVARS_IV_PH]], -1			; CHECK-NEXT: [[TMP34:%.*]] = xor i64 [[INDVARS_IV_PH]], -1
	; CHECK-NEXT: [[TMP35:%.*]] = add nsw i64 [[TMP34]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[TMP35:%.*]] = add nsw i64 [[TMP34]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: [[XTRAITER8:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 3			; CHECK-NEXT: [[XTRAITER8:%.*]] = and i64 [[WIDE_TRIP_COUNT]], 3
	; CHECK-NEXT: [[LCMP_MOD9:%.*]] = icmp eq i64 [[XTRAITER8]], 0			; CHECK-NEXT: [[LCMP_MOD9_NOT:%.*]] = icmp eq i64 [[XTRAITER8]], 0
	; CHECK-NEXT: br i1 [[LCMP_MOD9]], label [[FOR_BODY_PROL_LOOPEXIT:%.*]], label [[FOR_BODY_PROL_PREHEADER:%.*]]			; CHECK-NEXT: br i1 [[LCMP_MOD9_NOT]], label [[FOR_BODY_PROL_LOOPEXIT:%.*]], label [[FOR_BODY_PROL_PREHEADER:%.*]]
	; CHECK: for.body.prol.preheader:			; CHECK: for.body.prol.preheader:
	; CHECK-NEXT: [[TMP36:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP36:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: br label [[FOR_BODY_PROL:%.*]]			; CHECK-NEXT: br label [[FOR_BODY_PROL:%.*]]
	; CHECK: for.body.prol:			; CHECK: for.body.prol:
	; CHECK-NEXT: [[INDVARS_IV_PROL:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_PROL:%.*]], [[FOR_BODY_PROL]] ], [ [[INDVARS_IV_PH]], [[FOR_BODY_PROL_PREHEADER]] ]			; CHECK-NEXT: [[INDVARS_IV_PROL:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_PROL:%.*]], [[FOR_BODY_PROL]] ], [ [[INDVARS_IV_PH]], [[FOR_BODY_PROL_PREHEADER]] ]
	; CHECK-NEXT: [[PROL_ITER:%.*]] = phi i64 [ [[PROL_ITER_SUB:%.*]], [[FOR_BODY_PROL]] ], [ [[XTRAITER8]], [[FOR_BODY_PROL_PREHEADER]] ]			; CHECK-NEXT: [[PROL_ITER:%.*]] = phi i64 [ [[PROL_ITER_SUB:%.*]], [[FOR_BODY_PROL]] ], [ [[XTRAITER8]], [[FOR_BODY_PROL_PREHEADER]] ]
	; CHECK-NEXT: [[ARRAYIDX_PROL:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_PROL]]			; CHECK-NEXT: [[ARRAYIDX_PROL:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_PROL]]
	; CHECK-NEXT: [[T0_PROL:%.*]] = load double, double* [[ARRAYIDX_PROL]], align 8, !tbaa !3			; CHECK-NEXT: [[T0_PROL:%.*]] = load double, double* [[ARRAYIDX_PROL]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[TMP37:%.*]] = fmul fast double [[T0_PROL]], [[TMP36]]			; CHECK-NEXT: [[TMP37:%.*]] = fmul fast double [[T0_PROL]], [[TMP36]]
	; CHECK-NEXT: [[ARRAYIDX2_PROL:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_PROL]]			; CHECK-NEXT: [[ARRAYIDX2_PROL:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_PROL]]
	; CHECK-NEXT: store double [[TMP37]], double* [[ARRAYIDX2_PROL]], align 8, !tbaa !3			; CHECK-NEXT: store double [[TMP37]], double* [[ARRAYIDX2_PROL]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_PROL]] = add nuw nsw i64 [[INDVARS_IV_PROL]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT_PROL]] = add nuw nsw i64 [[INDVARS_IV_PROL]], 1
	; CHECK-NEXT: [[PROL_ITER_SUB]] = add i64 [[PROL_ITER]], -1			; CHECK-NEXT: [[PROL_ITER_SUB]] = add i64 [[PROL_ITER]], -1
	; CHECK-NEXT: [[PROL_ITER_CMP:%.*]] = icmp eq i64 [[PROL_ITER_SUB]], 0			; CHECK-NEXT: [[PROL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[PROL_ITER_SUB]], 0
	; CHECK-NEXT: br i1 [[PROL_ITER_CMP]], label [[FOR_BODY_PROL_LOOPEXIT]], label [[FOR_BODY_PROL]], !llvm.loop !16			; CHECK-NEXT: br i1 [[PROL_ITER_CMP_NOT]], label [[FOR_BODY_PROL_LOOPEXIT]], label [[FOR_BODY_PROL]], [[LOOP31:!llvm.loop !.*]]
	; CHECK: for.body.prol.loopexit:			; CHECK: for.body.prol.loopexit:
	; CHECK-NEXT: [[INDVARS_IV_UNR:%.*]] = phi i64 [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER]] ], [ [[INDVARS_IV_NEXT_PROL]], [[FOR_BODY_PROL]] ]			; CHECK-NEXT: [[INDVARS_IV_UNR:%.*]] = phi i64 [ [[INDVARS_IV_PH]], [[FOR_BODY_PREHEADER]] ], [ [[INDVARS_IV_NEXT_PROL]], [[FOR_BODY_PROL]] ]
	; CHECK-NEXT: [[TMP38:%.*]] = icmp ult i64 [[TMP35]], 3			; CHECK-NEXT: [[TMP38:%.*]] = icmp ult i64 [[TMP35]], 3
	; CHECK-NEXT: br i1 [[TMP38]], label [[FOR_END]], label [[FOR_BODY_PREHEADER_NEW:%.*]]			; CHECK-NEXT: br i1 [[TMP38]], label [[FOR_END]], label [[FOR_BODY_PREHEADER_NEW:%.*]]
	; CHECK: for.body.preheader.new:			; CHECK: for.body.preheader.new:
	; CHECK-NEXT: [[TMP39:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP39:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP40:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP40:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP41:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP41:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: [[TMP42:%.*]] = fdiv fast double 1.000000e+00, [[A]]			; CHECK-NEXT: [[TMP42:%.*]] = fdiv fast double 1.000000e+00, [[A]]
	; CHECK-NEXT: br label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_BODY:%.*]]
	; CHECK: for.body:			; CHECK: for.body:
	; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_UNR]], [[FOR_BODY_PREHEADER_NEW]] ], [ [[INDVARS_IV_NEXT_3:%.*]], [[FOR_BODY]] ]			; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_UNR]], [[FOR_BODY_PREHEADER_NEW]] ], [ [[INDVARS_IV_NEXT_3:%.*]], [[FOR_BODY]] ]
	; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: [[T0:%.*]] = load double, double* [[ARRAYIDX]], align 8, !tbaa !3			; CHECK-NEXT: [[T0:%.*]] = load double, double* [[ARRAYIDX]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[TMP43:%.*]] = fmul fast double [[T0]], [[TMP39]]			; CHECK-NEXT: [[TMP43:%.*]] = fmul fast double [[T0]], [[TMP39]]
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV]]			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV]]
	; CHECK-NEXT: store double [[TMP43]], double* [[ARRAYIDX2]], align 8, !tbaa !3			; CHECK-NEXT: store double [[TMP43]], double* [[ARRAYIDX2]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 1			; CHECK-NEXT: [[INDVARS_IV_NEXT:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 1
	; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT]]			; CHECK-NEXT: [[ARRAYIDX_1:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT]]
	; CHECK-NEXT: [[T0_1:%.*]] = load double, double* [[ARRAYIDX_1]], align 8, !tbaa !3			; CHECK-NEXT: [[T0_1:%.*]] = load double, double* [[ARRAYIDX_1]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[TMP44:%.*]] = fmul fast double [[T0_1]], [[TMP40]]			; CHECK-NEXT: [[TMP44:%.*]] = fmul fast double [[T0_1]], [[TMP40]]
	; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT]]			; CHECK-NEXT: [[ARRAYIDX2_1:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT]]
	; CHECK-NEXT: store double [[TMP44]], double* [[ARRAYIDX2_1]], align 8, !tbaa !3			; CHECK-NEXT: store double [[TMP44]], double* [[ARRAYIDX2_1]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_1:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 2			; CHECK-NEXT: [[INDVARS_IV_NEXT_1:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 2
	; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT_1]]			; CHECK-NEXT: [[ARRAYIDX_2:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT_1]]
	; CHECK-NEXT: [[T0_2:%.*]] = load double, double* [[ARRAYIDX_2]], align 8, !tbaa !3			; CHECK-NEXT: [[T0_2:%.*]] = load double, double* [[ARRAYIDX_2]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[TMP45:%.*]] = fmul fast double [[T0_2]], [[TMP41]]			; CHECK-NEXT: [[TMP45:%.*]] = fmul fast double [[T0_2]], [[TMP41]]
	; CHECK-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT_1]]			; CHECK-NEXT: [[ARRAYIDX2_2:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT_1]]
	; CHECK-NEXT: store double [[TMP45]], double* [[ARRAYIDX2_2]], align 8, !tbaa !3			; CHECK-NEXT: store double [[TMP45]], double* [[ARRAYIDX2_2]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_2:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 3			; CHECK-NEXT: [[INDVARS_IV_NEXT_2:%.*]] = add nuw nsw i64 [[INDVARS_IV]], 3
	; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT_2]]			; CHECK-NEXT: [[ARRAYIDX_3:%.*]] = getelementptr inbounds double, double* [[Y]], i64 [[INDVARS_IV_NEXT_2]]
	; CHECK-NEXT: [[T0_3:%.*]] = load double, double* [[ARRAYIDX_3]], align 8, !tbaa !3			; CHECK-NEXT: [[T0_3:%.*]] = load double, double* [[ARRAYIDX_3]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[TMP46:%.*]] = fmul fast double [[T0_3]], [[TMP42]]			; CHECK-NEXT: [[TMP46:%.*]] = fmul fast double [[T0_3]], [[TMP42]]
	; CHECK-NEXT: [[ARRAYIDX2_3:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT_2]]			; CHECK-NEXT: [[ARRAYIDX2_3:%.*]] = getelementptr inbounds double, double* [[X]], i64 [[INDVARS_IV_NEXT_2]]
	; CHECK-NEXT: store double [[TMP46]], double* [[ARRAYIDX2_3]], align 8, !tbaa !3			; CHECK-NEXT: store double [[TMP46]], double* [[ARRAYIDX2_3]], align 8, [[TBAA3]]
	; CHECK-NEXT: [[INDVARS_IV_NEXT_3]] = add nuw nsw i64 [[INDVARS_IV]], 4			; CHECK-NEXT: [[INDVARS_IV_NEXT_3]] = add nuw nsw i64 [[INDVARS_IV]], 4
	; CHECK-NEXT: [[EXITCOND_3:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_3]], [[WIDE_TRIP_COUNT]]			; CHECK-NEXT: [[EXITCOND_NOT_3:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT_3]], [[WIDE_TRIP_COUNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_3]], label [[FOR_END]], label [[FOR_BODY]], !llvm.loop !17			; CHECK-NEXT: br i1 [[EXITCOND_NOT_3]], label [[FOR_END]], label [[FOR_BODY]], [[LOOP32:!llvm.loop !.*]]
	; CHECK: for.end:			; CHECK: for.end:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	%div = fdiv fast double 1.0, %a			%div = fdiv fast double 1.0, %a
	br label %for.cond			br label %for.cond

	for.cond:			for.cond:

llvm/test/Transforms/PhaseOrdering/pr39282.ll


	; Consider that %addr1 = %addr2 + 1, in which case %addr2i and %addr1i are			; Consider that %addr1 = %addr2 + 1, in which case %addr2i and %addr1i are
	; noalias within one iteration, but may alias across iterations.			; noalias within one iteration, but may alias across iterations.
	; TODO: This is a miscompile.			; TODO: This is a miscompile.
	define void @pr39282(i32* %addr1, i32* %addr2) {			define void @pr39282(i32* %addr1, i32* %addr2) {
	; CHECK-LABEL: @pr39282(			; CHECK-LABEL: @pr39282(
	; CHECK-NEXT: start:			; CHECK-NEXT: start:
	; CHECK-NEXT: [[X_I:%.*]] = load i32, i32* [[ADDR1:%.*]], align 4, !alias.scope !0, !noalias !3			; CHECK-NEXT: [[X_I:%.*]] = load i32, i32* [[ADDR1:%.*]], align 4, !alias.scope !0, !noalias !3
				; CHECK-NEXT: store i32 [[X_I]], i32* [[ADDR2:%.*]], align 4, !alias.scope !3, !noalias !0
	; CHECK-NEXT: [[ADDR1I_1:%.*]] = getelementptr inbounds i32, i32* [[ADDR1]], i64 1			; CHECK-NEXT: [[ADDR1I_1:%.*]] = getelementptr inbounds i32, i32* [[ADDR1]], i64 1
	; CHECK-NEXT: [[ADDR2I_1:%.*]] = getelementptr inbounds i32, i32* [[ADDR2:%.*]], i64 1			; CHECK-NEXT: [[ADDR2I_1:%.*]] = getelementptr inbounds i32, i32* [[ADDR2]], i64 1
	; CHECK-NEXT: [[X_I_1:%.*]] = load i32, i32* [[ADDR1I_1]], align 4, !alias.scope !0, !noalias !3			; CHECK-NEXT: [[X_I_1:%.*]] = load i32, i32* [[ADDR1I_1]], align 4, !alias.scope !5, !noalias !8
	; CHECK-NEXT: store i32 [[X_I]], i32* [[ADDR2]], align 4, !alias.scope !3, !noalias !0			; CHECK-NEXT: store i32 [[X_I_1]], i32* [[ADDR2I_1]], align 4, !alias.scope !8, !noalias !5
	; CHECK-NEXT: store i32 [[X_I_1]], i32* [[ADDR2I_1]], align 4, !alias.scope !3, !noalias !0			; CHECK-NEXT: [[X_I_2:%.*]] = load i32, i32* [[ADDR1]], align 4, !alias.scope !10, !noalias !13
				; CHECK-NEXT: store i32 [[X_I_2]], i32* [[ADDR2]], align 4, !alias.scope !13, !noalias !10
				; CHECK-NEXT: [[X_I_3:%.*]] = load i32, i32* [[ADDR1I_1]], align 4, !alias.scope !15, !noalias !18
				; CHECK-NEXT: store i32 [[X_I_3]], i32* [[ADDR2I_1]], align 4, !alias.scope !18, !noalias !15
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	start:			start:
	br label %body			br label %body

	body:			body:
	%i = phi i32 [ 0, %start ], [ %i.next, %body ]			%i = phi i32 [ 0, %start ], [ %i.next, %body ]
	%j = and i32 %i, 1			%j = and i32 %i, 1
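The comment at the top of pr39282.ll can be made concrete with a small C++ model. This is only an illustrative sketch and not part of the patch: the function names, the `Buf` type, the sample values, and the trip count of 4 (read off the four load/store pairs in the updated CHECK lines) are all assumptions. `copyLoop` performs the load and the store in every iteration. `copyLoopSharedScopes` mimics the old CHECK lines, where the loads of later iterations were folded into the first two because every unrolled iteration reused the same scopes.

```cpp
#include <array>

// Hypothetical model of the loop in pr39282.ll; j alternates 0, 1, 0, 1.
// The trip count of 4 is an assumption, not taken from the hidden IR.
void copyLoop(int *addr1, int *addr2) {
  for (int i = 0; i < 4; ++i) {
    int j = i & 1;     // corresponds to %j = and i32 %i, 1
    int x = addr1[j];  // load through %addr1i
    addr2[j] = x;      // store through %addr2i
  }
}

// Effect of treating the scopes as valid across iterations: the loads of
// iterations 2 and 3 are assumed equal to those of iterations 0 and 1,
// so only two loads and two stores remain.
void copyLoopSharedScopes(int *addr1, int *addr2) {
  int x0 = addr1[0];
  int x1 = addr1[1];
  addr2[0] = x0;
  addr2[1] = x1;
}

using Buf = std::array<int, 3>;

// Overlapping layout from the test comment: addr1 == addr2 + 1.
Buf runCorrect(Buf buf) {
  copyLoop(buf.data() + 1, buf.data());
  return buf;
}

Buf runSharedScopes(Buf buf) {
  copyLoopSharedScopes(buf.data() + 1, buf.data());
  return buf;
}
```

Starting from `{1, 2, 3}`, `runCorrect` ends with `{3, 3, 3}` while `runSharedScopes` ends with `{2, 3, 3}`. The store in iteration 1 writes the location that iteration 2 loads from, which is exactly the cross-iteration dependence that per-iteration scopes are meant to preserve.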
