This is a straightforward port of MemCpyOpt to MemorySSA following the approach of D26739. MemDep queries are replaced with MSSA queries without changing the overall structure of the pass. Some care has to be taken to account for differences between these APIs (MemDep also returns reads, MSSA doesn't).
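For illustration, here is a minimal sketch (a hypothetical helper, not code from the patch) of what the replaced query shape looks like on the MemorySSA side, assuming the walker API as it exists at the time of this change:

```cpp
#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/MemorySSA.h"
#include "llvm/IR/IntrinsicInst.h"
using namespace llvm;

// Find the instruction that last wrote the memory read by MCpy's source, if
// there is a single such write. Where MemDep could also hand back a read as
// the dependency, the MSSA walker only ever returns clobbering writes
// (MemoryDefs) or MemoryPhis, so no filtering of reads is needed here.
static Instruction *findSourceClobber(MemorySSA &MSSA, MemCpyInst *MCpy) {
  MemoryUseOrDef *Access = MSSA.getMemoryAccess(MCpy);
  MemoryAccess *Clobber = MSSA.getWalker()->getClobberingMemoryAccess(
      Access, MemoryLocation::getForSource(MCpy));
  if (auto *Def = dyn_cast<MemoryDef>(Clobber))
    if (!MSSA.isLiveOnEntryDef(Def))
      return Def->getMemoryInst();
  return nullptr; // Live-on-entry or a MemoryPhi: no single clobbering write.
}
```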
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:358: What would be a good way to do that?
Some preliminary compile-time numbers: https://llvm-compile-time-tracker.com/index.php?branch=nikic/perf/memcpy-mssa The last commits are, from bottom to top: enabling MSSA DSE, enabling MSSA MemCpyOpt, and moving one MemCpyOpt pass next to a DSE pass to avoid an additional computation of MSSA. The last two commits together actually end up being a mild compile-time improvement, so at least this looks viable.
CMakeFiles/7zip-benchmark.dir/CPP/7zip/Compress/ShrinkDecoder.cpp.o 4KiB 4KiB (+0.44%)
Small codesize regression?
Thank you for working on this; the compile times look very promising and the test results are great!
Since this is using the MSSA walker API with a MemoryLocation (which does not cache), could this become costly in a pathologically constructed example? If so, could we have a (reasonably large) upper bound on the number of getClobbering(Access, Location) calls to avoid such cases?
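For reference, a small sketch of the two walker overloads in question (the wrapper function and variable names are just for illustration); only the access-based one caches its result:

```cpp
#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/MemorySSA.h"
using namespace llvm;

void walkerOverloads(MemorySSA &MSSA, MemoryUseOrDef *MA,
                     const MemoryLocation &Loc) {
  MemorySSAWalker *Walker = MSSA.getWalker();
  // Access-based query: the result is remembered as the access's "optimized"
  // clobber, so repeating the query is cheap.
  MemoryAccess *Cached = Walker->getClobberingMemoryAccess(MA);
  // Location-based query: walks upwards from MA for an arbitrary location and
  // caches nothing, so every call redoes the alias queries.
  MemoryAccess *Uncached = Walker->getClobberingMemoryAccess(MA, Loc);
  (void)Cached;
  (void)Uncached;
}
```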
llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp:358: There isn't a way to do that currently.
I've started implementing this (by returning a DoNothingMemorySSAWalker when a limit is hit), but ended up wondering whether it really makes sense: I believe the number of MSSA walker queries we do in this pass is roughly proportional to the number of memsets/memcpys in the IR. For very big functions we may end up performing many queries, but the overall runtime should still be linear, since each individual query is bounded by the walker limit. Having a limit would make sense to me if something here had quadratic runtime (e.g. the AST limit in LICM), but I don't think that's the case here.
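For the record, a rough sketch of the kind of per-function budget that was considered here (the flag name and helper are hypothetical, not part of the patch); past the budget it simply returns the defining access, which is effectively what a DoNothingMemorySSAWalker does:

```cpp
#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/MemorySSA.h"
#include "llvm/Support/CommandLine.h"
using namespace llvm;

// Hypothetical cap on the number of location-based walker queries per function.
static cl::opt<unsigned> MemCpyOptMSSAQueryLimit(
    "memcpyopt-mssa-query-limit", cl::init(1000), cl::Hidden,
    cl::desc("Maximum number of MemorySSA walker queries per function"));

static MemoryAccess *queryClobber(MemorySSA &MSSA, MemoryUseOrDef *MA,
                                  const MemoryLocation &Loc,
                                  unsigned &QueriesUsed) {
  // Once the budget is exhausted, stop walking and return the defining access
  // unchanged (conservative: the caller sees an unanalyzed clobber and skips
  // the transform).
  if (QueriesUsed >= MemCpyOptMSSAQueryLimit)
    return MA->getDefiningAccess();
  ++QueriesUsed;
  return MSSA.getWalker()->getClobberingMemoryAccess(MA, Loc);
}
```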
Ok, let's leave this as is and revisit if we encounter a pathological case where the cost of N memset/memcpy queries is too high and we need to limit it.