Details

Reviewers

Commits

rGdc73a6edde86: Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail"
rG94b8e2ea4ec9: [MemCpyOpt] memset->memcpy forwarding with undef tail
rL349078: Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail"
rL348645: [MemCpyOpt] memset->memcpy forwarding with undef tail

Summary

Currently memcpyopt optimizes cases like

memset(a, byte, N);
memcpy(b, a, M);
// to
memset(a, byte, N);
memset(b, byte, M);

if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely.

This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start.
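The size rule described above can be sketched as a small standalone model. This is only an illustration, not the pass's code: `forwardedMemsetSize` and its parameters are hypothetical names, and `tailIsUndef` stands in for whatever analysis proves that the bytes past the memset are undef.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of the size check; none of these names are LLVM APIs.
// Returns the length of the memset that may replace the memcpy, or
// std::nullopt when forwarding is not legal.
std::optional<std::uint64_t> forwardedMemsetSize(std::uint64_t memsetSize,
                                                 std::uint64_t copySize,
                                                 bool tailIsUndef) {
  // M <= N: every copied byte was written by the memset.
  if (copySize <= memsetSize)
    return copySize;
  // M > N: legal only if the bytes past the memset are undef. They may then
  // be left unwritten, so the new memset is clamped to N.
  if (tailIsUndef)
    return memsetSize;
  return std::nullopt;
}
```

With a 12-byte memset and a 16-byte memcpy, the model yields a 12-byte memset when the tail is undef and no transformation otherwise.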

This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844.

For the implementation, I'm reusing a bit of code for a similar existing optimization (direct memcpy of undef).
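As a rough standalone model of the shared check (assuming the defining instruction is reduced to a kind plus an optional lifetime size), the rule is: a fresh alloca always has undef contents, and a lifetime.start only does if it covers at least the requested size. `DefKind`, `DefModel` and `hasUndefContentsModel` are hypothetical names used for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the defining instruction; not LLVM types.
enum class DefKind { Alloca, LifetimeStart, Other };

struct DefModel {
  DefKind kind;
  std::uint64_t lifetimeSize; // only meaningful for LifetimeStart
};

// Returns true if `size` bytes following the definition are known to be undef.
bool hasUndefContentsModel(const DefModel &def, std::uint64_t size) {
  switch (def.kind) {
  case DefKind::Alloca:
    return true; // freshly allocated stack memory
  case DefKind::LifetimeStart:
    return def.lifetimeSize >= size; // lifetime must cover the whole range
  case DefKind::Other:
    return false;
  }
  return false;
}
```

A 16-byte lifetime.start covers a 16-byte copy, while a 12-byte one does not, which is the distinction the `test_copy_larger_than_lifetime_size` test exercises.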

Diff Detail

Event Timeline

nikic created this revision. Nov 30 2018, 5:18 AM

Herald added subscribers: llvm-commits, JDevlieghere. Nov 30 2018, 5:18 AM

An alternative way to handle this would be to tell call slot optimization that if the call happens to be a memset, it does not need to check that the destination is non-trapping. This would drop the memcpy entirely right away. However, it is also less general, because it will not handle cases where the source value is still used afterwards (the malloc/free example here).

jrmuizel added a subscriber: jrmuizel. Nov 30 2018, 7:56 AM

rkruppe added a subscriber: rkruppe. Nov 30 2018, 2:10 PM

efriedma added inline comments. Dec 4 2018, 1:08 PM

lib/Analysis/MemoryDependenceAnalysis.cpp
179	This probably makes sense, but it looks like it'll impact other transforms; would it be possible to test separately? Or does it actually not have any impact on other transforms for some reason? Looks fine otherwise.

nikic marked an inline comment as done. Dec 6 2018, 4:27 AM

nikic added inline comments.

lib/Analysis/MemoryDependenceAnalysis.cpp
179	It seems that it indeed does not impact other transforms, at least not in any way that I can see. The main consumer of `GetLocation()` is `getDependency()`, for which: GVN only deals with loads; DSE has its own mechanism for determining location; MemCpyOpt does not call getDependency on memsets (prior to this change). The other consumer is `getCallSiteDependencyFrom()`, where previously memset would have been treated as a normal call-site. However, in both cases (known location or call-site) the result will ultimately be determined by AA. What I'm unsure about is whether this is correct wrt handling of volatile. I see in the code above that an ordered (and non-monotonic) StoreInst will return ModRef with empty location. Maybe memset needs the same treatment for volatile?

efriedma added inline comments. Dec 6 2018, 11:13 AM

lib/Analysis/MemoryDependenceAnalysis.cpp
179	We don't really use volatile memsets anywhere important, but yes, probably best to be conservative and treat them as ModRef.

Conservatively return ModRef for volatile memsets. Add a few more tests.

Forwarding happens for the case of volatile memset + non-volatile memcpy and doesn't for non-volatile memset + volatile memcpy, which I *think* is correct. (In either case this is existing behavior, this patch doesn't change it.)
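The two volatile-related behaviors discussed here can be summarized in a small standalone model. This is a sketch under stated assumptions: `classifyMemSet` mirrors the ModRef choice made for memsets in `GetLocation()`, while `mayForward` merely restates the observed behavior from the tests (only a volatile memcpy blocks forwarding). Both function names and the local `ModRefInfo` enum are hypothetical stand-ins, not LLVM declarations.

```cpp
// Local stand-in for LLVM's ModRefInfo; declared here so the sketch is
// self-contained.
enum class ModRefInfo { NoModRef, Ref, Mod, ModRef };

// A volatile memset is conservatively treated as both reading and writing;
// a plain memset only writes its destination.
ModRefInfo classifyMemSet(bool isVolatile) {
  return isVolatile ? ModRefInfo::ModRef : ModRefInfo::Mod;
}

// Observed forwarding behavior: the memset's volatility does not matter,
// but a volatile memcpy is never rewritten.
bool mayForward(bool memsetIsVolatile, bool memcpyIsVolatile) {
  (void)memsetIsVolatile; // intentionally unused: it does not affect the result
  return !memcpyIsVolatile;
}
```

This matches the `test_volatile_memset` (optimized) and `test_volatile_memcpy` (not optimized) cases in the test file.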

LGTM

This revision is now accepted and ready to land. Dec 7 2018, 12:50 PM

Closed by commit rL348645: [MemCpyOpt] memset->memcpy forwarding with undef tail (authored by nikic). Dec 7 2018, 1:19 PM

This revision was automatically updated to reflect the committed changes.

nikic mentioned this in rL348644: [MemCpyOpt] Add tests for memset->memcpy forwaring with undef tail; NFC.

Heads-up that this causes crashes during static initialization in one binary in the chrome/win build: https://bugs.chromium.org/p/chromium/issues/detail?id=913423. I don't have a reduced repro yet, but my feeling so far is that this will be a compiler bug, not a code bug. (Nothing to do here until I have a repro.)

Reopening, as this was reverted by rL349002, with a reduced test case showing miscompilation at http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610520.html.

This revision is now accepted and ready to land. Dec 13 2018, 3:27 AM

nikic planned changes to this revision. Dec 13 2018, 4:27 AM

Fix memory dependence query. We need to query at the location of the memset, but with the size of the memcpy. Otherwise we miss possible writes in the region between the end of the memset and the end of the memcpy. Ideally we'd only query for locations in that region, but as there is no easy way to do so we just use the whole memcpy source region.

Add additional tests with writes prior to the memset: within the memset region (legal to optimize, but we currently don't), after the memset region, or overlapping both (the latter two are not legal to optimize).
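A minimal sketch of why the query size matters, under the simplifying assumption that earlier writes are modeled as byte ranges relative to the memset destination. `Write` and `hasWriteInRegion` are hypothetical names, not LLVM APIs; the real check goes through MemoryDependenceAnalysis and alias analysis.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model: a write covering [offset, offset + size) bytes,
// measured from the memset destination pointer.
struct Write {
  std::uint64_t offset;
  std::uint64_t size;
};

// Returns true if any earlier write overlaps the queried region [0, querySize).
bool hasWriteInRegion(const std::vector<Write> &priorWrites,
                      std::uint64_t querySize) {
  for (const Write &w : priorWrites)
    if (w.size != 0 && w.offset < querySize)
      return true;
  return false;
}
```

With a 12-byte memset and a 16-byte memcpy, a 4-byte write at offset 12 is missed when querying with the memset size (12) and found when querying with the memcpy size (16). A write at offset 0 is also reported with the larger query, which corresponds to the conservative "legal to optimize, but we currently don't" case.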

This revision is now accepted and ready to land. Dec 13 2018, 9:00 AM

Fix typos in comment.

LGTM. Sorry, I should have considered the dependency check a little more carefully the first time around.

Closed by commit rL349078: Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail" (authored by nikic). Dec 13 2018, 12:08 PM

This revision was automatically updated to reflect the committed changes.

Diff 177213

lib/Analysis/MemoryDependenceAnalysis.cpp

static ModRefInfo GetLocation(const Instruction *Inst, MemoryLocation &Loc,
   }

   if (const CallInst *CI = isFreeCall(Inst, &TLI)) {
     // calls to free() deallocate the entire structure
     Loc = MemoryLocation(CI->getArgOperand(0));
     return ModRefInfo::Mod;
   }

+  if (const MemSetInst *MI = dyn_cast<MemSetInst>(Inst)) {
+    Loc = MemoryLocation::getForDest(MI);
+    // Conservatively assume ModRef for volatile memset.
+    return MI->isVolatile() ? ModRefInfo::ModRef : ModRefInfo::Mod;
+  }
+
   if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {
     switch (II->getIntrinsicID()) {
     case Intrinsic::lifetime_start:
     case Intrinsic::lifetime_end:
     case Intrinsic::invariant_start:
       Loc = MemoryLocation::getForArgument(II, 1, TLI);
       // These intrinsics don't really modify the memory, but returning Mod
       // will allow them to be handled conservatively.
       return ModRefInfo::Mod;
     case Intrinsic::invariant_end:
       Loc = MemoryLocation::getForArgument(II, 2, TLI);
       // These intrinsics don't really modify the memory, but returning Mod
       // will allow them to be handled conservatively.
       return ModRefInfo::Mod;
     default:
       break;
     }
   }

   // Otherwise, just do the coarse-grained thing that always works.
   if (Inst->mayWriteToMemory())
     return ModRefInfo::ModRef;
   if (Inst->mayReadFromMemory())
     return ModRefInfo::Ref;
   return ModRefInfo::NoModRef;

lib/Transforms/Scalar/MemCpyOptimizer.cpp

bool MemCpyOptPass::processMemSetMemCpyDependence(MemCpyInst *MemCpy,
   Builder.CreateMemSet(Builder.CreateGEP(Dest, SrcSize), MemSet->getOperand(1),
                        MemsetLen, Align);

   MD->removeInstruction(MemSet);
   MemSet->eraseFromParent();
   return true;
 }

+/// Determine whether the instruction has undefined content for the given Size,
+/// either because it was freshly alloca'd or started its lifetime.
+static bool hasUndefContents(Instruction *I, ConstantInt *Size) {
+  if (isa<AllocaInst>(I))
+    return true;
+
+  if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I))
+    if (II->getIntrinsicID() == Intrinsic::lifetime_start)
+      if (ConstantInt *LTSize = dyn_cast<ConstantInt>(II->getArgOperand(0)))
+        if (LTSize->getZExtValue() >= Size->getZExtValue())
+          return true;
+
+  return false;
+}
+
 /// Transform memcpy to memset when its source was just memset.
 /// In other words, turn:
 /// \code
 /// memset(dst1, c, dst1_size);
 /// memcpy(dst2, dst1, dst2_size);
 /// \endcode
 /// into:
 /// \code
 /// memset(dst1, c, dst1_size);
 /// memset(dst2, c, dst2_size);
 /// \endcode
 /// When dst2_size <= dst1_size.
 ///
 /// The \p MemCpy must have a Constant length.
 bool MemCpyOptPass::performMemCpyToMemSetOptzn(MemCpyInst *MemCpy,
                                                MemSetInst *MemSet) {
   AliasAnalysis &AA = LookupAliasAnalysis();

   // Make sure that memcpy(..., memset(...), ...), that is we are memsetting and
   // memcpying from the same address. Otherwise it is hard to reason about.
   if (!AA.isMustAlias(MemSet->getRawDest(), MemCpy->getRawSource()))
     return false;

-  ConstantInt *CopySize = cast<ConstantInt>(MemCpy->getLength());
+  // A known memset size is required.
   ConstantInt *MemSetSize = dyn_cast<ConstantInt>(MemSet->getLength());
+  if (!MemSetSize)
+    return false;

   // Make sure the memcpy doesn't read any more than what the memset wrote.
   // Don't worry about sizes larger than i64.
-  if (!MemSetSize || CopySize->getZExtValue() > MemSetSize->getZExtValue())
+  ConstantInt *CopySize = cast<ConstantInt>(MemCpy->getLength());
+  if (CopySize->getZExtValue() > MemSetSize->getZExtValue()) {
+    // If the memcpy is larger than the memset, but the memory was undef prior
+    // to the memset, we can just ignore the tail.
+    MemDepResult DepInfo = MD->getDependency(MemSet);
+    if (DepInfo.isDef() && hasUndefContents(DepInfo.getInst(), CopySize))
+      CopySize = MemSetSize;
+    else
       return false;
+  }

   IRBuilder<> Builder(MemCpy);
   Builder.CreateMemSet(MemCpy->getRawDest(), MemSet->getOperand(1),
                        CopySize, MemCpy->getDestAlignment());
   return true;
 }

 /// Perform simplification of memcpy's. If we have memcpy A

bool MemCpyOptPass::processMemCpy(MemCpyInst *M) {
   MemoryLocation SrcLoc = MemoryLocation::getForSource(M);
   MemDepResult SrcDepInfo = MD->getPointerDependencyFrom(
       SrcLoc, true, M->getIterator(), M->getParent());

   if (SrcDepInfo.isClobber()) {
     if (MemCpyInst *MDep = dyn_cast<MemCpyInst>(SrcDepInfo.getInst()))
       return processMemCpyMemCpyDependence(M, MDep);
   } else if (SrcDepInfo.isDef()) {
-    Instruction *I = SrcDepInfo.getInst();
-    bool hasUndefContents = false;
-
-    if (isa<AllocaInst>(I)) {
-      hasUndefContents = true;
-    } else if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
-      if (II->getIntrinsicID() == Intrinsic::lifetime_start)
-        if (ConstantInt *LTSize = dyn_cast<ConstantInt>(II->getArgOperand(0)))
-          if (LTSize->getZExtValue() >= CopySize->getZExtValue())
-            hasUndefContents = true;
-    }
-
-    if (hasUndefContents) {
+    if (hasUndefContents(SrcDepInfo.getInst(), CopySize)) {
       MD->removeInstruction(M);
       M->eraseFromParent();
       ++NumMemCpyInstr;
       return true;
     }
   }

   if (SrcDepInfo.isClobber())

test/Transforms/MemCpyOpt/memset-memcpy-oversized.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -memcpyopt -S %s | FileCheck %s

				; memset -> memcpy forwarding, if memcpy is larger than memset, but trailing
				; bytes are known to be undef.


				%T = type { i64, i32, i32 }

				define void @test_alloca(i8* %result) {
				; CHECK-LABEL: @test_alloca(
				; CHECK-NEXT: [[A:%.*]] = alloca [[T:%.*]], align 8
				; CHECK-NEXT: [[B:%.*]] = bitcast %T* [[A]] to i8*
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[B]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* [[RESULT:%.*]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: ret void
				;
				%a = alloca %T, align 8
				%b = bitcast %T* %a to i8*
				call void @llvm.memset.p0i8.i64(i8* align 8 %b, i8 0, i64 12, i1 false)
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %b, i64 16, i1 false)
				ret void
				}

				define void @test_alloca_with_lifetimes(i8* %result) {
				; CHECK-LABEL: @test_alloca_with_lifetimes(
				; CHECK-NEXT: [[A:%.*]] = alloca [[T:%.*]], align 8
				; CHECK-NEXT: [[B:%.*]] = bitcast %T* [[A]] to i8*
				; CHECK-NEXT: call void @llvm.lifetime.start.p0i8(i64 16, i8* [[B]])
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[B]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* [[RESULT:%.*]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 16, i8* [[B]])
				; CHECK-NEXT: ret void
				;
				%a = alloca %T, align 8
				%b = bitcast %T* %a to i8*
				call void @llvm.lifetime.start.p0i8(i64 16, i8* %b)
				call void @llvm.memset.p0i8.i64(i8* align 8 %b, i8 0, i64 12, i1 false)
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %b, i64 16, i1 false)
				call void @llvm.lifetime.end.p0i8(i64 16, i8* %b)
				ret void
				}

				define void @test_malloc_with_lifetimes(i8* %result) {
				; CHECK-LABEL: @test_malloc_with_lifetimes(
				; CHECK-NEXT: [[A:%.*]] = call i8* @malloc(i64 16)
				; CHECK-NEXT: call void @llvm.lifetime.start.p0i8(i64 16, i8* [[A]])
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[A]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* [[RESULT:%.*]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 16, i8* [[A]])
				; CHECK-NEXT: call void @free(i8* [[A]])
				; CHECK-NEXT: ret void
				;
				%a = call i8* @malloc(i64 16)
				call void @llvm.lifetime.start.p0i8(i64 16, i8* %a)
				call void @llvm.memset.p0i8.i64(i8* align 8 %a, i8 0, i64 12, i1 false)
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %a, i64 16, i1 false)
				call void @llvm.lifetime.end.p0i8(i64 16, i8* %a)
				call void @free(i8* %a)
				ret void
				}

				; memcpy size is larger than lifetime, don't optimize.
				define void @test_copy_larger_than_lifetime_size(i8* %result) {
				; CHECK-LABEL: @test_copy_larger_than_lifetime_size(
				; CHECK-NEXT: [[A:%.*]] = call i8* @malloc(i64 16)
				; CHECK-NEXT: call void @llvm.lifetime.start.p0i8(i64 12, i8* [[A]])
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[A]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* [[RESULT:%.*]], i8* align 8 [[A]], i64 16, i1 false)
				; CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 12, i8* [[A]])
				; CHECK-NEXT: call void @free(i8* [[A]])
				; CHECK-NEXT: ret void
				;
				%a = call i8* @malloc(i64 16)
				call void @llvm.lifetime.start.p0i8(i64 12, i8* %a)
				call void @llvm.memset.p0i8.i64(i8* align 8 %a, i8 0, i64 12, i1 false)
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %a, i64 16, i1 false)
				call void @llvm.lifetime.end.p0i8(i64 12, i8* %a)
				call void @free(i8* %a)
				ret void
				}

				; The trailing bytes are not known to be undef, we can't ignore them.
				define void @test_not_undef_memory(i8* %result, i8* %input) {
				; CHECK-LABEL: @test_not_undef_memory(
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[INPUT:%.*]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* [[RESULT:%.*]], i8* align 8 [[INPUT]], i64 16, i1 false)
				; CHECK-NEXT: ret void
				;
				call void @llvm.memset.p0i8.i64(i8* align 8 %input, i8 0, i64 12, i1 false)
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %input, i64 16, i1 false)
				ret void
				}

				; Memset is volatile, memcpy is not. Can be optimized.
				define void @test_volatile_memset(i8* %result) {
				; CHECK-LABEL: @test_volatile_memset(
				; CHECK-NEXT: [[A:%.*]] = alloca [[T:%.*]], align 8
				; CHECK-NEXT: [[B:%.*]] = bitcast %T* [[A]] to i8*
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[B]], i8 0, i64 12, i1 true)
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* [[RESULT:%.*]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: ret void
				;
				%a = alloca %T, align 8
				%b = bitcast %T* %a to i8*
				call void @llvm.memset.p0i8.i64(i8* align 8 %b, i8 0, i64 12, i1 true)
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %b, i64 16, i1 false)
				ret void
				}

				; Memcpy is volatile, memset is not. Cannot be optimized.
				define void @test_volatile_memcpy(i8* %result) {
				; CHECK-LABEL: @test_volatile_memcpy(
				; CHECK-NEXT: [[A:%.*]] = alloca [[T:%.*]], align 8
				; CHECK-NEXT: [[B:%.*]] = bitcast %T* [[A]] to i8*
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[B]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* [[RESULT:%.*]], i8* align 8 [[B]], i64 16, i1 true)
				; CHECK-NEXT: ret void
				;
				%a = alloca %T, align 8
				%b = bitcast %T* %a to i8*
				call void @llvm.memset.p0i8.i64(i8* align 8 %b, i8 0, i64 12, i1 false)
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %b, i64 16, i1 true)
				ret void
				}

				; Write between memset and memcpy, can't optimize.
				define void @test_write_between(i8* %result) {
				; CHECK-LABEL: @test_write_between(
				; CHECK-NEXT: [[A:%.*]] = alloca [[T:%.*]], align 8
				; CHECK-NEXT: [[B:%.*]] = bitcast %T* [[A]] to i8*
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 8 [[B]], i8 0, i64 12, i1 false)
				; CHECK-NEXT: store i8 -1, i8* [[B]]
				; CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* [[RESULT:%.*]], i8* align 8 [[B]], i64 16, i1 false)
				; CHECK-NEXT: ret void
				;
				%a = alloca %T, align 8
				%b = bitcast %T* %a to i8*
				call void @llvm.memset.p0i8.i64(i8* align 8 %b, i8 0, i64 12, i1 false)
				store i8 -1, i8* %b
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %result, i8* align 8 %b, i64 16, i1 false)
				ret void
				}

				declare i8* @malloc(i64)
				declare void @free(i8*)

				declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i1)
				declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i1)

				declare void @llvm.lifetime.start.p0i8(i64, i8* nocapture)
				declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture)

This is an archive of the discontinued LLVM Phabricator instance.

[MemCpyOpt] memset->memcpy forwarding with undef tail
ClosedPublic
