This is an archive of the discontinued LLVM Phabricator instance.

[MemCpyOpt] Optimize MemoryDef insertion
Closed, Public

Authored by nikic on Aug 7 2021, 2:16 PM.

Details

Reviewers

asbirlea
aeubanks
george.burgess.iv

Commits

rG17db125b487f: [MemCpyOpt] Optimize MemoryDef insertion

Summary

When converting a store into a memset, we currently insert the new MemoryDef after the store MemoryDef, which requires all uses to be renamed to the new def using a whole block scan. Instead, we can insert the new MemoryDef before the store and not rename uses, because we know that the location is immediately overwritten, so all uses should still refer to the old MemoryDef. Those uses will get renamed when the old MemoryDef is actually dropped, which is efficient.

I expect something similar can be done for some of the other MSSA updates in MemCpyOpt. This may be an alternative to D107513, at least for this particular case.
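
To make the difference between the two insertion strategies concrete, here is a toy Python model of a single block's access list. It is an illustration only, not LLVM's MemorySSA implementation: the `Access` class and the `insert_after`, `insert_before`, `erase`, and `build_block` helpers are hypothetical names invented for this sketch, and the `visited` counter stands in for the block scan that renaming uses requires.

```python
class Access:
    """Toy stand-in for a memory access (hypothetical, not LLVM's class)."""

    def __init__(self, name, defining=None):
        self.name = name
        self.defining = None
        self.users = []  # accesses whose defining access is this one
        self.set_defining(defining)

    def set_defining(self, new_def):
        # Keep the old and new definition's user lists in sync.
        if self.defining is not None:
            self.defining.users.remove(self)
        self.defining = new_def
        if new_def is not None:
            new_def.users.append(self)


def build_block(n_later):
    """A block with an entry def, one store, and n_later accesses after it."""
    entry = Access("liveOnEntry")
    store = Access("store", defining=entry)
    later = [Access("load%d" % i, defining=store) for i in range(n_later)]
    return [entry, store] + later, store, later


def insert_after(block, store, name):
    """Old approach: the new def follows the store, so every later access
    in the block is visited and renamed if it referred to the store."""
    new = Access(name, defining=store)
    pos = block.index(store) + 1
    block.insert(pos, new)
    visited = 0
    for acc in block[pos + 1:]:
        visited += 1
        if acc.defining is store:
            acc.set_defining(new)
    return new, visited


def insert_before(block, store, name):
    """New approach: the new def precedes the store; only the store itself
    is repointed, and no later access is visited."""
    new = Access(name, defining=store.defining)
    store.set_defining(new)
    block.insert(block.index(store), new)
    return new, 0


def erase(block, acc):
    """Dropping a def hands its users to its own defining access; the cost
    depends on the number of users, not on the size of the block."""
    for user in list(acc.users):
        user.set_defining(acc.defining)
    acc.set_defining(None)
    block.remove(acc)


if __name__ == "__main__":
    for insert in (insert_after, insert_before):
        block, store, later = build_block(3)
        memset, visited = insert(block, store, "memset")
        erase(block, store)
        print(insert.__name__, [a.name for a in block], "visited:", visited)
```

In this model both strategies end in the same state once the store is erased (the later accesses all refer to the memset); they differ only in how many accesses had to be visited along the way.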

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

nikic created this revision. Aug 7 2021, 2:16 PM

Herald added a subscriber: hiraditya. Aug 7 2021, 2:16 PM

nikic requested review of this revision. Aug 7 2021, 2:16 PM

Herald added a project: Restricted Project. Aug 7 2021, 2:16 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B118522: Diff 364985. Aug 7 2021, 3:06 PM

This does make MemCpyOpt no longer insanely slow for large functions.

import sys

# Usage: python a.py <count> <output.ll>
count = int(sys.argv[1])

with open(sys.argv[2], 'w') as f:
    # One zero-initialized global per store.
    for i in range(count):
        f.write('@g{} = global [2 x i64] zeroinitializer\n'.format(i))
    # A single function that stores an aggregate zero to every global.
    f.write('define void @a() {\n')
    for i in range(count):
        f.write('  store [2 x i64] zeroinitializer, [2 x i64]* @g{}\n'.format(i))
    f.write('  ret void\n')
    f.write('}\n')

$ python /tmp/a.py 100000 /tmp/a.ll
$ opt -passes=memcpyopt -disable-output /tmp/a.ll

For a function with 100000 stores, it previously took 88s; with this patch it takes 1s.
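
As a rough back-of-envelope check on why the gap is so large, the sketch below counts visited accesses under a simplifying assumption that is not taken from the patch: when uses are renamed, each converted store triggers one scan of a block holding about n accesses, and when they are not, each converted store costs a constant amount of work. The function names are hypothetical, and the counts are not measurements of opt.

```python
def visits_with_rename(n_stores):
    """Assumed cost model: one whole-block scan (n accesses) per converted store."""
    return n_stores * n_stores


def visits_without_rename(n_stores):
    """Assumed cost model: constant work per converted store."""
    return n_stores


n = 100000
ratio = visits_with_rename(n) // visits_without_rename(n)
print(visits_with_rename(n), visits_without_rename(n), ratio)
```

Under this model the work drops from quadratic to linear in the number of stores. That matches the direction of the measured change, though the model makes no attempt to predict the 88s and 1s timings themselves.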

aeubanks mentioned this in D107513: [MemCpyOpt/MemorySSA] Do not run the pass for prohibitively large number of memory accesses. Aug 9 2021, 4:41 PM

Thanks, this LGTM, and it does resolve the efficiency case.

This revision is now accepted and ready to land. Aug 9 2021, 5:27 PM

Closed by commit rG17db125b487f: [MemCpyOpt] Optimize MemoryDef insertion (authored by nikic). Aug 10 2021, 12:28 PM

This revision was automatically updated to reflect the committed changes.

nikic added a commit: rG17db125b487f: [MemCpyOpt] Optimize MemoryDef insertion.

nikic mentioned this in D117926: [SLP] Optionally preserve MemorySSA. Jan 21 2022, 2:57 PM

Revision Contents

Path: llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
Size: 11 lines

Diff 365584

llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp

(Surrounding lines of the file are not shown; enclosing scope: if (Value *ByteVal = isBytewiseValue(V, DL)) {)

     if (T->isAggregateType()) {
       uint64_t Size = DL.getTypeStoreSize(T);
       IRBuilder<> Builder(SI);
       auto *M = Builder.CreateMemSet(SI->getPointerOperand(), ByteVal, Size,
                                      SI->getAlign());
 
       LLVM_DEBUG(dbgs() << "Promoting " << *SI << " to " << *M << "\n");
 
-      assert(isa<MemoryDef>(MSSAU->getMemorySSA()->getMemoryAccess(SI)));
-      auto *LastDef =
-          cast<MemoryDef>(MSSAU->getMemorySSA()->getMemoryAccess(SI));
-      auto *NewAccess = MSSAU->createMemoryAccessAfter(M, LastDef, LastDef);
-      MSSAU->insertDef(cast<MemoryDef>(NewAccess), /*RenameUses=*/true);
+      // The newly inserted memset is immediately overwritten by the original
+      // store, so we do not need to rename uses.
+      auto *StoreDef = cast<MemoryDef>(MSSA->getMemoryAccess(SI));
+      auto *NewAccess = MSSAU->createMemoryAccessBefore(
+          M, StoreDef->getDefiningAccess(), StoreDef);
+      MSSAU->insertDef(cast<MemoryDef>(NewAccess), /*RenameUses=*/false);
 
       eraseInstruction(SI);
       NumMemSetInfer++;
 
       // Make sure we do not invalidate the iterator.
       BBI = M->getIterator();
       return true;
     }