This is an archive of the discontinued LLVM Phabricator instance.


Differential D78932

[DSE,MSSA] Relax post-dom restriction for objs visible after return.
ClosedPublic

Authored by fhahn on Apr 27 2020, 7:56 AM.


Details

Reviewers

dmgreen
bryant
asbirlea
Tyker
efriedma
george.burgess.iv

Commits

rG67671024c8cb: [DSE,MSSA] Relax post-dom restriction for objs visible after return.

Summary

This patch relaxes the post-dominance requirement for accesses to
objects visible after the function returns.

Instead of requiring the killing def to post-dominate the access to
eliminate, the set of 'killing blocks' (= blocks that completely
overwrite the original access) is collected.

If every path from the access being eliminated to an exit block goes
through a killing block, the access can be removed.

To check this property, we first get the common post-dominator block for
the killing blocks. If this block does not post-dominate the access
block, there may be a path from DomAccess to an exit block not involving
any killing block.

Otherwise we have to check whether there is a path from DomAccess to the
common post-dominator that does not contain a killing block. If there
is no such path, we can remove DomAccess. For this check, we start at
the common post-dominator and traverse the CFG backwards. Paths are
terminated when we hit a killing block or a block that is not executed
between DomAccess and a killing block according to the post-order
numbering (if the post-order number of a block is greater than that
of DomAccess, the block cannot be on a path from DomAccess to an exit).
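
The backward traversal described above can be sketched on a toy CFG. This is only an illustration, not the code from the patch: the `ToyCFG` struct, the integer block ids and the `allPathsKilled` helper are hypothetical names, and the bail-out limit of 50 simply mirrors the threshold mentioned later in this summary.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy CFG: blocks are indices, Preds[B] lists B's predecessors.
struct ToyCFG {
  std::vector<std::vector<int>> Preds;
  std::vector<unsigned> PostOrder; // post-order number of each block
};

// Walks backwards from Start (the common post-dominator) and returns true if
// every path hits a killing block before it reaches AccessBlock. Returns
// false conservatively once Limit blocks have been queued.
bool allPathsKilled(const ToyCFG &G, int Start, int AccessBlock,
                    const std::set<int> &Killing, std::size_t Limit = 50) {
  std::vector<int> WorkList{Start};
  std::set<int> Seen{Start};
  const unsigned UpperBound = G.PostOrder[AccessBlock];
  for (std::size_t I = 0; I < WorkList.size(); ++I) {
    int Current = WorkList[I];
    if (Killing.count(Current))
      continue; // this path ends in a complete overwrite
    if (Current == AccessBlock)
      return false; // reached the store without passing a killing block
    if (G.PostOrder[Current] > UpperBound)
      continue; // block is not between AccessBlock and an exit
    for (int P : G.Preds[Current])
      if (Seen.insert(P).second)
        WorkList.push_back(P);
    if (WorkList.size() >= Limit)
      return false; // give up rather than scan a large CFG
  }
  return true;
}
```

On a diamond-shaped CFG with the store in the entry block, the check succeeds when both arms overwrite the location and fails when only one arm does.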

This gives the following improvements on the total number of stores
after DSE for MultiSource, SPEC2K, SPEC2006:

Tests: 237
Same hash: 206 (filtered out)
Remaining: 31
Metric: dse.NumRemainingStores

Program                                           base     new100   diff
test-suite...CFP2000/188.ammp/188.ammp.test    3624.00    3544.00  -2.2%
test-suite...ch/g721/g721encode/encode.test     128.00     126.00  -1.6%
test-suite.../Benchmarks/Olden/mst/mst.test      73.00      72.00  -1.4%
test-suite...CFP2006/433.milc/433.milc.test    3202.00    3163.00  -1.2%
test-suite...000/186.crafty/186.crafty.test    5062.00    5010.00  -1.0%
test-suite...-typeset/consumer-typeset.test   40460.00   40248.00  -0.5%
test-suite...Source/Benchmarks/sim/sim.test     642.00     639.00  -0.5%
test-suite...nchmarks/McCat/09-vor/vor.test     642.00     644.00   0.3%
test-suite...lications/sqlite3/sqlite3.test   35664.00   35563.00  -0.3%
test-suite...T2000/300.twolf/300.twolf.test    7202.00    7184.00  -0.2%
test-suite...lications/ClamAV/clamscan.test   19475.00   19444.00  -0.2%
test-suite...INT2000/164.gzip/164.gzip.test    2199.00    2196.00  -0.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test    2380.00    2378.00  -0.1%
test-suite.../Benchmarks/Bullet/bullet.test   39335.00   39309.00  -0.1%
test-suite...:: External/Povray/povray.test   36951.00   36927.00  -0.1%
test-suite...marks/7zip/7zip-benchmark.test   67396.00   67356.00  -0.1%
test-suite...6/464.h264ref/464.h264ref.test   31497.00   31481.00  -0.1%
test-suite...006/453.povray/453.povray.test   51441.00   51416.00  -0.0%
test-suite...T2006/401.bzip2/401.bzip2.test    4450.00    4448.00  -0.0%
test-suite...Applications/kimwitu++/kc.test   23481.00   23471.00  -0.0%
test-suite...chmarks/MallocBench/gs/gs.test    6286.00    6284.00  -0.0%
test-suite.../CINT2000/254.gap/254.gap.test   13719.00   13715.00  -0.0%
test-suite.../Applications/SPASS/SPASS.test   30345.00   30338.00  -0.0%
test-suite...006/450.soplex/450.soplex.test   15018.00   15016.00  -0.0%
test-suite...ications/JM/lencod/lencod.test   27780.00   27777.00  -0.0%
test-suite.../CINT2006/403.gcc/403.gcc.test  105285.00  105276.00  -0.0%

There might be potential to pre-compute some of the information of which
blocks are on the path to an exit for each block, but the overall
benefit might be comparatively small.

On this set of benchmarks, the CFG check succeeds in 15738 of the 20322
cases in which we reach it. The total number of iterations in the CFG
check is 187810, so on average we need fewer than 10 steps in the check
loop. Bumping the threshold in the loop from 50 to 150 gives a few small
improvements, but I don't think they warrant such a big bump at the
moment. This is all pending further tuning in the future.
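
As a quick sanity check of the figures quoted above, the success rate and the average number of loop iterations can be recomputed directly. The constant and function names below are made up for this illustration; only the three input numbers come from the summary.

```cpp
#include <cassert>

// Figures quoted in the summary.
constexpr double TimesReached = 20322;    // times the CFG check was reached
constexpr double TimesSucceeded = 15738;  // times the CFG check succeeded
constexpr double TotalIterations = 187810; // iterations across all checks

// Fraction of CFG checks that succeed (roughly 0.77).
constexpr double successRate() { return TimesSucceeded / TimesReached; }

// Average iterations per CFG check (roughly 9.2, i.e. fewer than 10).
constexpr double averageIterations() { return TotalIterations / TimesReached; }
```

Both values are consistent with the statement that the check usually terminates well below the threshold of 50.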

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision. Apr 27 2020, 7:56 AM

Herald added a project: Restricted Project. Apr 27 2020, 7:56 AM

Herald added subscribers: hiraditya, Prazek.

fhahn mentioned this in D73763: [DSE] Lift post-dominance restriction. Apr 27 2020, 7:58 AM

Harbormaster failed remote builds in B54806: Diff 260325! Apr 27 2020, 8:03 AM

The PostDominatorTree doesn't include implicit unwind edges. I'm not completely sure whether your change handles that correctly... but either way, please make sure we have testcases.

In D78932#2006268, @efriedma wrote:

The PostDominatorTree doesn't include implicit unwind edges. I'm not completely sure whether your change handles that correctly... but either way, please make sure we have testcases.

To double check, as I am not too familiar with the exception handling stuff, are you referring to an example like the one below, where ehcleanup exits to the caller through unwinding? IIUC the problematic scenario would be a store before an invoke that gets overwritten on the path to the regular exit, but not on the exception path. Calls to throwing functions currently block DSE, and I think that would also prevent elimination in the cases mentioned before (even if the store is overwritten on the unwind edges too).

define void @test1(i32* %p, i1 %c) personality i32 (...)* @__CxxFrameHandler3 {
entry:
  store i32 2, i32* %p
  br i1 %c, label %exit, label %invoke.cont

invoke.cont:
  invoke void @throw()
  to label %exit unwind label %catch.dispatch

catch.dispatch:                                   ; preds = %invoke.cont
  %cs = catchswitch within none [label %invoke.cont1] unwind label %ehcleanup

invoke.cont1:                                     ; preds = %catch.dispatch
  %catch = catchpad within %cs [i8* null, i32 64, i8* null]
  invoke void @throw() [ "funclet"(token %catch) ]
  to label %exit unwind label %ehcleanup

ehcleanup:                                        ; preds = %invoke.cont1, %catch.dispatch
  %cleanup = cleanuppad within none []
  call void @release(i64 0) [ "funclet"(token %cleanup) ]
  cleanupret from %cleanup unwind to caller

exit:                                      ; preds = %invoke.cont1, %invoke.cont, %entry
  store i32 1, i32* %p
  ret void
}

declare i32 @__CxxFrameHandler3(...)

declare void @throw()

declare void @release(i64)

postdominance should work with invokes. I was thinking more of plain calls that unwind. But now I'm remembering: MemorySSA treats a potentially throwing call as a read of the value, even if the call itself can't actually read the value. But still, the new code isn't consistent with the mayThrowBetween() check. I forget the end result of the discussion of whether we still need that check at all.

In D78932#2008860, @efriedma wrote:

postdominance should work with invokes. I was thinking more of plain calls that unwind. But now I'm remembering: MemorySSA treats a potentially throwing call as a read of the value, even if the call itself can't actually read the value. But still, the new code isn't consistent with the mayThrowBetween() check. I forget the end result of the discussion of whether we still need that check at all.

I've pushed additional tests for that scenario (e018b8bbb0ba).

mayThrowBetween is needed for cases where we have may-throw functions marked readnone. Those won't be part of MemorySSA, so the extra check is needed. We have to ensure that there are no throws between any of the killing blocks. mayThrowBetween currently conservatively returns true if there are any throwing blocks, so for now it should be fine. If mayThrowBetween gets more powerful in the future, we have to make sure all killing blocks are checked. I can add a TODO.

Okay, makes sense.

ping. @george.burgess.iv do you think this patch is along the lines you suggested while reviewing D73763?

sorry for the latency -- a bit busy now, but I hope to get to this by EOD Wednesday :)

Thanks for the patch :)

I haven't reasoned through this fully, but the overall idea of "mark all killing blocks, then walk from the exits and see that all of them hit any of the killing blocks (or never pass through the block with the store)" seems functionally correct to me. I have a first round of comments/questions; happy to dig deeper on whatever direction we decide to head in.

do you think this patch is along the lines you suggested

What I envisioned was more or less keeping a SmallPtrSet<MemoryAccess *, N> populated with potentially-terminating Defs/Phis (e.g., for each Def/Phi in there, there exists a path in the function such that that's the last Def/Phi before the function exits). When walking for memory which is readable after the function exits, if we would call PushMemUses on something in this set, we give up. This set can be built once lazily, and then queried for the rest of the life of run(). We could build it by {B,D}FS'ing starting at all exit blocks for our current Function, and terminating the search each time we discover any MemoryDefs/MemoryPhis in a block (we have special use-lists for these, so this is a constant-time query per block).

If there aren't holes in it, I think that approach would be generally faster/simpler, but it also introduces the problem of cache invalidation. Cache invalidation makes many things harder, including writing tests + making test-cases. :)

Assuming the development practice we're trying to follow is still "build what's needed, get feedback, iterate," that you mentioned on an earlier review, I'm happy to chalk that up as a future optimization if you'd prefer this approach.
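
For reference, the set-based alternative suggested above could look roughly like the following block-level sketch. It is deliberately simplified and is not part of the patch: it tracks whole blocks instead of individual MemoryDefs/MemoryPhis, uses plain integer block ids, and the name `terminatingDefBlocks` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <vector>

// Returns the blocks that contain a memory def which may be the last one
// executed before the function returns. The search runs backwards from all
// exit blocks and stops at the first block with a def on each path.
std::set<int> terminatingDefBlocks(const std::vector<std::vector<int>> &Preds,
                                   const std::vector<bool> &HasDef,
                                   const std::vector<int> &Exits) {
  std::set<int> Result;
  std::set<int> Seen(Exits.begin(), Exits.end());
  std::queue<int> Queue;
  for (int E : Seen)
    Queue.push(E);
  while (!Queue.empty()) {
    int B = Queue.front();
    Queue.pop();
    if (HasDef[B]) {
      Result.insert(B); // a def here can be the last one before an exit
      continue;         // anything further up is shadowed on this path
    }
    for (int P : Preds[B])
      if (Seen.insert(P).second)
        Queue.push(P);
  }
  return Result;
}
```

Such a set would be computed once per function, which is the trade-off discussed in this thread: cheaper queries at the cost of having to keep the cached set valid while stores are deleted.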

if the post order number of a block is greater than the one
of DomAccess, the block cannot be in a path starting at DomAccess

I'm not certain I'm interpreting "a path starting at," correctly, but the way I'm reading it, I don't agree:

  A
  |
  B<-----
 / \    |
C   D   |
    |   |
    E --|

(where all edges are downward except E -> B)

C = 1, E = 2, D = 3, B = 4, and A = 5, yet there's a path from E -> B, no? Admittedly, I'm unsure at this point if this is pedantry on my part, or if there are correctness implications on the algorithm as presented.
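
The numbering in this counter-example can be reproduced with a small depth-first search. This is a sketch under assumptions: blocks A to E are mapped to indices 0 to 4, successors of B are visited in the order C, D, and the helper names are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Recursive DFS that assigns post-order numbers starting at 1.
void postOrderVisit(const std::vector<std::vector<int>> &Succs, int Block,
                    std::vector<unsigned> &Number, std::vector<bool> &Visited,
                    unsigned &Next) {
  Visited[Block] = true;
  for (int S : Succs[Block])
    if (!Visited[S])
      postOrderVisit(Succs, S, Number, Visited, Next);
  Number[Block] = Next++; // numbered only after all successors are done
}

// Returns the post-order number of every block reachable from Entry
// (unreachable blocks keep the number 0).
std::vector<unsigned> numberBlocks(const std::vector<std::vector<int>> &Succs,
                                   int Entry) {
  std::vector<unsigned> Number(Succs.size(), 0);
  std::vector<bool> Visited(Succs.size(), false);
  unsigned Next = 1;
  postOrderVisit(Succs, Entry, Number, Visited, Next);
  return Number;
}
```

With the edges A->B, B->C, B->D, D->E and E->B, this yields C = 1, E = 2, D = 3, B = 4 and A = 5, so E has a smaller number than B even though the back edge E->B exists.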

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1664	"Accesses to objects that are": was this accidentally left in?
1745–1756	nit: looks like we're only using `KillingBlocks` if `DefVisibleToCallerAfterRet`. Can we save work in the `!DefVisibleToCallerAfterRet` case by ignoring this bit of the `if` then?
1753	it's a bit odd to read `if (isCompleteOverwrite(...)) { if (isOverwrite(...) == OW_Complete) { // ... } }`. Is there a subtle difference between the two that I'm missing?
1778	nit: for consistency with elsewhere, can we `.count`?
1784	tiny nits: one space between `//` and the comment, please; also, `s/post-dominates/post-dominate/`
1799	i'm unsure how we reach this case. AFAICT, `PDT.dominates(nullptr, AnythingElse) == false`, since the PDT should fail to lookup the node for `nullptr`, no?
1822	nit: 50 is a bit of a magical number to give up at here. is it worth making a `-mllvm` option (or perhaps hoisting to a constant with a bit of prose about how it was chosen)?

Address comments, thanks!

In D78932#2024077, @george.burgess.iv wrote:

Thanks for the patch :)

Thank you very much for taking the time to take a look :)

I haven't reasoned through this fully, but the overall idea of "mark all killing blocks, then walk from the exits and see that all of them hit any of the killing blocks (or never pass through the block with the store)" seems functionally correct to me. I have a first round of comments/questions; happy to dig deeper on whatever direction we decide to head in.

do you think this patch is along the lines you suggested

What I envisioned was more or less keeping a SmallPtrSet<MemoryAccess *, N> populated with potentially-terminating Defs/Phis (e.g., for each Def/Phi in there, there exists a path in the function such that that's the last Def/Phi before the function exits). When walking for memory which is readable after the function exits, if we would call PushMemUses on something in this set, we give up. This set can be built once lazily, and then queried for the rest of the life of run(). We could build it by {B,D}FS'ing starting at all exit blocks for our current Function, and terminating the search each time we discover any MemoryDefs/MemoryPhis in a block (we have special use-lists for these, so this is a constant-time query per block).

Oh, that's convenient; I was not aware of that. That should indeed make it quite straightforward to build this set.

If there aren't holes in it, I think that approach would be generally faster/simpler, but it also introduces the problem of cache invalidation. Cache invalidation makes many things harder, including writing tests + making test-cases. :)

Assuming the development practice we're trying to follow is still "build what's needed, get feedback, iterate," that you mentioned on an earlier review, I'm happy to chalk that up as a future optimization if you'd prefer this approach.

IIUC, this approach is mostly the same as in the original patch, just with a different way of detecting 'last' MemoryDefs, right? I think there potentially are cases that would be handled more efficiently with the current approach (e.g. if there are a large number of MemoryDefs/uses to traverse). If my understanding is correct, the alternative approach should be relatively straightforward to implement. I would be happy to iterate on that once we have a correct version in tree (that makes balancing the patches and benchmarking a bit easier I think), but I could also try to do it up-front, if that's preferred.

if the post order number of a block is greater than the one
of DomAccess, the block cannot be in a path starting at DomAccess

I'm not certain I'm interpreting "a path starting at," correctly, but the way I'm reading it, I don't agree:

Sorry, the statement is a bit incomplete, I think. It was meant to refer to 'all paths to any exit block, starting at DomAccess'. In the case you described, there technically is a path through DomAccess, but that should be fine, as we already check all paths from DomAccess to any exit.

  A
  |
  B<-----
 / \    |
C   D   |
    |   |
    E --|
(where all edges are downward except E -> B)

C = 1, E = 2, D = 3, B = 4, and A = 5, yet there's a path from E -> B, no? Admittedly, I'm unsure at this point if this is pedantry on my part, or if there are correctness implications on the algorithm as presented.

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1664	Yes, removed!
1753	Ah right, I've refactored isCompleteOverwrite to just use isOverwrite().
1799	We can hit this scenario when there's no single exit block that post-dominates all killing blocks, like in the example below. Instead the special 'nullptr' block is used, but for starting the traversal we need to start with all exit blocks I think.

define void @test12(i32* %P) {
  store i32 0, i32* %P
  br i1 true, label %bb1, label %bb2

bb1:                                              ; preds = %0
  store i32 1, i32* %P
  br label %bb3

bb2:                                              ; preds = %0
  store i32 1, i32* %P
  ret void

bb3:                                              ; preds = %bb1
  ret void
}

1822	I've made it into an option, thanks!

Harbormaster failed remote builds in B56088: Diff 262711! May 7 2020, 12:28 PM

Ping :)

Sorry for the latency :)

I would be happy to iterate on that once we have a correct version in tree (that makes balancing the patches and benchmarking a bit easier I think), but I could also try to do it up-front, if that's preferred.

WFM.

I think there potentially are cases that would be handled more efficiently with the current approach (e.g. if there are a large number of memoryDefs/uses to traverse)

Agreed, though again, the set-based approach only needs to be built once whereas this one needs to happen on every query, so it all depends on how many times we fall back to this codepath, how large the function is, and how sparse the Defs/Phis in the function are.

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1799	Sorry, I should've been a bit more clear. Specifically, this code is guarded by `PDT.dominates(CommonPred, DomAccess->getBlock())`. `DomAccess->getBlock()` should never be `nullptr` and `PDT.dominates(nullptr, not_nullptr)` should always be `false` AFAIK, so I think we should never hit this `else` if `!CommonPred`? Or maybe we do for `DomAccess`es which are in unreachable code (but in that case, do we care to do extra work?)

Rebased

In D78932#2047134, @george.burgess.iv wrote:

Sorry for the latency :)

I would be happy to iterate on that once we have a correct version in tree (that makes balancing the patches and benchmarking a bit easier I think), but I could also try to do it up-front, if that's preferred.

WFM.

Great!

I think there potentially are cases that would be handled more efficiently with the current approach (e.g. if there are a large number of memoryDefs/uses to traverse)

Agreed, though again, the set-based approach only needs to be built once whereas this one needs to happen on every query, so it all depends on how many times we fall back to this codepath, how large the function is, and how sparse the Defs/Phis in the function are.

Yep, we definitely should collect data on that. Hopefully I'll be able to put the alternative together in the next week or so. As said earlier, it would be easier to get the first approach in as a baseline.

fhahn added inline comments. May 22 2020, 10:40 AM

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1799	"PDT.dominates(nullptr, not_nullptr) should always be false": Hm, IIUC nullptr stands for the (virtual) root node in the post dominator tree. Unless I am missing something, I think `PDT.dominates(nullptr, not_nullptr)` should always be `true`, i.e. each node in the function is post-dominated by the (virtual) root.
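
The role of the virtual root can be illustrated on the @test12 CFG quoted earlier in this thread. The sketch below computes post-dominator sets with a simple iterative data-flow pass; it does not use LLVM's PostDominatorTree, the function name `postDomSets` is hypothetical, and it assumes at most 32 nodes, that the last index is the virtual exit, and that every other block has at least one successor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns, for every node, a bitmask of the nodes that post-dominate it.
// The last node is treated as the virtual exit that all returns branch to.
std::vector<std::uint32_t>
postDomSets(const std::vector<std::vector<int>> &Succs) {
  const std::size_t N = Succs.size();
  assert(N >= 1 && N <= 32 && "toy implementation uses a 32-bit mask");
  const std::uint32_t All = (N == 32) ? ~0u : ((1u << N) - 1);
  std::vector<std::uint32_t> PDom(N, All);
  PDom[N - 1] = 1u << (N - 1); // the virtual exit post-dominates itself only
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t B = 0; B + 1 < N; ++B) {
      std::uint32_t Meet = All;
      for (int S : Succs[B])
        Meet &= PDom[S]; // intersect over all successors
      std::uint32_t New = Meet | (1u << B);
      if (New != PDom[B]) {
        PDom[B] = New;
        Changed = true;
      }
    }
  }
  return PDom;
}
```

With entry = 0, bb1 = 1, bb2 = 2, bb3 = 3 and the virtual exit = 4, the only node that post-dominates both killing blocks bb1 and bb2 is the virtual exit, which corresponds to the `nullptr` common post-dominator discussed here.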

Harbormaster completed remote builds in B57662: Diff 265771. May 22 2020, 12:20 PM

ping :)

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1799	@george.burgess.iv does the above make sense?

ping :)

Eesh -- sorry for taking forever. :)

I think my questions are answered, and seeing how this works SGTM. If we need something faster (or wanna experiment with other approaches), I think we have a solid way forward on that.

That said, LGTM. Thanks again!

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
1799	Yup!

This revision is now accepted and ready to land. Jun 9 2020, 8:33 PM

Closed by commit rG67671024c8cb: [DSE,MSSA] Relax post-dom restriction for objs visible after return. (authored by fhahn). Jun 10 2020, 2:42 AM

This revision was automatically updated to reflect the committed changes.

In D78932#2083972, @george.burgess.iv wrote:

Eesh -- sorry for taking forever. :)

I think my questions are answered, and seeing how this works SGTM. If we need something faster (or wanna experiment with other approaches), I think we have a solid way forward on that.

That said, LGTM. Thanks again!

Awesome, thanks! I think there are 2 more patches or so to catch most outstanding cases of legacy DSE. After that I'll move to compile-time tuning.

Revision Contents

Path	Size
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp	120 lines
llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-memintrinsics.ll	3 lines
llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-multipath.ll	21 lines
llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-simple.ll	6 lines

Diff 269772

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp

[77 lines not shown]

STATISTIC(NumRemainingStores, "Number of stores remaining after DSE");		STATISTIC(NumRemainingStores, "Number of stores remaining after DSE");
STATISTIC(NumRedundantStores, "Number of redundant stores deleted");		STATISTIC(NumRedundantStores, "Number of redundant stores deleted");
STATISTIC(NumFastStores, "Number of stores deleted");		STATISTIC(NumFastStores, "Number of stores deleted");
STATISTIC(NumFastOther, "Number of other instrs removed");		STATISTIC(NumFastOther, "Number of other instrs removed");
STATISTIC(NumCompletePartials, "Number of stores dead by later partials");		STATISTIC(NumCompletePartials, "Number of stores dead by later partials");
STATISTIC(NumModifiedStores, "Number of stores modified");		STATISTIC(NumModifiedStores, "Number of stores modified");
STATISTIC(NumNoopStores, "Number of noop stores deleted");		STATISTIC(NumNoopStores, "Number of noop stores deleted");
		STATISTIC(NumCFGChecks, "Number of blocks visited in CFG path checks");
		STATISTIC(NumCFGTries, "Number of CFG path checks attempted");
		STATISTIC(NumCFGSuccess, "Number of successful CFG path checks");

DEBUG_COUNTER(MemorySSACounter, "dse-memoryssa",		DEBUG_COUNTER(MemorySSACounter, "dse-memoryssa",
"Controls which MemoryDefs are eliminated.");		"Controls which MemoryDefs are eliminated.");

static cl::opt<bool>		static cl::opt<bool>
EnablePartialOverwriteTracking("enable-dse-partial-overwrite-tracking",		EnablePartialOverwriteTracking("enable-dse-partial-overwrite-tracking",
cl::init(true), cl::Hidden,		cl::init(true), cl::Hidden,
cl::desc("Enable partial-overwrite tracking in DSE"));		cl::desc("Enable partial-overwrite tracking in DSE"));
[12 lines not shown]	MemorySSAScanLimit("dse-memoryssa-scanlimit", cl::init(100), cl::Hidden,
cl::desc("The number of memory instructions to scan for "		cl::desc("The number of memory instructions to scan for "
"dead store elimination (default = 100)"));		"dead store elimination (default = 100)"));

static cl::opt<unsigned> MemorySSADefsPerBlockLimit(		static cl::opt<unsigned> MemorySSADefsPerBlockLimit(
"dse-memoryssa-defs-per-block-limit", cl::init(5000), cl::Hidden,		"dse-memoryssa-defs-per-block-limit", cl::init(5000), cl::Hidden,
cl::desc("The number of MemoryDefs we consider as candidates to eliminated "		cl::desc("The number of MemoryDefs we consider as candidates to eliminated "
"other stores per basic block (default = 5000)"));		"other stores per basic block (default = 5000)"));

		static cl::opt<unsigned> MemorySSAPathCheckLimit(
		"dse-memoryssa-path-check-limit", cl::init(50), cl::Hidden,
		cl::desc("The maximum number of blocks to check when trying to prove that "
		"all paths to an exit go through a killing block (default = 50)"));

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Helper functions		// Helper functions
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
using OverlapIntervalsTy = std::map<int64_t, int64_t>;		using OverlapIntervalsTy = std::map<int64_t, int64_t>;
using InstOverlapIntervalsTy = DenseMap<Instruction *, OverlapIntervalsTy>;		using InstOverlapIntervalsTy = DenseMap<Instruction *, OverlapIntervalsTy>;

/// Delete this instruction. Before we do, go through and zero out all the		/// Delete this instruction. Before we do, go through and zero out all the
/// operands of this instruction. If any of them become dead, delete them and		/// operands of this instruction. If any of them become dead, delete them and
[1,464 lines not shown]	bool isCompleteOverwrite(MemoryLocation DefLoc, Instruction *UseInst) const {
// MemoryDef.		// MemoryDef.
if (!UseInst->mayWriteToMemory())		if (!UseInst->mayWriteToMemory())
return false;		return false;

if (auto *CB = dyn_cast<CallBase>(UseInst))		if (auto *CB = dyn_cast<CallBase>(UseInst))
if (CB->onlyAccessesInaccessibleMemory())		if (CB->onlyAccessesInaccessibleMemory())
return false;		return false;

ModRefInfo MR = AA.getModRefInfo(UseInst, DefLoc);		int64_t InstWriteOffset, DepWriteOffset;
// If necessary, perform additional analysis.		auto CC = getLocForWriteEx(UseInst);
if (isModSet(MR) && isa<CallBase>(UseInst))		InstOverlapIntervalsTy IOL;
MR = AA.callCapturesBefore(UseInst, DefLoc, &DT);
		const DataLayout &DL = F.getParent()->getDataLayout();

Optional<MemoryLocation> UseLoc = getLocForWriteEx(UseInst);		return CC &&
return isModSet(MR) && isMustSet(MR) &&		isOverwrite(DefLoc, *CC, DL, TLI, DepWriteOffset, InstWriteOffset,
(UseLoc->Size.hasValue() && DefLoc.Size.hasValue() &&		UseInst, IOL, AA, &F) == OW_Complete;
UseLoc->Size.getValue() >= DefLoc.Size.getValue());
}		}

/// Returns true if \p Use may read from \p DefLoc.		/// Returns true if \p Use may read from \p DefLoc.
bool isReadClobber(MemoryLocation DefLoc, Instruction *UseInst) const {		bool isReadClobber(MemoryLocation DefLoc, Instruction *UseInst) const {
if (!UseInst->mayReadFromMemory())		if (!UseInst->mayReadFromMemory())
return false;		return false;

if (auto *CB = dyn_cast<CallBase>(UseInst))		if (auto *CB = dyn_cast<CallBase>(UseInst))
[37 lines not shown]	do {
DomAccess = MSSA.getSkipSelfWalker()->getClobberingMemoryAccess(CurrentUD,		DomAccess = MSSA.getSkipSelfWalker()->getClobberingMemoryAccess(CurrentUD,
DefLoc);		DefLoc);
if (MSSA.isLiveOnEntryDef(DomAccess))		if (MSSA.isLiveOnEntryDef(DomAccess))
return None;		return None;

if (isa<MemoryPhi>(DomAccess))		if (isa<MemoryPhi>(DomAccess))
break;		break;

// Check if we can skip DomDef for DSE. For accesses to objects that are		// Check if we can skip DomDef for DSE.
// accessible after the function returns, KillingDef must execute whenever
// DomDef executes and use post-dominance to ensure that.
MemoryDef *DomDef = dyn_cast<MemoryDef>(DomAccess);		MemoryDef *DomDef = dyn_cast<MemoryDef>(DomAccess);
if ((DomDef && canSkipDef(DomDef, DefVisibleToCallerBeforeRet)) ||		if (DomDef && canSkipDef(DomDef, DefVisibleToCallerBeforeRet)) {
(DefVisibleToCallerAfterRet &&
!PDT.dominates(KillingDef->getBlock(), DomDef->getBlock()))) {
StepAgain = true;		StepAgain = true;
Current = DomDef->getDefiningAccess();		Current = DomDef->getDefiningAccess();
}		}

} while (StepAgain);		} while (StepAgain);

		// Accesses to objects accessible after the function returns can only be
		// eliminated if the access is killed along all paths to the exit. Collect
		// the blocks with killing (=completely overwriting MemoryDefs) and check if
		// they cover all paths from DomAccess to any function exit.
		SmallPtrSet<BasicBlock *, 16> KillingBlocks = {KillingDef->getBlock()};
LLVM_DEBUG({		LLVM_DEBUG({
dbgs() << " Checking for reads of " << *DomAccess;		dbgs() << " Checking for reads of " << *DomAccess;
if (isa<MemoryDef>(DomAccess))		if (isa<MemoryDef>(DomAccess))
dbgs() << " (" << *cast<MemoryDef>(DomAccess)->getMemoryInst() << ")\n";		dbgs() << " (" << *cast<MemoryDef>(DomAccess)->getMemoryInst() << ")\n";
else		else
dbgs() << ")\n";		dbgs() << ")\n";
});		});

[51 lines not shown]	for (unsigned I = 0; I < WorkList.size(); I++) {
// miss cases like the following		// miss cases like the following
// 1 = Def(LoE) ; <----- DomDef stores [0,1]		// 1 = Def(LoE) ; <----- DomDef stores [0,1]
// 2 = Def(1) ; (2, 1) = NoAlias, stores [2,3]		// 2 = Def(1) ; (2, 1) = NoAlias, stores [2,3]
// Use(2) ; MayAlias 2 and 1, loads [0, 3].		// Use(2) ; MayAlias 2 and 1, loads [0, 3].
// (The Use points to the first Def it may alias)		// (The Use points to the first Def it may alias)
// 3 = Def(1) ; <---- Current (3, 2) = NoAlias, (3,1) = MayAlias,		// 3 = Def(1) ; <---- Current (3, 2) = NoAlias, (3,1) = MayAlias,
// stores [0,1]		// stores [0,1]
if (MemoryDef *UseDef = dyn_cast<MemoryDef>(UseAccess)) {		if (MemoryDef *UseDef = dyn_cast<MemoryDef>(UseAccess)) {
if (!isCompleteOverwrite(DefLoc, UseInst))		if (isCompleteOverwrite(DefLoc, UseInst)) {
		if (DefVisibleToCallerAfterRet && UseAccess != DomAccess) {
		BasicBlock *MaybeKillingBlock = UseInst->getParent();
		if (PostOrderNumbers.find(MaybeKillingBlock)->second <
		PostOrderNumbers.find(DomAccess->getBlock())->second) {

		LLVM_DEBUG(dbgs() << " ... found killing block "
		<< MaybeKillingBlock->getName() << "\n");
		KillingBlocks.insert(MaybeKillingBlock);
		}
		}
		} else
PushMemUses(UseDef);		PushMemUses(UseDef);
}		}
}		}

		// For accesses to locations visible after the function returns, make sure
		// that the location is killed (=overwritten) along all paths from DomAccess
		// to the exit.
		if (DefVisibleToCallerAfterRet) {
		assert(!KillingBlocks.empty() &&
		"Expected at least a single killing block");
		// Find the common post-dominator of all killing blocks.
		BasicBlock *CommonPred = *KillingBlocks.begin();
		for (auto I = std::next(KillingBlocks.begin()), E = KillingBlocks.end();
		I != E; I++) {
		if (!CommonPred)
		break;
		CommonPred = PDT.findNearestCommonDominator(CommonPred, *I);
		}

		// If CommonPred is in the set of killing blocks, just check if it
		// post-dominates DomAccess.
		if (KillingBlocks.count(CommonPred)) {
		if (PDT.dominates(CommonPred, DomAccess->getBlock()))
		return {DomAccess};
		return None;
		}

		// If the common post-dominator does not post-dominate DomAccess, there
		// is a path from DomAccess to an exit not going through a killing block.
		if (PDT.dominates(CommonPred, DomAccess->getBlock())) {
		SetVector<BasicBlock *> WorkList;

		// DomAccess's post-order number provides an upper bound of the blocks
		// on a path starting at DomAccess.
		unsigned UpperBound =
		PostOrderNumbers.find(DomAccess->getBlock())->second;

		// If CommonPred is null, there are multiple exits from the function.
		// They all have to be added to the worklist.
		if (CommonPred)
		WorkList.insert(CommonPred);
		else
		for (BasicBlock *R : PDT.getRoots()) {
		if (!DT.isReachableFromEntry(R))
		continue;
		WorkList.insert(R);
		}

		NumCFGTries++;
		// Check if all paths starting from an exit node go through one of the
		// killing blocks before reaching DomAccess.
		for (unsigned I = 0; I < WorkList.size(); I++) {
		NumCFGChecks++;
		BasicBlock *Current = WorkList[I];
		if (KillingBlocks.count(Current))
		continue;
		if (Current == DomAccess->getBlock())
		return None;
		unsigned CPO = PostOrderNumbers.find(Current)->second;
		// Current block is not on a path starting at DomAccess.
		if (CPO > UpperBound)
		continue;
		for (BasicBlock *Pred : predecessors(Current))
		WorkList.insert(Pred);

		if (WorkList.size() >= MemorySSAPathCheckLimit)
		george.burgess.iv: nit: 50 is a bit of a magic number to give up at here. Is it worth making it a `-mllvm` option (or perhaps hoisting it to a constant with a bit of prose about how it was chosen)?
		fhahn: I've made it into an option, thanks!
		return None;
		}
		NumCFGSuccess++;
		return {DomAccess};
		}
		return None;
		}

// No aliasing MemoryUses of DomAccess found, DomAccess is potentially dead.		// No aliasing MemoryUses of DomAccess found, DomAccess is potentially dead.
return {DomAccess};		return {DomAccess};
}		}

// Delete dead memory defs		// Delete dead memory defs
void deleteDeadInstruction(Instruction *SI) {		void deleteDeadInstruction(Instruction *SI) {
MemorySSAUpdater Updater(&MSSA);		MemorySSAUpdater Updater(&MSSA);
SmallVector<Instruction *, 32> NowDeadInsts;		SmallVector<Instruction *, 32> NowDeadInsts;
▲ Show 20 Lines • Show All 362 Lines • Show Last 20 Lines
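
The backward traversal in the hunk above can be tried outside of LLVM with a small stand-alone sketch. This is only an illustration under stated assumptions, not the code from the patch: `ToyCFG`, `allPathsKilled`, the integer block ids, and the hand-written post-order numbers are all hypothetical stand-ins for `BasicBlock`, `predecessors()`, `PostOrderNumbers`, and `MemorySSAPathCheckLimit`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy CFG: blocks are integer ids. Preds[B] lists the
// predecessors of block B, PostOrder[B] is B's post-order number.
struct ToyCFG {
  std::vector<std::vector<int>> Preds;
  std::vector<unsigned> PostOrder;
};

// Returns true if every path from Access to one of the Starts blocks goes
// through a block in Killing. Walks the CFG backwards from Starts, as the
// worklist loop in the patch does, and gives up (returns false) once the
// worklist reaches Limit entries.
bool allPathsKilled(const ToyCFG &G, int Access, const std::set<int> &Killing,
                    const std::vector<int> &Starts, std::size_t Limit) {
  assert(Access >= 0 && static_cast<std::size_t>(Access) < G.Preds.size());
  std::vector<int> WorkList; // insertion-ordered, like SetVector
  std::set<int> Seen;        // keeps WorkList free of duplicates
  auto Push = [&](int B) {
    if (Seen.insert(B).second)
      WorkList.push_back(B);
  };
  for (int S : Starts)
    Push(S);

  // The access block's post-order number bounds the blocks that can lie
  // on a path starting at the access.
  const unsigned UpperBound = G.PostOrder[Access];

  for (std::size_t I = 0; I < WorkList.size(); ++I) {
    int Current = WorkList[I];
    if (Killing.count(Current))
      continue;     // this backward path is terminated by a killing block
    if (Current == Access)
      return false; // reached the access without meeting a killing block
    if (G.PostOrder[Current] > UpperBound)
      continue;     // block cannot be on a path starting at the access
    for (int P : G.Preds[Current])
      Push(P);
    if (WorkList.size() >= Limit)
      return false; // too much work, answer conservatively
  }
  return true;
}
```

With the CFG shape of `@test12` (entry = 0, bb1 = 1, bb2 = 2, bb3 = 3, stores in bb1 and bb2, both exits used as starting points) the check succeeds, so the entry store would be removable. With a diamond where only one arm overwrites the location, as in `@accessible_after_return_3`, the walk reaches the entry block and the check fails.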

llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-memintrinsics.ll

	Show All 40 Lines
	}			}

	; Post-dominating store.			; Post-dominating store.
	define void @accessible_after_return_2(i32* noalias %P, i1 %c) {			define void @accessible_after_return_2(i32* noalias %P, i1 %c) {
	; CHECK-LABEL: @accessible_after_return_2(			; CHECK-LABEL: @accessible_after_return_2(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[ARRAYIDX0:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 1			; CHECK-NEXT: [[ARRAYIDX0:%.*]] = getelementptr inbounds i32, i32* [[P:%.*]], i64 1
	; CHECK-NEXT: [[P3:%.*]] = bitcast i32* [[ARRAYIDX0]] to i8*			; CHECK-NEXT: [[P3:%.*]] = bitcast i32* [[ARRAYIDX0]] to i8*
	; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 4 [[P3]], i8 0, i64 28, i1 false)			; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, i8* [[P3]], i64 4
				; CHECK-NEXT: call void @llvm.memset.p0i8.i64(i8* align 4 [[TMP0]], i8 0, i64 24, i1 false)
	; CHECK-NEXT: br i1 [[C:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]			; CHECK-NEXT: br i1 [[C:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
	; CHECK: bb1:			; CHECK: bb1:
	; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i32, i32* [[P]], i64 1			; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i32, i32* [[P]], i64 1
	; CHECK-NEXT: store i32 1, i32* [[ARRAYIDX1]], align 4			; CHECK-NEXT: store i32 1, i32* [[ARRAYIDX1]], align 4
	; CHECK-NEXT: br label [[BB3:%.*]]			; CHECK-NEXT: br label [[BB3:%.*]]
	; CHECK: bb2:			; CHECK: bb2:
	; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, i32* [[P]], i64 1			; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, i32* [[P]], i64 1
	; CHECK-NEXT: store i32 1, i32* [[ARRAYIDX2]], align 4			; CHECK-NEXT: store i32 1, i32* [[ARRAYIDX2]], align 4
	▲ Show 20 Lines • Show All 183 Lines • Show Last 20 Lines

llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-multipath.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -basicaa -dse -enable-dse-memoryssa -S | FileCheck %s		; RUN: opt < %s -basicaa -dse -enable-dse-memoryssa -S | FileCheck %s

target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"		target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"

declare void @use(i32 *)		declare void @use(i32 *)

; Tests where the pointer/object is accessible after the function returns.		; Tests where the pointer/object is accessible after the function returns.

define void @accessible_after_return_1(i32* noalias %P, i1 %c1) {		define void @accessible_after_return_1(i32* noalias %P, i1 %c1) {
; CHECK-LABEL: @accessible_after_return_1(		; CHECK-LABEL: @accessible_after_return_1(
; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: store i32 0, i32* [[P]], align 4		; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br label [[BB5:%.*]]		; CHECK-NEXT: br label [[BB5:%.*]]
; CHECK: bb2:		; CHECK: bb2:
; CHECK-NEXT: store i32 3, i32* [[P]], align 4		; CHECK-NEXT: store i32 3, i32* [[P]], align 4
; CHECK-NEXT: br label [[BB5]]		; CHECK-NEXT: br label [[BB5]]
; CHECK: bb5:		; CHECK: bb5:
; CHECK-NEXT: call void @use(i32* [[P]])		; CHECK-NEXT: call void @use(i32* [[P]])
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
Show All 9 Lines

bb5:		bb5:
call void @use(i32* %P)		call void @use(i32* %P)
ret void		ret void
}		}

define void @accessible_after_return_2(i32* noalias %P, i1 %c.1, i1 %c.2) {		define void @accessible_after_return_2(i32* noalias %P, i1 %c.1, i1 %c.2) {
; CHECK-LABEL: @accessible_after_return_2(		; CHECK-LABEL: @accessible_after_return_2(
; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: store i32 0, i32* [[P]], align 4		; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br label [[BB5:%.*]]		; CHECK-NEXT: br label [[BB5:%.*]]
; CHECK: bb2:		; CHECK: bb2:
; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]		; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]
; CHECK: bb3:		; CHECK: bb3:
; CHECK-NEXT: store i32 3, i32* [[P]], align 4		; CHECK-NEXT: store i32 3, i32* [[P]], align 4
; CHECK-NEXT: br label [[BB5]]		; CHECK-NEXT: br label [[BB5]]
; CHECK: bb4:		; CHECK: bb4:
; CHECK-NEXT: store i32 5, i32* [[P]], align 4		; CHECK-NEXT: store i32 5, i32* [[P]], align 4
Show All 19 Lines	bb4:
store i32 5, i32* %P		store i32 5, i32* %P
br label %bb5		br label %bb5

bb5:		bb5:
call void @use(i32* %P)		call void @use(i32* %P)
ret void		ret void
}		}

		; Cannot remove store in entry block because it is not overwritten on path
		; entry->bb2->bb5.
define void @accessible_after_return_3(i32* noalias %P, i1 %c1) {		define void @accessible_after_return_3(i32* noalias %P, i1 %c1) {
; CHECK-LABEL: @accessible_after_return_3(		; CHECK-LABEL: @accessible_after_return_3(
; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4		; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: store i32 0, i32* [[P]], align 4		; CHECK-NEXT: store i32 0, i32* [[P]], align 4
; CHECK-NEXT: br label [[BB5:%.*]]		; CHECK-NEXT: br label [[BB5:%.*]]
; CHECK: bb2:		; CHECK: bb2:
Show All 12 Lines
bb2:		bb2:
br label %bb5		br label %bb5

bb5:		bb5:
call void @use(i32* %P)		call void @use(i32* %P)
ret void		ret void
}		}

		; Cannot remove store in entry block because it is not overwritten on path
		; entry->bb2->bb5.
define void @accessible_after_return_4(i32* noalias %P, i1 %c1) {		define void @accessible_after_return_4(i32* noalias %P, i1 %c1) {
; CHECK-LABEL: @accessible_after_return_4(		; CHECK-LABEL: @accessible_after_return_4(
; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4		; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: store i32 0, i32* [[P]], align 4		; CHECK-NEXT: store i32 0, i32* [[P]], align 4
; CHECK-NEXT: call void @use(i32* [[P]])		; CHECK-NEXT: call void @use(i32* [[P]])
; CHECK-NEXT: br label [[BB5:%.*]]		; CHECK-NEXT: br label [[BB5:%.*]]
▲ Show 20 Lines • Show All 58 Lines • ▼ Show 20 Lines
bb5:		bb5:
ret void		ret void
}		}

; Can remove store in entry block, because it is overwritten before each return.		; Can remove store in entry block, because it is overwritten before each return.
define void @accessible_after_return6(i32* %P, i1 %c.1, i1 %c.2) {		define void @accessible_after_return6(i32* %P, i1 %c.1, i1 %c.2) {
; CHECK-LABEL: @accessible_after_return6(		; CHECK-LABEL: @accessible_after_return6(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]		; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]
; CHECK: bb2:		; CHECK: bb2:
; CHECK-NEXT: store i32 1, i32* [[P]], align 4		; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
; CHECK: bb3:		; CHECK: bb3:
; CHECK-NEXT: store i32 2, i32* [[P]], align 4		; CHECK-NEXT: store i32 2, i32* [[P]], align 4
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
; CHECK: bb4:		; CHECK: bb4:
; CHECK-NEXT: store i32 3, i32* [[P]], align 4		; CHECK-NEXT: store i32 3, i32* [[P]], align 4
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
Show All 19 Lines

; Can remove store in bb1, because it is overwritten along each path		; Can remove store in bb1, because it is overwritten along each path
; from bb1 to the exit.		; from bb1 to the exit.
define void @accessible_after_return7(i32* %P, i1 %c.1, i1 %c.2) {		define void @accessible_after_return7(i32* %P, i1 %c.1, i1 %c.2) {
; CHECK-LABEL: @accessible_after_return7(		; CHECK-LABEL: @accessible_after_return7(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]		; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]
; CHECK: bb3:		; CHECK: bb3:
; CHECK-NEXT: store i32 2, i32* [[P]], align 4		; CHECK-NEXT: store i32 2, i32* [[P:%.*]], align 4
; CHECK-NEXT: br label [[BB5:%.*]]		; CHECK-NEXT: br label [[BB5:%.*]]
; CHECK: bb4:		; CHECK: bb4:
; CHECK-NEXT: store i32 1, i32* [[P]], align 4		; CHECK-NEXT: store i32 1, i32* [[P]], align 4
; CHECK-NEXT: br label [[BB5]]		; CHECK-NEXT: br label [[BB5]]
; CHECK: bb2:		; CHECK: bb2:
; CHECK-NEXT: br label [[BB5]]		; CHECK-NEXT: br label [[BB5]]
; CHECK: bb5:		; CHECK: bb5:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
Show All 17 Lines	bb2:
br label %bb5		br label %bb5

bb5:		bb5:
ret void		ret void
}		}


; Cannot remove store in entry block, because it is not overwritten along each path to		; Cannot remove store in entry block, because it is not overwritten along each path to
; the exit (entry->bb1->bb5->bb5).		; the exit (entry->bb1->bb4->bb5).
define void @accessible_after_return8(i32* %P, i1 %c.1, i1 %c.2) {		define void @accessible_after_return8(i32* %P, i1 %c.1, i1 %c.2) {
; CHECK-LABEL: @accessible_after_return8(		; CHECK-LABEL: @accessible_after_return8(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4		; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]		; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]
; CHECK: bb2:		; CHECK: bb2:
▲ Show 20 Lines • Show All 79 Lines • ▼ Show 20 Lines
for.inc:		for.inc:
%c.3 = call i1 @cond()		%c.3 = call i1 @cond()
br i1 %c.3, label %for.body, label %for.end		br i1 %c.3, label %for.body, label %for.end

for.end:		for.end:
ret void		ret void
}		}

		; Cannot remove store in entry block because it is not overwritten on path
		; entry->bb2->bb4. Also make sure we deal with dead exit blocks without
		; crashing.
define void @accessible_after_return10_dead_block(i32* %P, i1 %c.1, i1 %c.2) {		define void @accessible_after_return10_dead_block(i32* %P, i1 %c.1, i1 %c.2) {
; CHECK-LABEL: @accessible_after_return10_dead_block(		; CHECK-LABEL: @accessible_after_return10_dead_block(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4		; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[C_1:%.*]], label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]		; CHECK-NEXT: br i1 [[C_2:%.*]], label [[BB3:%.*]], label [[BB4:%.*]]
; CHECK: bb2:		; CHECK: bb2:
▲ Show 20 Lines • Show All 216 Lines • Show Last 20 Lines

llvm/test/Transforms/DeadStoreElimination/MSSA/multiblock-simple.ll

Show First 20 Lines • Show All 193 Lines • ▼ Show 20 Lines	bb2:
ret void		ret void
bb3:		bb3:
ret void		ret void
}		}


define void @test12(i32* %P) {		define void @test12(i32* %P) {
; CHECK-LABEL: @test12(		; CHECK-LABEL: @test12(
; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 true, label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 true, label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: store i32 1, i32* [[P]], align 4		; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4
; CHECK-NEXT: br label [[BB3:%.*]]		; CHECK-NEXT: br label [[BB3:%.*]]
; CHECK: bb2:		; CHECK: bb2:
; CHECK-NEXT: store i32 1, i32* [[P]], align 4		; CHECK-NEXT: store i32 1, i32* [[P]], align 4
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
; CHECK: bb3:		; CHECK: bb3:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
store i32 0, i32* %P		store i32 0, i32* %P
br i1 true, label %bb1, label %bb2		br i1 true, label %bb1, label %bb2
bb1:		bb1:
store i32 1, i32* %P		store i32 1, i32* %P
br label %bb3		br label %bb3
bb2:		bb2:
store i32 1, i32* %P		store i32 1, i32* %P
ret void		ret void
bb3:		bb3:
ret void		ret void
}		}


define void @test13(i32* %P) {		define void @test13(i32* %P) {
; CHECK-LABEL: @test13(		; CHECK-LABEL: @test13(
; CHECK-NEXT: store i32 0, i32* [[P:%.*]], align 4
; CHECK-NEXT: br i1 true, label [[BB1:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 true, label [[BB1:%.*]], label [[BB2:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: store i32 1, i32* [[P]], align 4		; CHECK-NEXT: store i32 1, i32* [[P:%.*]], align 4
; CHECK-NEXT: br label [[BB3:%.*]]		; CHECK-NEXT: br label [[BB3:%.*]]
; CHECK: bb2:		; CHECK: bb2:
; CHECK-NEXT: store i32 1, i32* [[P]], align 4		; CHECK-NEXT: store i32 1, i32* [[P]], align 4
; CHECK-NEXT: br label [[BB3]]		; CHECK-NEXT: br label [[BB3]]
; CHECK: bb3:		; CHECK: bb3:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
store i32 0, i32* %P		store i32 0, i32* %P
Show All 10 Lines