This is an archive of the discontinued LLVM Phabricator instance.

[DSE] Avoid iterator invalidation bugs.
ClosedPublic

Authored by mcrosier on Jun 22 2016, 10:41 AM.

Details

Summary

This patch is the (rewritten) mechanical part of D21007 to avoid problems with iterator invalidation.

The test committed with http://reviews.llvm.org/rL273141 is removed because this patch no longer performs any type of back tracking, which is what was causing the codegen differences with and without debug information.

No correctness issues across SPEC2000, SPEC2006, and llvm-test-suite. In fact, there were no codegen changes across this suite of benchmarks. However, codegen changes are possible because we no longer backtrack when the current instruction is deleted.

Chad

FWIW, I'm in the process of trying to port DSE to use MemorySSA and I thought pushing this change sooner rather than later would simplify my life. Sorry to hijack your change, Eli.

Diff Detail

Repository
rL LLVM

Event Timeline

mcrosier updated this revision to Diff 61568.Jun 22 2016, 10:41 AM
mcrosier retitled this revision from to [DSE] Avoid iterator invalidation bugs by deferring deletion..
mcrosier updated this object.
mcrosier added a subscriber: llvm-commits.
mcrosier updated this revision to Diff 61572.Jun 22 2016, 10:46 AM
mcrosier added a reviewer: gberry.

-Fix a few comments that wrapped due to clang-format.

eli.friedman edited edge metadata.Jun 22 2016, 11:21 AM

It's no problem that you're stealing some of my changes. :)

Instead of adding an outer loop, you could solve the rL273141 testcase by only delaying the erasure of PHI nodes rather than the erasure of all operands. I think that wouldn't introduce any invalidation problems because we only visit reachable blocks. That said, the outer loop actually catches more cases: if you switch up the stores a little, you can come up with a testcase which current DSE can't handle, but this version can. Actually, you could even end up with cases where erasing a store in one block allows erasing a store in a different block... but iterating the whole DSE pass might be too expensive to be worthwhile.

Looks fine otherwise.

lib/Transforms/Scalar/DeadStoreElimination.cpp
557 ↗(On Diff #61572)

handleFree might need some changes to avoid iterator invalidation issues: I think you can end up with a problem with a block that branches to itself? You can't do that without tripping over https://llvm.org/bugs/show_bug.cgi?id=28014, I guess, so not really important, but a FIXME would be nice. (I think I was planning to do something in my version; I forget why I didn't.)

Henric edited edge metadata.Jun 23 2016, 5:23 AM

I've addressed the regressed test by added an outer loop around the kernel code in eliminateDeadStores that reevaluates the block until there is no change (as was suggested in D21076)

If the extra compilation time does not turn out to be a problem, then I think it's a good improvement. The code that is now encapsulated in eliminateNoopStore could already do that (reevaluate the block) so you could probably already create a rather bad test case if you really wanted.

I remember someone suggested in mail to the review list that it might be possible to remove stores if you process them in the right order. But I never understood how, at least not in the general case where you don't know the address origin of a load. For example if the use comes from an input param to the function, two examples:

load -> r1
store *r1 -> g1
load -> r2
store *r2 -> g2
...
store -> g1
store -> g2

compared to

load -> r2
store *r2 -> g2
load -> r1
store *r1 -> g1
...
store -> g1
store -> g2

It's no problem that you're stealing some of my changes. :)

Thanks! :D

Instead of adding an outer loop, you could solve the rL273141 testcase by only delaying the erasure of PHI nodes rather than the erasure of all operands.

I'm not sure I follow. Test case for reference:

define i16 @S0() !dbg !17 {
bb1:
  store i32 4, i32* @g_31, align 1, !dbg !20
  %_tmp16.i.i = load volatile i16*, i16** @g_30, align 1, !dbg !28
  %_tmp17.i.i = load i16, i16* %_tmp16.i.i, align 1, !dbg !28
  store i16 %_tmp17.i.i, i16* @g_118, align 1, !dbg !20
  store i32 0, i32* @g_31, align 1, !dbg !31
  tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !40, metadata !41), !dbg !42
  store i16 0, i16* @g_118, align 1, !dbg !43
  br label %bb2.i, !dbg !44

bb2.i:
  br label %bb2.i, !dbg !44
}

IIUC, the first store (store i32 4, i32* @g_31) is clobbered by the 3rd store (store i32 0, i32* @g_31), but DSE is unable to realize this because of an aliasing load (%_tmp17.i.i = load i16, i16* %_tmp16.i.i). The second store (and the loads that alias with the third store) is/are removed when DSE realizes the fourth store (store i16 0, i16* @g_118) clobbers the second. We fail to catch the rL273141 test case because this change doesn't do any type of back tracking like what is currently done.

Did I miss something, @eli.friedman/@Henric? I'm not sure how deleting non-phi nodes resolves this issue, Eli.

FWIW, I'd prefer to *not* have the outer loop, but I'm not sure I can resolve rL273141's test case in another way. My long term goal is to have DSE use MemorySSA and eventually revive rL255247, which was reverted due to compile-time issues; I don't think having the outer loop is inline with my current goals.

One might also argue that regressing rL273141's test case is acceptable, if we're fixing/avoiding iterator invalidation issues. Obviously, I'd like the best of both worlds, but I may need some direction on how to achieve that goal.

Oh, I see what you mean about the rL273141 testcase; I thought the stores in the testcase were ordered differently. You run into the issue I was describing if you reorder the last two stores, like this:

store i32 4, i32* @g_31, align 1, !dbg !20
%_tmp16.i.i = load volatile i16*, i16** @g_30, align 1, !dbg !28
%_tmp17.i.i = load i16, i16* %_tmp16.i.i, align 1, !dbg !28
store i16 %_tmp17.i.i, i16* @g_118, align 1, !dbg !20
store i16 0, i16* @g_118, align 1, !dbg !43
store i32 0, i32* @g_31, align 1, !dbg !31

The primary point of rL273141 is that DSE shouldn't change behavior based on the presence of debug info... so as long as you're not breaking that invariant, it should be fine to modify/delete the testcase as necessary.

Oh, I see what you mean about the rL273141 testcase; I thought the stores in the testcase were ordered differently. You run into the issue I was describing if you reorder the last two stores, like this:

store i32 4, i32* @g_31, align 1, !dbg !20
%_tmp16.i.i = load volatile i16*, i16** @g_30, align 1, !dbg !28
%_tmp17.i.i = load i16, i16* %_tmp16.i.i, align 1, !dbg !28
store i16 %_tmp17.i.i, i16* @g_118, align 1, !dbg !20
store i16 0, i16* @g_118, align 1, !dbg !43
store i32 0, i32* @g_31, align 1, !dbg !31

Yes, I believe we'll miss this case if we defer the deletion of aliasing loads/stores. Let me see if I can take your suggested approach of deleting non-phi nodes, @eli.friedman. Also, can you please explain why the phi nodes must stick around (at least in the short term)? I don't follow..

Alternatively, I'm wondering if we could use @dberlin's suggestion from D21076 and utilize the iterator that is returned by BB->eraseFromParent() to address the iterator problem.

The primary point of rL273141 is that DSE shouldn't change behavior based on the presence of debug info... so as long as you're not breaking that invariant, it should be fine to modify/delete the testcase as necessary.

AFAICT, this implementation won't suffer from this problem, so I'm thinking it would be okay to drop the test.

" can you please explain why the phi nodes must stick around (at least in the short term)? I don't follow.."

Err, the PHI nodes don't need to stick around; you just need to make sure you don't delete the instruction the iterator is pointing at, and the operand of a phi node could be that instruction. There are other ways around the issue, I guess... you could even pass a pointer to the iterator to deleteDeadInstruction and make it update the iterator, I guess, instead of postponing deleting it.

" can you please explain why the phi nodes must stick around (at least in the short term)? I don't follow.."

Err, the PHI nodes don't need to stick around; you just need to make sure you don't delete the instruction the iterator is pointing at, and the operand of a phi node could be that instruction. There are other ways around the issue, I guess... you could even pass a pointer to the iterator to deleteDeadInstruction and make it update the iterator, I guess, instead of postponing deleting it.

Okay, my understanding of the problem appears to be the same as yours; I must have just misinterpreted one of your earlier comments.

mcrosier updated this revision to Diff 61983.Jun 27 2016, 11:08 AM
mcrosier retitled this revision from [DSE] Avoid iterator invalidation bugs by deferring deletion. to [DSE] Avoid iterator invalidation bugs..
mcrosier updated this object.
mcrosier edited edge metadata.
eli.friedman added inline comments.Jun 27 2016, 11:24 AM
lib/Transforms/Scalar/DeadStoreElimination.cpp
69 ↗(On Diff #61983)

Maybe it would make sense to pass in a BasicBlock::iterator* and make this function update it, rather than returning a BasicBlock::iterator?

110 ↗(On Diff #61983)

Wrong indentation.

822 ↗(On Diff #61983)

handleFree can invalidate BBI, I think?

mcrosier updated this revision to Diff 62005.Jun 27 2016, 12:46 PM
mcrosier marked 2 inline comments as done.

-Address Eli's comments.

lib/Transforms/Scalar/DeadStoreElimination.cpp
69 ↗(On Diff #61983)

Sure.

822 ↗(On Diff #61983)

How so? IIUC, deleting the 'free' instruction would invalidate the iterator, but I'm not convinced that can happen. Some care is needed to make sure we pass a valid iterator to MD-getPointerDependencyFrom(), but I believe I've handled that correctly.

Sorry about the delay; I thought I had sent review comments, but apparently they were saved/unsent.

lib/Transforms/Scalar/DeadStoreElimination.cpp
822 ↗(On Diff #62005)

BBI doesn't point at the free instruction here.

912 ↗(On Diff #62005)

I don't follow how Inst could ever be killed by deleteDeadInstruction... an instruction which writes to memory can't be trivially dead.

mcrosier updated this revision to Diff 62866.Jul 6 2016, 7:51 AM

Address Eli's feedback.

mcrosier marked 4 inline comments as done.Jul 6 2016, 7:53 AM
mcrosier added inline comments.
lib/Transforms/Scalar/DeadStoreElimination.cpp
922 ↗(On Diff #62866)

Fixed.

822 ↗(On Diff #62005)

Ah, right. Should be fixed now.

eli.friedman accepted this revision.Jul 6 2016, 11:36 AM
eli.friedman edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Jul 6 2016, 11:36 AM
This revision was automatically updated to reflect the committed changes.
mcrosier marked 2 inline comments as done.
llvm/trunk/lib/Transforms/Scalar/DeadStoreElimination.cpp